“Hyperparameter optimization in black-box image processing using differentiable proxies” by Tseng, Yu, Yang, Mannan, Arnaud, et al. …

  • ©Ethan Tseng, Felix Yu, Yuting Yang, Fahim Mannan, Karl St. Arnaud, Derek Nowrouzezahrai, Jean-Francois Lalonde, and Felix Heide



Session Title:

    Image Science


    Hyperparameter optimization in black-box image processing using differentiable proxies



    Nearly every commodity imaging system we directly interact with, or indirectly rely on, leverages power-efficient, application-adjustable black-box hardware image signal processing units (ISPs), running either in dedicated hardware blocks or as proprietary software modules on programmable hardware. The configuration parameters of these black-box ISPs often have complex interactions with the output image, and must be adjusted prior to deployment according to application-specific quality and performance metrics. Today, this search is commonly performed manually by “golden eye” experts or algorithm developers leveraging domain expertise. We present a fully automatic system to optimize the parameters of black-box hardware and software image processing pipelines according to an arbitrary (i.e., application-specific) metric. We leverage a differentiable mapping between the configuration space and evaluation metrics, parameterized by a convolutional neural network that we train in an end-to-end fashion with imaging hardware in the loop. Unlike prior art, our differentiable proxies allow for high-dimensional parameter search with stochastic first-order optimizers, without explicitly modeling any lower-level image processing transformations. As such, we can efficiently optimize black-box image processing pipelines for a variety of imaging applications, reducing application-specific configuration times from months to hours. Our optimization method is fully automatic, even with black-box hardware in the loop. We validate our method on experimental data for real-time display applications, object detection, and extreme low-light imaging. The proposed approach outperforms manual search qualitatively and quantitatively for all domain-specific applications tested. When applied to traditional denoisers, we demonstrate that, just by changing hyperparameters, traditional algorithms can outperform recent deep learning methods by a substantial margin on recent benchmarks.
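
    The proxy idea in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the abstract describes a convolutional neural network proxy trained with imaging hardware in the loop, whereas the sketch below swaps in a quadratic regression fitted by least squares, and a two-parameter function stands in for the ISP and its metric. The names black_box_isp, fit_proxy, and optimize_on_proxy are hypothetical, as are the target values 0.7 and 0.3. The sketch only shows the general pattern: sample the black box, fit a differentiable surrogate, then run first-order optimization on the surrogate instead of the black box.

```python
import random


def black_box_isp(theta):
    """Hypothetical stand-in for a black-box ISP plus quality metric.

    Returns a loss (lower is better). The rounding step makes the output
    piecewise constant, so it provides no useful gradient to the caller.
    """
    a, b = theta
    raw = (a - 0.7) ** 2 + 2.0 * (b - 0.3) ** 2
    return round(raw * 50) / 50


def features(theta):
    """Quadratic feature map used by the toy proxy (an assumption)."""
    a, b = theta
    return [1.0, a, b, a * a, b * b, a * b]


def proxy_grad_theta(w, theta):
    """Analytic gradient of the proxy w . features(theta) w.r.t. theta."""
    a, b = theta
    return [w[1] + 2.0 * w[3] * a + w[5] * b,
            w[2] + 2.0 * w[4] * b + w[5] * a]


def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def fit_proxy(samples):
    """Fit proxy weights by least squares (normal equations)."""
    n = 6
    A = [[0.0] * n for _ in range(n)]
    rhs = [0.0] * n
    for theta, y in samples:
        f = features(theta)
        for i in range(n):
            rhs[i] += f[i] * y
            for j in range(n):
                A[i][j] += f[i] * f[j]
    return solve_linear(A, rhs)


def optimize_on_proxy(w, theta0, steps=500, lr=0.1):
    """Gradient descent on the proxy, keeping parameters in [0, 1]."""
    theta = list(theta0)
    for _ in range(steps):
        g = proxy_grad_theta(w, theta)
        theta = [min(1.0, max(0.0, t - lr * gi)) for t, gi in zip(theta, g)]
    return theta


# 1) Query the black box at random parameter settings.
rng = random.Random(0)
samples = []
for _ in range(200):
    theta = (rng.random(), rng.random())
    samples.append((theta, black_box_isp(theta)))

# 2) Fit the differentiable proxy to the observed (parameters, loss) pairs.
w = fit_proxy(samples)

# 3) Optimize the parameters through the proxy, then check on the black box.
theta_star = optimize_on_proxy(w, (0.1, 0.9))
print("proxy optimum:", theta_star, "black-box loss:", black_box_isp(theta_star))
```

    In this toy setting the black box is only queried to collect training samples and to check the final result; every gradient step uses the proxy. A neural network proxy follows the same pattern but can capture far more complex parameter interactions than a quadratic model.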


