Hyperparameter optimization in black-box image processing using differentiable proxies

Nearly every commodity imaging system we directly interact with, or indirectly rely on, leverages power-efficient, application-adjustable black-box image signal processing (ISP) units, running either as dedicated hardware blocks or as proprietary software modules on programmable hardware. The configuration parameters of these black-box ISPs often have complex interactions with the output image, and must be adjusted prior to deployment according to application-specific quality and performance metrics. Today, this search is commonly performed manually by “golden eye” experts or algorithm developers leveraging domain expertise. We present a fully automatic system to optimize the parameters of black-box hardware and software image processing pipelines according to an arbitrary (i.e., application-specific) metric. We leverage a differentiable mapping between the configuration space and evaluation metrics, parameterized by a convolutional neural network that we train in an end-to-end fashion with imaging hardware in the loop. Unlike prior art, our differentiable proxies allow for high-dimensional parameter search with stochastic first-order optimizers, without explicitly modeling any lower-level image processing transformations. As such, we can efficiently optimize black-box image processing pipelines for a variety of imaging applications, reducing application-specific configuration times from months to hours. Our optimization method is fully automatic, even with black-box hardware in the loop. We validate our method on experimental data for real-time display applications, object detection, and extreme low-light imaging. The proposed approach outperforms manual search qualitatively and quantitatively for all domain-specific applications tested. When applied to traditional denoisers, we demonstrate that, just by changing hyperparameters, traditional algorithms can outperform recent deep learning methods by a substantial margin on recent benchmarks.
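The proxy idea in the abstract can be illustrated with a toy sketch. It is not the authors' implementation: the paper's proxy is a convolutional neural network trained with imaging hardware in the loop, whereas this example swaps in a quadratic surrogate fitted by closed-form least squares, followed by plain (non-stochastic) gradient descent. The function `black_box_metric`, its two parameters, and every other name here are hypothetical stand-ins chosen for illustration.

```python
import itertools


def black_box_metric(theta):
    """Hypothetical stand-in for 'run the ISP, then score its output'.

    The score is quantized, so it is piecewise constant in theta and
    gives the caller no usable gradient.
    """
    t0, t1 = theta
    loss = (t0 - 0.3) ** 2 + (t1 - 0.7) ** 2
    return round(loss * 1000) / 1000


def features(theta):
    """Quadratic feature map used by the toy proxy."""
    t0, t1 = theta
    return [1.0, t0, t1, t0 * t0, t1 * t1, t0 * t1]


def solve(a, b):
    """Solve a @ x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_proxy(samples):
    """Fit proxy weights to black-box evaluations (normal equations)."""
    rows = [features(t) for t in samples]
    targets = [black_box_metric(t) for t in samples]
    k = len(rows[0])
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    atb = [sum(r[i] * y for r, y in zip(rows, targets)) for i in range(k)]
    return solve(ata, atb)


def proxy_grad(w, theta):
    """Analytic gradient of the quadratic proxy with respect to theta."""
    t0, t1 = theta
    return [w[1] + 2 * w[3] * t0 + w[5] * t1,
            w[2] + 2 * w[4] * t1 + w[5] * t0]


def tune(w, theta, lr=0.1, steps=200):
    """Gradient descent on the proxy, keeping parameters inside [0, 1]."""
    for _ in range(steps):
        g = proxy_grad(w, theta)
        theta = [min(1.0, max(0.0, t - lr * gi)) for t, gi in zip(theta, g)]
    return theta


# Query the black box on a coarse grid, fit the proxy, then tune through it.
grid = [i / 5 for i in range(6)]
samples = list(itertools.product(grid, grid))
weights = fit_proxy(samples)
best = tune(weights, [0.9, 0.1])
print(best, black_box_metric(best))
```

The black box is queried only to collect training data and to score the final setting; the search itself runs entirely on the surrogate's gradient, which is what makes first-order optimization possible without modeling the pipeline's internals.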

SIGGRAPH 2019: Technical Papers

“Hyperparameter optimization in black-box image processing using differentiable proxies” by Tseng, Yu, Yang, Mannan, Arnaud, et al. …

Session/Category Title: Image Science
