“Automatic Photo Adjustment Using Deep Neural Networks” by Yan, Zhang, Wang, Paris and Yu

  • ©Zhicheng Yan, Hao Zhang, Baoyuan Wang, Sylvain Paris, and Yizhou Yu









    Photo retouching enables photographers to evoke dramatic visual impressions by artistically enhancing their photos through stylistic color and tone adjustments. However, it is also a time-consuming and challenging task that requires advanced skills beyond the abilities of casual photographers. Using an automated algorithm is an appealing alternative to manual work, but such an algorithm faces many hurdles. Many photographic styles rely on subtle adjustments that depend on the image content and even its semantics. Further, these adjustments are often spatially varying. Existing automatic algorithms are still limited and address only a subset of these challenges. Recently, deep learning has shown unique abilities to address hard problems. This motivated us to explore the use of deep neural networks (DNNs) in the context of photo editing. In this article, we formulate automatic photo adjustment in a manner suitable for this approach. We also introduce an image descriptor that accounts for the local semantics of an image. Our experiments demonstrate that DNNs trained with these descriptors successfully capture sophisticated photographic styles. In particular, and unlike previous techniques, our approach can model local adjustments that depend on image semantics. We show that this yields results that are qualitatively and quantitatively better than those of previous work.


