“Real-time user-guided image colorization with learned deep priors”

  • Richard Zhang, Jun-Yan Zhu, Phillip Isola, Xinyang Geng, Angela S Lin, Yu Tianhe, and Alexei A. Efros



Session Title:

    Deep Image Processing




    We propose a deep learning approach for user-guided image colorization. The system uses a Convolutional Neural Network (CNN) to map a grayscale image, along with sparse, local user “hints”, directly to an output colorization. Rather than using hand-defined rules, the network propagates user edits by fusing low-level cues with high-level semantic information, learned from large-scale data. We train on a million images, with simulated user inputs. To guide the user towards efficient input selection, the system recommends likely colors based on the input image and current user inputs. The colorization is performed in a single feed-forward pass, enabling real-time use. Even though the network is trained with randomly simulated user inputs, we show that the proposed system helps novice users quickly create realistic colorizations, and offers large improvements in colorization quality with just a minute of use. In addition, we demonstrate that the framework can incorporate other user “hints” about the desired colorization, showing an application to color histogram transfer.
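The simulated user inputs described in the abstract can be pictured with a small sketch: reveal the true color at a few random pixels of a training image, then hand the network the grayscale image together with those sparse hints and a mask marking where they are. The abstract does not give the sampling details, so the following is only an illustration under stated assumptions, not the authors' implementation. The uniform sampling, the (a, b) chroma representation, the four-channel layout, and the names `simulate_hints` and `build_input` are all hypothetical.

```python
import random


def simulate_hints(ab, num_hints, rng):
    """Reveal the true (a, b) chroma at a few random pixels.

    ab: H x W nested list of (a, b) tuples (ground-truth color).
    Returns (hints, mask). hints holds (a, b) at revealed pixels and
    (0.0, 0.0) elsewhere; mask holds 1 at revealed pixels, else 0.
    """
    height, width = len(ab), len(ab[0])
    hints = [[(0.0, 0.0)] * width for _ in range(height)]
    mask = [[0] * width for _ in range(height)]
    # Assumption: hint locations are distinct and drawn uniformly.
    for idx in rng.sample(range(height * width), num_hints):
        y, x = divmod(idx, width)
        hints[y][x] = ab[y][x]
        mask[y][x] = 1
    return hints, mask


def build_input(gray, hints, mask):
    """Stack lightness, hint chroma, and mask into 4 channels per pixel."""
    return [
        [(g, h[0], h[1], m) for g, h, m in zip(g_row, h_row, m_row)]
        for g_row, h_row, m_row in zip(gray, hints, mask)
    ]


# Toy 2 x 3 image: lightness values plus ground-truth chroma.
gray = [[0.2, 0.5, 0.9], [0.1, 0.4, 0.7]]
ab = [[(10.0, -5.0), (12.0, -4.0), (0.0, 0.0)],
      [(-20.0, 30.0), (-18.0, 28.0), (3.0, 3.0)]]

hints, mask = simulate_hints(ab, num_hints=2, rng=random.Random(0))
net_input = build_input(gray, hints, mask)
```

The mask channel is what lets a model tell "no hint here" apart from a hint whose chroma happens to be zero. In a training loop, `net_input` would be the model input and the full `ab` map the target.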


