“Language-based Photo Color Adjustment for Graphic Designs” by Wang, Zhao, Hancke and Lau

  • ©Zhenwei Wang, Nanxuan Zhao, Gerhard Hancke, and Rynson W. H. Lau





Session/Category Title: Colorful Topics in Imaging




    Adjusting the colors of a photo so that they relate to other design elements is an essential way for a graphic design to deliver its message effectively and look aesthetically pleasing. However, existing tools and previous works face a trade-off between ease of use and expressiveness. To this end, we introduce an interactive language-based approach for photo recoloring, which provides an intuitive system that can assist both experts and novices in graphic design. Given a graphic design containing a photo that needs to be recolored, our model predicts the source colors and the target regions, and then recolors the target regions with the source colors according to the given language-based instruction. Instructions can be given at multiple levels of granularity, which accommodates diverse user intentions. This new task poses several unique challenges: 1) color accuracy: recoloring with exactly the color of the target design element specified by the user; 2) multi-granularity instructions: parsing instructions correctly to generate either a specific result or multiple plausible ones; and 3) locality: recoloring semantically meaningful local regions so that the original image semantics are preserved. To address these challenges, we propose a model called LangRecol with two main components: a language-based source color prediction module and a semantic-palette-based photo recoloring module. We also introduce an approach for generating a synthetic graphic design dataset with instructions to enable model training. We evaluate our model via extensive experiments and user studies. We also discuss several practical applications, showing the effectiveness and practicality of our approach. Please find the code and data at https://zhenwwang.github.io/langrecol.
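    The recoloring step described above (apply a source color to a target region while leaving the rest of the photo untouched) can be illustrated with a toy sketch. This is not the LangRecol model: the function name `recolor_region`, the hand-written mask, and the mean-shift rule are all assumptions made for illustration. In the actual system the source color and the target region are predicted from the design and the instruction; here both are simply passed in.

```python
from typing import List, Tuple

Color = Tuple[int, int, int]
Image = List[List[Color]]
Mask = List[List[bool]]


def recolor_region(image: Image, mask: Mask, source_color: Color) -> Image:
    """Shift the masked region so that its mean color moves to source_color.

    Hypothetical stand-in for a recoloring step; not the LangRecol model.
    """
    # Collect the pixels that fall inside the target region.
    region = [px for row, mrow in zip(image, mask)
              for px, m in zip(row, mrow) if m]
    if not region:
        # Nothing to recolor: return an unchanged copy.
        return [list(row) for row in image]

    # Per-channel mean color of the target region.
    n = len(region)
    mean = tuple(sum(px[c] for px in region) / n for c in range(3))

    def shift(px: Color) -> Color:
        # Keep each pixel's offset from the region mean so shading survives,
        # then clamp every channel to the valid 8-bit range.
        return tuple(
            max(0, min(255, round(source_color[c] + px[c] - mean[c])))
            for c in range(3)
        )

    # Locality: only masked pixels change; all others are copied as-is.
    return [
        [shift(px) if m else px for px, m in zip(row, mrow)]
        for row, mrow in zip(image, mask)
    ]


if __name__ == "__main__":
    photo = [[(100, 100, 100), (120, 120, 120)],
             [(10, 20, 30), (110, 110, 110)]]
    target = [[True, True],
              [False, True]]
    print(recolor_region(photo, target, (200, 50, 50)))
```

    In this sketch the region mean lands exactly on the source color only when no channel is clamped, which is one simple way to see why exact color accuracy and preservation of the original shading can pull against each other.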


