Semantic photo manipulation with a generative image prior

Despite the recent success of GANs in synthesizing images conditioned on inputs such as a user sketch, text, or semantic labels, manipulating the high-level attributes of an existing natural photograph with GANs is challenging for two reasons. First, it is hard for GANs to precisely reproduce an input image. Second, after manipulation, the newly synthesized pixels often do not fit the original image. In this paper, we address these issues by adapting the image prior learned by GANs to the statistics of an individual image. Our method can accurately reconstruct the input image and synthesize new content that is consistent with the appearance of the input image. We demonstrate our interactive system on several semantic image editing tasks, including synthesizing new objects consistent with the background, removing unwanted objects, and changing the appearance of an object. Quantitative and qualitative comparisons against several existing methods demonstrate the effectiveness of our method.

SIGGRAPH 2019: Technical Papers

“Semantic photo manipulation with a generative image prior” by Bau, Strobelt, Peebles, Wulff, Zhou, et al. …

Session/Category Title: Photo Science
