“Rewriting geometric rules of a GAN” by Wang, Bau and Zhu

  • © Sheng-Yu Wang, David Bau, and Jun-Yan Zhu

Conference:


Title:


    Rewriting geometric rules of a GAN

Program Title:


    Labs Demo

Presenter(s):


    Sheng-Yu Wang, David Bau, and Jun-Yan Zhu

Description:


    Deep generative models make visual content creation more accessible to novice users by automating the synthesis of diverse, realistic content based on a collected dataset. However, current machine learning approaches miss a key element of the creative process: the ability to synthesize things that go far beyond the data distribution and everyday experience. To begin to address this issue, we enable a user to “warp” a given model by editing just a handful of original model outputs with desired geometric changes. Our method applies a low-rank update to a single model layer to reconstruct the edited examples. Furthermore, to combat overfitting, we propose a latent space augmentation method based on style-mixing. Our method allows a user to create a model that synthesizes endless objects with defined geometric changes, enabling the creation of a new generative model without the burden of curating a large-scale dataset. We also demonstrate that edited models can be composed to achieve aggregated effects, and we present an interactive interface that lets users create new models through composition. Empirical measurements on multiple test cases suggest that our method outperforms recent GAN fine-tuning methods. Finally, we showcase several applications of the edited models, including latent space interpolation and image editing.
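
    As a rough illustration only (a minimal sketch, not the authors' released implementation), the PyTorch snippet below shows the two ingredients named above: a low-rank additive correction to the weight of a single generator layer, and a style-mixing augmentation of the per-layer latent codes. The generator interface (G, sample_w, synthesize), the choice of layer, the user_warp function, and the L1 reconstruction loss are placeholder assumptions for a StyleGAN-like model.

    import torch
    import torch.nn.functional as F

    def style_mix(w_edit, w_rand, mix_prob=0.5):
        # Style-mixing augmentation: for each generator layer, randomly swap in
        # the style code of another latent, so the learned edit is not
        # memorized for the handful of edited latents only.
        # w_edit, w_rand: tensors of shape (num_layers, style_dim)
        mask = (torch.rand(w_edit.shape[0], 1) < mix_prob).to(w_edit.dtype)
        return mask * w_rand + (1.0 - mask) * w_edit

    class LowRankUpdate(torch.nn.Module):
        # Wraps one pretrained convolution and learns a rank-r correction
        # delta_W = U @ V that is added to the frozen original weight.
        def __init__(self, conv, rank=4):
            super().__init__()
            self.conv = conv
            out_c, in_c, kh, kw = conv.weight.shape
            self.U = torch.nn.Parameter(torch.zeros(out_c, rank))
            self.V = torch.nn.Parameter(1e-3 * torch.randn(rank, in_c * kh * kw))
            conv.weight.requires_grad_(False)  # only the low-rank factors are trained
            # U starts at zero, so the wrapped layer initially reproduces the
            # original model's output exactly.

        def forward(self, x):
            delta = (self.U @ self.V).view_as(self.conv.weight)
            return F.conv2d(x, self.conv.weight + delta, self.conv.bias,
                            stride=self.conv.stride, padding=self.conv.padding)

    # Hypothetical training loop (all names below are placeholders):
    #   G_frozen = copy.deepcopy(G)                  # original model, kept fixed
    #   G.layer8.conv = LowRankUpdate(G.layer8.conv, rank=4)
    #   opt = torch.optim.Adam([G.layer8.conv.U, G.layer8.conv.V], lr=1e-2)
    #   for step in range(400):
    #       w = style_mix(w_edited, G.sample_w())        # latent augmentation
    #       target = user_warp(G_frozen.synthesize(w))   # re-apply the user's warp
    #       loss = F.l1_loss(G.synthesize(w), target)
    #       opt.zero_grad(); loss.backward(); opt.step()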

