“Only a matter of style: age transformation using a style-based regression model” by Alaluf, Patashnik and Cohen-Or

  • © Yuval Alaluf, Or Patashnik, and Daniel Cohen-Or

Title:

    Only a matter of style: age transformation using a style-based regression model

Presenter(s)/Author(s):

    Yuval Alaluf, Or Patashnik, and Daniel Cohen-Or

Abstract:


    The task of age transformation illustrates the change of an individual’s appearance over time. Accurately modeling this complex transformation over an input facial image is extremely challenging as it requires making convincing, possibly large changes to facial features and head shape, while still preserving the input identity. In this work, we present an image-to-image translation method that learns to directly encode real facial images into the latent space of a pre-trained unconditional GAN (e.g., StyleGAN) subject to a given aging shift. We employ a pre-trained age regression network to explicitly guide the encoder in generating the latent codes corresponding to the desired age. In this formulation, our method approaches the continuous aging process as a regression task between the input age and desired target age, providing fine-grained control over the generated image. Moreover, unlike approaches that operate solely in the latent space using a prior on the path controlling age, our method learns a more disentangled, non-linear path. Finally, we demonstrate that the end-to-end nature of our approach, coupled with the rich semantic latent space of StyleGAN, allows for further editing of the generated images. Qualitative and quantitative evaluations show the advantages of our method compared to state-of-the-art approaches. Code is available at our project page: https://yuval-alaluf.github.io/SAM.
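
The abstract frames aging as a regression task guided by a pre-trained age regression network. A minimal sketch of that idea, not the authors' implementation, is an age loss that penalizes the gap between the age predicted for the generated image and the desired target age. Everything named here is an assumption made for illustration: `age_regression_loss`, `toy_age_regressor`, the `max_age` scaling, the squared-error form, and the list-of-floats stand-in for an image.

```python
from typing import Callable, Sequence

# Hypothetical stand-in for an image tensor (a flat list of pixel intensities).
Image = Sequence[float]


def age_regression_loss(
    generated: Image,
    target_age: float,
    age_regressor: Callable[[Image], float],
    max_age: float = 100.0,
) -> float:
    """Squared error between predicted and desired age, both divided by max_age.

    The squared-error form and the max_age scaling are assumptions for this
    sketch; the abstract only states that an age regressor guides the encoder.
    """
    predicted = age_regressor(generated) / max_age
    desired = target_age / max_age
    return (predicted - desired) ** 2


def toy_age_regressor(image: Image) -> float:
    """Hypothetical regressor: maps mean intensity in [0, 1] to an age in years."""
    return 100.0 * sum(image) / len(image)


if __name__ == "__main__":
    # Toy "generated image"; the toy regressor predicts roughly 40 years for it.
    image = [0.2, 0.4, 0.6]
    loss = age_regression_loss(image, target_age=60.0, age_regressor=toy_age_regressor)
    print(f"age loss: {loss:.4f}")
```

In a real training loop the regressor would be a frozen, pre-trained network and the loss would be backpropagated through the generator into the encoder; the toy regressor above only makes the sketch runnable.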


