“Deep face normalization” by Nagano, Luo, Wang, Seo, Xing, et al. … – ACM SIGGRAPH HISTORY ARCHIVES

  • 2019 SA Technical Papers_Nagano_Deep face normalization

Title:

    Deep face normalization

Session/Category Title:   Synthesis in the Arvo


Abstract:


    From angling smiles to duck faces, all kinds of facial expressions can be seen in selfies, portraits, and Internet pictures. These photos are taken with various camera types, and under a vast range of angles and lighting conditions. We present a deep learning framework that can fully normalize unconstrained face images, i.e., remove perspective distortions, relight to an evenly lit environment, and predict a frontal and neutral face. Our method can produce a high resolution image while preserving important facial details and the likeness of the subject, along with the original background. We divide this ill-posed problem into three consecutive normalization steps, each using a different generative adversarial network that acts as an image generator. Perspective distortion removal is performed using a dense flow field predictor. A uniformly illuminated face is obtained using a lighting translation network, and the facial expression is neutralized using a generalized facial expression synthesis framework combined with a regression network based on deep features for facial recognition. We introduce new data representations for conditional inference, as well as training methods for supervised learning to ensure that different expressions of the same person yield a neutral face that is not only plausible but also similar. We demonstrate our results on a wide range of challenging images collected in the wild. Key applications of our method range from robust image-based 3D avatar creation and portrait manipulation to facial enhancement and reconstruction tasks for crime investigation. We also found, through an extensive user study, that our normalization results can hardly be distinguished from ground-truth ones when the viewer is not familiar with the person.
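
The abstract states that perspective distortion is removed with a dense flow field predictor. As a rough illustration of what applying such a field could look like, the sketch below warps a tiny grayscale image with a hand-written per-pixel flow. It is not the paper's network or code: the function names `bilinear_sample` and `warp_with_flow`, the nested-list image format, the backward-warping convention, and the border clamping are all assumptions made for this example, and the flow values are made up rather than predicted.

```python
import math


def bilinear_sample(image, x, y):
    """Sample a grayscale image (list of rows) at real-valued (x, y).

    Coordinates outside the image are clamped to the border (an assumption
    of this sketch, not something the abstract specifies).
    """
    h, w = len(image), len(image[0])
    x = min(max(x, 0.0), w - 1.0)
    y = min(max(y, 0.0), h - 1.0)
    x0, y0 = int(math.floor(x)), int(math.floor(y))
    x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
    fx, fy = x - x0, y - y0
    # Interpolate along x on the two neighbouring rows, then along y.
    top = image[y0][x0] * (1 - fx) + image[y0][x1] * fx
    bottom = image[y1][x0] * (1 - fx) + image[y1][x1] * fx
    return top * (1 - fy) + bottom * fy


def warp_with_flow(image, flow):
    """Backward warp: output pixel (x, y) reads the input at (x + dx, y + dy).

    `flow` holds one hypothetical (dx, dy) offset per output pixel.
    """
    return [
        [bilinear_sample(image, x + dx, y + dy) for x, (dx, dy) in enumerate(row)]
        for y, row in enumerate(flow)
    ]


image = [[0.0, 1.0, 2.0],
         [3.0, 4.0, 5.0],
         [6.0, 7.0, 8.0]]

# A zero flow leaves the image unchanged.
identity = [[(0.0, 0.0)] * 3 for _ in range(3)]
unchanged = warp_with_flow(image, identity)

# A constant half-pixel offset in x blends each pixel with its right neighbour.
half_pixel = [[(0.5, 0.0)] * 3 for _ in range(3)]
shifted = warp_with_flow(image, half_pixel)
```

In a learned setting the `flow` argument would come from a trained predictor instead of being written by hand, and the warp would typically run on color images with a differentiable sampling operator so that it can be trained end to end.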
