“Authentic volumetric avatars from a phone scan” by Cao, Simon, Kim, Schwartz, Zollhoefer, et al.

  • Chen Cao, Tomas Simon, Jin Kyu Kim, Gabriel Schwartz, Michael Zollhoefer, Shunsuke Saito, Stephen Lombardi, Shih-En Wei, Danielle Belko, Shoou-I Yu, Yaser Sheikh, and Jason Saragih


    Creating photorealistic avatars of existing people currently requires extensive person-specific data capture, which is usually only accessible to the VFX industry and not the general public. Our work aims to address this drawback by relying only on a short mobile phone capture to obtain a drivable 3D head avatar that matches a person’s likeness faithfully. In contrast to existing approaches, our architecture avoids the complex task of directly modeling the entire manifold of human appearance, aiming instead to generate an avatar model that can be specialized to novel identities using only small amounts of data. The model dispenses with low-dimensional latent spaces that are commonly employed for hallucinating novel identities, and instead, uses a conditional representation that can extract person-specific information at multiple scales from a high resolution registered neutral phone scan. We achieve high quality results through the use of a novel universal avatar prior that has been trained on high resolution multi-view video captures of facial performances of hundreds of human subjects. By fine-tuning the model using inverse rendering we achieve increased realism and personalize its range of motion. The output of our approach is not only a high-fidelity 3D head avatar that matches the person’s facial shape and appearance, but one that can also be driven using a jointly discovered shared global expression space with disentangled controls for gaze direction. Via a series of experiments we demonstrate that our avatars are faithful representations of the subject’s likeness. Compared to other state-of-the-art methods for lightweight avatar creation, our approach exhibits superior visual quality and animatability.
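The abstract's core idea — conditioning a shared decoder on multi-scale features extracted from a single registered neutral scan, rather than on a low-dimensional identity latent — can be illustrated with a minimal NumPy sketch. This is purely illustrative and not the paper's architecture: the names (`pyramid`, `decode`), the average-pooling feature extractor, and the additive bias-injection at each scale are all assumptions standing in for the learned networks described in the paper.

```python
import numpy as np

def pyramid(neutral_tex, levels=3):
    """Stand-in for a learned encoder: build multi-scale identity features
    from a registered neutral-scan texture of shape (H, W, C) by repeated
    2x2 average pooling. Returns features ordered fine-to-coarse."""
    feats = [neutral_tex]
    for _ in range(levels - 1):
        t = feats[-1]
        h, w = t.shape[0] // 2, t.shape[1] // 2
        feats.append(t[:2 * h, :2 * w].reshape(h, 2, w, 2, -1).mean(axis=(1, 3)))
    return feats

def decode(expr_code, id_feats):
    """Stand-in for a shared (universal) decoder: broadcast a global
    expression code over the coarsest grid, then upsample, injecting the
    person-specific features at each matching scale along the way."""
    coarsest = id_feats[-1]
    x = np.tile(expr_code, (coarsest.shape[0], coarsest.shape[1], 1))
    for f in reversed(id_feats):  # coarse to fine
        if x.shape[:2] != f.shape[:2]:
            # nearest-neighbor 2x upsample of the running feature map
            x = x.repeat(2, axis=0).repeat(2, axis=1)
        x = x + f  # inject identity information at this scale
    return x

# A neutral "scan" and a shared expression code drive one decoder;
# swapping the scan (not the decoder) swaps the identity.
out = decode(np.zeros(3), pyramid(np.random.rand(8, 8, 3)))
```

The point of the sketch is the data flow: the decoder weights are identity-agnostic, and everything person-specific enters as multi-scale conditioning, which is what lets such a model specialize to a new person from a short phone capture.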


