3D Gaussian Blendshapes for Head Avatar Animation

Shengjie Ma; Yanlin Weng; Tianjia Shao; Kun Zhou

Conference:

SIGGRAPH 2024

Type(s):

Technical Papers

Title:

3D Gaussian Blendshapes for Head Avatar Animation

Presenter(s)/Author(s):

Shengjie Ma

Yanlin Weng

Tianjia Shao

Kun Zhou

Abstract:

We introduce the 3D Gaussian blendshape representation for modeling photorealistic head avatars. The avatar model of an arbitrary expression can be effectively generated through linear blending of Gaussian blendshapes with the expression coefficients. Compared to state-of-the-art methods, our method better captures high-frequency details and achieves superior animation performance (370 fps).
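
The abstract does not spell out the blending formula. As a minimal sketch, assuming the common delta-blendshape convention (a neutral model plus coefficient-weighted offsets), linear blending of per-Gaussian attributes might look like the following. The function name `blend_gaussians`, the flat per-Gaussian attribute layout, and the toy numbers are illustrative assumptions, not the authors' implementation.

```python
from typing import List, Sequence

# One row of attribute values per Gaussian (layout is an assumption for this sketch).
Attributes = List[List[float]]


def blend_gaussians(
    neutral: Attributes,
    deltas: Sequence[Attributes],
    coeffs: Sequence[float],
) -> Attributes:
    """Return neutral + sum_k coeffs[k] * deltas[k], computed element-wise."""
    if len(deltas) != len(coeffs):
        raise ValueError("need exactly one coefficient per blendshape")
    # Copy the rows so the neutral model itself is left untouched.
    blended = [row[:] for row in neutral]
    for weight, delta in zip(coeffs, deltas):
        for out_row, delta_row in zip(blended, delta):
            for j, d in enumerate(delta_row):
                out_row[j] += weight * d
    return blended


# Toy data: 2 Gaussians, 3 attributes each (for example an xyz position),
# and 2 hypothetical blendshape offsets.
neutral = [[0.0, 0.0, 0.0], [1.0, 1.0, 1.0]]
deltas = [
    [[1.0, 0.0, 0.0], [0.0, 0.0, 0.0]],  # offset 0 moves Gaussian 0 along x
    [[0.0, 0.0, 0.0], [0.0, 2.0, 0.0]],  # offset 1 moves Gaussian 1 along y
]
result = blend_gaussians(neutral, deltas, [0.5, 0.25])
print(result)  # [[0.5, 0.0, 0.0], [1.0, 1.5, 1.0]]
```

Because the blend is a plain weighted sum, the per-frame cost grows linearly with the number of Gaussians and blendshapes, which is consistent with the high frame rate the abstract reports. Attributes such as rotations would likely need extra care (for example renormalization) that this sketch leaves out.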
