“Towards automated corrections in video-driven animation transfer” by Serra and Moser
Title:
- Towards automated corrections in video-driven animation transfer
Session/Category Title:
- ML in Production
Abstract:
We describe a novel method that improves Digital Domain’s hybrid video-driven animation transfer technique for facial motion capture. We automate the animation correction pass, otherwise done by artists, which accelerates the production cycle and reduces the subjectivity involved in matching the actor’s facial expressions to the CG character. Our video-driven animation transfer model produces images of the CG character matching the actor’s performance; we use those images as targets in a differentiable-renderer optimization loop, thereby improving the model’s initially predicted geometry. Furthermore, lighting parameters are removed from the optimization by training light-invariant models with a simple augmentation strategy. The corrected animations can be used directly in shots or to fine-tune the base model, as in the earlier approach. Validation tests confirmed the method’s efficacy, and it is now being integrated into Digital Domain’s facial motion capture workflow.
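The optimization loop described in the abstract can be illustrated with a deliberately small sketch. It is not the authors' implementation: the "renderer" here is a hypothetical linear function of two rig weights with a hand-written gradient, standing in for a real differentiable renderer with automatic differentiation (such as the PyTorch3D library listed in the references). All names and values (`render`, `correct_weights`, `BASIS`, `NEUTRAL`, the learning rate) are assumptions made for illustration. The target image plays the role of the image produced by the transfer model, and the starting weights play the role of the model's initially predicted animation.

```python
# Toy analysis-by-synthesis loop: refine rig weights so a rendered image
# matches a target image. Every name and number here is hypothetical.

def render(weights, basis, neutral):
    """Toy 'renderer': each pixel is a linear function of the rig weights.
    A production renderer would rasterize and shade a mesh instead."""
    return [
        n + sum(b * w for b, w in zip(row, weights))
        for n, row in zip(neutral, basis)
    ]


def image_loss(image, target):
    """Sum of squared per-pixel differences."""
    return sum((p - t) ** 2 for p, t in zip(image, target))


def correct_weights(initial, target, basis, neutral, lr=0.5, steps=200):
    """Gradient descent on the image loss, starting from the predicted weights."""
    weights = list(initial)
    for _ in range(steps):
        image = render(weights, basis, neutral)
        residual = [p - t for p, t in zip(image, target)]
        # Analytic gradient of the squared-error loss for the linear renderer.
        grad = [
            2.0 * sum(r * row[k] for r, row in zip(residual, basis))
            for k in range(len(weights))
        ]
        weights = [w - lr * g for w, g in zip(weights, grad)]
    return weights


# Four "pixels", two rig weights (assumed values).
NEUTRAL = [0.2, 0.4, 0.6, 0.8]
BASIS = [[0.5, 0.0], [0.3, 0.1], [0.0, 0.4], [0.1, 0.6]]

# Stand-in for the image of the CG character produced by the transfer model.
target_image = render([0.7, 0.3], BASIS, NEUTRAL)

# Stand-in for the model's initially predicted animation weights.
predicted = [0.5, 0.5]
corrected = correct_weights(predicted, target_image, BASIS, NEUTRAL)

loss_before = image_loss(render(predicted, BASIS, NEUTRAL), target_image)
loss_after = image_loss(render(corrected, BASIS, NEUTRAL), target_image)
```

In this toy setup the image loss falls as the weights move toward the values that reproduce the target image. Lighting is left out entirely, which loosely mirrors the abstract's point that light-invariant models remove lighting parameters from the optimization.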
References:
[1] Stephen W. Bailey, Jérémy Riviere, Morten Mikkelsen, and James F. O’Brien. 2022. Monocular Facial Performance Capture Via Deep Expression Matching. Computer Graphics Forum 41, 8 (2022), 243–254.
[2] DD. 2025. Digital Domain’s Masquerade Offline Capture. Retrieved Feb 11, 2025 from https://digitaldomain.com/technology/masquerade-offline-capture/
[3] Seonghyeon Kim, Sunjin Jung, Kwanggyoon Seo, Roger Blanco i Ribera, and Junyong Noh. 2021. Deep Learning-Based Unsupervised Human Facial Retargeting. Computer Graphics Forum 40, 7 (2021), 45–55.
[4] Alexandros Lattas, Stylianos Moschoglou, Stylianos Ploumpis, Baris Gecer, Jiankang Deng, and Stefanos Zafeiriou. 2023. FitMe: Deep Photorealistic 3D Morphable Model Avatars. In IEEE/CVF Conf. Computer Vision and Pattern Recognition (CVPR).
[5] Lucio Moser, Chinyu Chien, Mark Williams, Jose Serra, Darren Hendler, and Doug Roble. 2021. Semi-Supervised Video-Driven Facial Animation Transfer for Production. ACM Trans. Graph. 40, 6 (2021).
[6] Nikhila Ravi, Jeremy Reizenstein, David Novotny, Taylor Gordon, Wan-Yen Lo, Justin Johnson, and Georgia Gkioxari. 2020. Accelerating 3D Deep Learning with PyTorch3D. arXiv:2007.08501 (2020).
[7] Jose Serra, Mark Williams, and Lucio Moser. 2022. Accelerating facial motion capture with video-driven animation transfer. In ACM SIGGRAPH 2022 Talks. New York, NY, USA, Article 19, 2 pages.


