“Accurate markerless jaw tracking for facial performance capture” by Zoss, Beeler, Gross and Bradley

  • ©Gaspard Zoss, Thabo Beeler, Markus Gross, and Derek Bradley






Session Title: Human Capture and Modeling


    We present the first method to accurately track the invisible jaw based solely on the visible skin surface, without the need for any markers or augmentation of the actor. As such, the method can readily be integrated with off-the-shelf facial performance capture systems. The core idea is to learn a non-linear mapping from the skin deformation to the underlying jaw motion on a dataset where ground-truth jaw poses have been acquired, and then to retarget the mapping to new subjects. Solving for the jaw pose plays a central role in visual effects pipelines, since accurate jaw motion is required when retargeting to fantasy characters and for physical simulation. Currently, this task is performed mostly manually to achieve the desired level of accuracy, and the presented method has the potential to fully automate this labour-intensive and error-prone process.


