“Stabilized real-time face tracking via a learned dynamic rigidity prior”

Conference:

    SIGGRAPH Asia 2018


Type(s):

    Technical Paper


Title:

    Stabilized real-time face tracking via a learned dynamic rigidity prior

Session/Category Title:   Faces, faces, faces


Presenter(s)/Author(s):

    Chen Cao


Abstract:


    Despite the popularity of real-time monocular face tracking systems in many successful applications, one overlooked problem with these systems is rigid instability. It occurs when the input facial motion can be explained by either a head pose change or a facial expression change, creating ambiguities that often lead to jittery and unstable rigid head poses under large expressions. Existing rigid stabilization methods either employ a heavy, anatomically motivated approach that is unsuitable for real-time applications, or rely on heuristic rules that can fail under certain expressions. We propose the first rigid stabilization method for real-time monocular face tracking that uses a dynamic rigidity prior learned from realistic datasets. The prior is defined on a region-based face model and provides dynamic, region-based adaptivity for rigid pose optimization during real-time performance. We introduce an effective offline training scheme that learns the dynamic rigidity prior by optimizing the convergence of the rigid pose optimization to the ground-truth poses in the training data. Our real-time face tracking system is an optimization framework that alternates between rigid pose optimization and expression optimization. To ensure tracking accuracy, we combine both robust, drift-free facial landmarks and dense optical flow in the optimization objectives. We evaluate our system extensively against state-of-the-art monocular face tracking systems and achieve a significant improvement in tracking accuracy on a high-quality face tracking benchmark. Our system can improve facial-performance-based applications, such as facial animation retargeting and virtual face makeup, with accurate expressions and stable poses. We further validate the dynamic rigidity prior by comparing it against other variants in terms of tracking accuracy.
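
Illustrative sketch:

For intuition, the following Python sketch mirrors the alternating optimization the abstract describes: a rigid pose solve whose per-vertex weights come from a dynamic rigidity prior, interleaved with an expression solve under the fixed pose. It is not the authors' implementation; the toy blendshape model, camera intrinsics, regularizer, and the rigidity_prior interface are all illustrative assumptions, and the paper's dense optical-flow term is omitted here, leaving only a landmark reprojection term.

    # Hypothetical sketch of the alternating rigid-pose / expression
    # optimization; names and interfaces are assumptions, not the paper's API.
    import numpy as np
    from scipy.optimize import least_squares

    def rodrigues(r):
        """Axis-angle vector -> 3x3 rotation matrix."""
        theta = np.linalg.norm(r)
        if theta < 1e-12:
            return np.eye(3)
        k = r / theta
        K = np.array([[0.0, -k[2], k[1]],
                      [k[2], 0.0, -k[0]],
                      [-k[1], k[0], 0.0]])
        return np.eye(3) + np.sin(theta) * K + (1.0 - np.cos(theta)) * (K @ K)

    def project(pts, f=1000.0, cx=320.0, cy=240.0):
        """Pinhole projection with assumed fixed intrinsics (depth > 0)."""
        return f * pts[:, :2] / pts[:, 2:3] + np.array([cx, cy])

    def blendshape(base, B, expr):
        """Toy linear blendshape model: vertices = base + sum_k expr[k] * B[k]."""
        return base + np.tensordot(expr, B, axes=1)

    def rigid_residuals(pose, verts, lm2d, w):
        """Landmark reprojection residuals, scaled per vertex by the dynamic
        rigidity weights w (large w = region treated as rigid, so it
        dominates the head-pose solve)."""
        R, t = rodrigues(pose[:3]), pose[3:]
        return (np.sqrt(w)[:, None] * (project(verts @ R.T + t) - lm2d)).ravel()

    def expr_residuals(expr, pose, base, B, lm2d, reg=0.1):
        """Expression residuals under a fixed head pose, with a small L2
        regularizer on the blendshape coefficients."""
        R, t = rodrigues(pose[:3]), pose[3:]
        fit = (project(blendshape(base, B, expr) @ R.T + t) - lm2d).ravel()
        return np.concatenate([fit, reg * expr])

    def track_frame(base, B, lm2d, rigidity_prior, pose, expr, n_alt=3):
        """Alternate rigid-pose and expression solves for one frame.
        rigidity_prior(expr) stands in for the learned dynamic prior: it maps
        the current expression to per-vertex rigidity weights."""
        for _ in range(n_alt):
            w = rigidity_prior(expr)          # dynamic, expression-dependent
            verts = blendshape(base, B, expr)
            pose = least_squares(rigid_residuals, pose,
                                 args=(verts, lm2d, w)).x
            expr = least_squares(expr_residuals, expr,
                                 args=(pose, base, B, lm2d)).x
        return pose, expr

In the paper, the weights come from a prior trained offline so that the rigid solve converges to the ground-truth poses in the training data; replacing rigidity_prior with uniform weights (e.g. lambda e: np.ones(n_verts)) reduces the sketch to an ordinary pose fit that is prone to exactly the rigid instability described above.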
