3D shape regression for real-time facial animation

We present a real-time performance-driven facial animation system based on 3D shape regression. In this system, the 3D positions of facial landmark points are inferred by a regressor from 2D video frames of an ordinary web camera. From these 3D points, the pose and expressions of the face are recovered by fitting a user-specific blendshape model to them. The main technical contribution of this work is the 3D regression algorithm that learns an accurate, user-specific face alignment model from an easily acquired set of training data, generated from images of the user performing a sequence of predefined facial poses and expressions. Experiments show that our system can accurately recover 3D face shapes even for fast motions, non-frontal faces, and exaggerated expressions. In addition, some capacity to handle partial occlusions and changing lighting conditions is demonstrated.
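The blendshape-fitting step mentioned in the abstract can be illustrated with a minimal sketch. It rests on assumptions not stated in the abstract: the regressed 3D landmarks are taken to be already aligned with the model (rigid pose removed), the blendshape model is represented as a neutral shape plus per-expression offsets, and the weights are restricted to [0, 1]. The solver is plain projected gradient descent, chosen only for brevity. The names `blend` and `fit_weights` are hypothetical and do not come from the paper.

```python
# Hypothetical sketch: fit blendshape weights to regressed 3D landmarks.
# Assumption: landmarks are already in model space (rigid pose removed).
# Shapes are flat lists of coordinates [x0, y0, z0, x1, y1, z1, ...].


def blend(neutral, deltas, weights):
    """Return neutral + sum_i weights[i] * deltas[i]."""
    out = list(neutral)
    for w, delta in zip(weights, deltas):
        for k, d in enumerate(delta):
            out[k] += w * d
    return out


def fit_weights(neutral, deltas, target, steps=500):
    """Minimise ||blend(w) - target||^2 subject to 0 <= w_i <= 1.

    Uses projected gradient descent with a fixed, conservative step size.
    """
    weights = [0.0] * len(deltas)
    # Squared Frobenius norm of the offset matrix; it bounds the curvature
    # of the objective, so dividing by it keeps the iteration stable.
    frob = sum(d * d for delta in deltas for d in delta)
    if frob == 0.0:
        return weights
    for _ in range(steps):
        current = blend(neutral, deltas, weights)
        residual = [c - t for c, t in zip(current, target)]
        for i, delta in enumerate(deltas):
            # Half the gradient of the squared error with respect to w_i.
            grad = sum(d * r for d, r in zip(delta, residual))
            # Gradient step, then projection back onto [0, 1].
            weights[i] = min(1.0, max(0.0, weights[i] - grad / frob))
    return weights


# Toy model: two landmarks, two expression offsets (made-up values).
neutral = [0.0, 0.0, 0.0, 1.0, 0.0, 0.0]
deltas = [
    [0.0, 1.0, 0.0, 0.0, 0.0, 0.0],  # moves landmark 0 along y
    [0.0, 0.0, 0.0, 0.0, 0.0, 1.0],  # moves landmark 1 along z
]
target = blend(neutral, deltas, [0.3, 0.8])
print(fit_weights(neutral, deltas, target))
```

In a full tracker the rigid head pose would be estimated together with the weights; this sketch leaves that out to keep the fitting step readable.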

SIGGRAPH 2013: Technical Papers

“3D shape regression for real-time facial animation” by Cao, Weng, Lin and Zhou

Session/Category Title: Faces & Hands
