“3D shape regression for real-time facial animation” by Cao, Weng, Lin and Zhou

  • ©Chen Cao, Yanlin Weng, Steve Lin, and Kun Zhou

Session/Category Title: Faces & Hands



Abstract:


    We present a real-time performance-driven facial animation system based on 3D shape regression. In this system, the 3D positions of facial landmark points are inferred by a regressor from 2D video frames of an ordinary web camera. From these 3D points, the pose and expressions of the face are recovered by fitting a user-specific blendshape model to them. The main technical contribution of this work is the 3D regression algorithm that learns an accurate, user-specific face alignment model from an easily acquired set of training data, generated from images of the user performing a sequence of predefined facial poses and expressions. Experiments show that our system can accurately recover 3D face shapes even for fast motions, non-frontal faces, and exaggerated expressions. In addition, some capacity to handle partial occlusions and changing lighting conditions is demonstrated.
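
    The fitting step mentioned in the abstract, recovering expressions by fitting a blendshape model to regressed 3D landmark points, might be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: rigid head pose is ignored, the landmarks are assumed to be already aligned with the model, and plain projected gradient descent stands in for whatever solver the paper uses. The names `blend` and `fit_weights` and the toy data are hypothetical.

```python
def blend(neutral, deltas, weights):
    """Return neutral + sum_i weights[i] * deltas[i] over flattened xyz coordinates."""
    out = list(neutral)
    for w, delta in zip(weights, deltas):
        for k, d in enumerate(delta):
            out[k] += w * d
    return out


def fit_weights(neutral, deltas, target, steps=500):
    """Fit blendshape weights in [0, 1] to target landmarks (assumed pre-aligned).

    Minimizes 0.5 * ||blend(weights) - target||^2 by projected gradient descent.
    This solver is an assumption chosen for brevity, not the paper's method.
    """
    weights = [0.0] * len(deltas)
    # The sum of squared delta entries bounds the largest eigenvalue of the
    # Hessian, so 1 / bound is a safe (if conservative) step size.
    bound = sum(d * d for delta in deltas for d in delta) or 1.0
    step = 1.0 / bound
    for _ in range(steps):
        residual = [s - t for s, t in zip(blend(neutral, deltas, weights), target)]
        for i, delta in enumerate(deltas):
            grad = sum(r * d for r, d in zip(residual, delta))
            # Gradient step followed by projection onto the [0, 1] box.
            weights[i] = min(1.0, max(0.0, weights[i] - step * grad))
    return weights


# Toy data (hypothetical): two landmarks, flattened as [x0, y0, z0, x1, y1, z1].
neutral = [0.0, 0.0, 0.0, 1.0, 0.0, 0.0]
deltas = [
    [0.0, 1.0, 0.0, 0.0, 0.0, 0.0],  # expression 0 moves landmark 0 along y
    [0.0, 0.0, 0.0, 1.0, 0.0, 0.0],  # expression 1 moves landmark 1 along x
]
target = blend(neutral, deltas, [0.3, 0.8])
recovered = fit_weights(neutral, deltas, target)

# A target outside the model's range is clamped to the nearest valid weight.
clamped = fit_weights(neutral, deltas, blend(neutral, deltas, [1.5, 0.0]))
```

    In this toy setup `recovered` should come out close to the weights used to build `target`. A real system would also estimate the rigid head pose and use many more landmarks and expression shapes.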


