Facial Performance Enhancement Using Dynamic Shape-Space Analysis

The facial performance of an individual is inherently rich in subtle deformation and timing details. Although these subtleties make the performance realistic and compelling, they often elude both motion capture and hand animation. We present a technique for adding fine-scale details and expressiveness to low-resolution art-directed facial performances, such as those created manually using a rig, via marker-based capture, by fitting a morphable model to a video, or through Kinect reconstruction using recent faceshift technology. We employ a high-resolution facial performance capture system to acquire a representative performance of an individual in which he or she explores the full range of facial expressiveness. From the captured data, our system extracts an expressiveness model that encodes subtle spatial and temporal deformation details specific to that particular individual. Once this model has been built, these details can be transferred to low-resolution art-directed performances. We demonstrate results on various forms of input; after our enhancement, the resulting animations exhibit the same nuances and fine spatial details as the captured performance, with optional temporal enhancement to match the dynamics of the actor. Finally, we show that our technique outperforms the current state of the art in example-based facial animation.


SIGGRAPH 2014: Technical Papers

“Facial Performance Enhancement Using Dynamic Shape-Space Analysis” by Bermano, Bradley, Beeler, Zund, Nowrouzezahrai, et al. …


Session/Category Title: Faces
