Video face replacement

We present a method for replacing facial performances in video. Our approach accounts for differences in identity, visual appearance, speech, and timing between source and target videos. Unlike prior work, it requires neither substantial manual intervention nor complex acquisition hardware, only single-camera video. We use a 3D multilinear model to track the facial performance in both videos. Using the corresponding 3D geometry, we warp the source face to the target face and retime the source to match the target performance. We then compute an optimal seam through the video volume that maintains temporal consistency in the final composite. We demonstrate our method on a variety of examples and present a user study suggesting that our results are difficult to distinguish from real video footage.
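The abstract states that the source is retimed to match the target performance but does not say how. As an illustration only, the sketch below uses dynamic time warping, a standard sequence-alignment technique, which may or may not be the authors' choice. The per-frame scalar feature (for example, a mouth-opening value taken from the tracked geometry) and the function names `dtw_path` and `retime` are hypothetical.

```python
from math import inf


def dtw_path(source, target):
    """Align two sequences of per-frame scalar features with dynamic time warping.

    Returns a list of (source_index, target_index) pairs, ordered from the
    first frame to the last. Both sequences must be non-empty.
    """
    n, m = len(source), len(target)
    # cost[i][j] = minimal accumulated cost of aligning source[:i] with target[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(source[i - 1] - target[j - 1])
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])

    # Backtrack from the end, always moving to the cheapest predecessor.
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        candidates = [
            (cost[i - 1][j - 1], i - 1, j - 1),  # advance both sequences
            (cost[i - 1][j], i - 1, j),          # advance source only
            (cost[i][j - 1], i, j - 1),          # advance target only
        ]
        _, i, j = min(candidates)
    path.reverse()
    return path


def retime(source, target):
    """For each target frame, return the index of the source frame to show.

    When several source frames align to one target frame, the last one wins
    (an arbitrary choice made for this sketch).
    """
    mapping = {}
    for s, t in dtw_path(source, target):
        mapping[t] = s
    return [mapping[t] for t in range(len(target))]


if __name__ == "__main__":
    # Hypothetical features: the target holds some poses longer than the source.
    source_feature = [0, 1, 2, 3]
    target_feature = [0, 0, 1, 2, 2, 3]
    print(retime(source_feature, target_feature))
```

In this toy setup, the returned list gives one source frame index per target frame, so source frames are repeated wherever the target performance lingers. A real system would align multi-dimensional features and would likely interpolate between frames rather than repeat them.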

SIGGRAPH Asia 2011: Technical Papers
