Dynamic 3D avatar creation from hand-held video input

We present a complete pipeline for creating fully rigged, personalized 3D facial avatars from hand-held video. Our system faithfully recovers facial expression dynamics of the user by adapting a blendshape template to an image sequence of recorded expressions using an optimization that integrates feature tracking, optical flow, and shape from shading. Fine-scale details such as wrinkles are captured separately in normal maps and ambient occlusion maps. From this user- and expression-specific data, we learn a regressor for on-the-fly detail synthesis during animation to enhance the perceptual realism of the avatars. Our system demonstrates that the use of appropriate reconstruction priors yields compelling face rigs even with a minimalistic acquisition system and limited user assistance. This facilitates a range of new applications in computer animation and consumer-level online communication based on personalized avatars. We present realtime application demos to validate our method.
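The abstract does not spell out the blendshape model it adapts. As a minimal sketch, assuming the common delta formulation (a neutral face plus a weighted sum of per-expression vertex offsets), a rig could be evaluated as below. The function name `evaluate_blendshapes` and the toy two-vertex data are hypothetical and not taken from the paper.

```python
from typing import List, Sequence

Vertex = List[float]  # [x, y, z]
Mesh = List[Vertex]   # one entry per vertex


def evaluate_blendshapes(neutral: Mesh,
                         deltas: Sequence[Mesh],
                         weights: Sequence[float]) -> Mesh:
    """Delta blendshape model: neutral + sum_i w_i * delta_i, per vertex."""
    if len(deltas) != len(weights):
        raise ValueError("need exactly one weight per blendshape")
    result = [list(v) for v in neutral]  # copy so the neutral mesh is untouched
    for delta, w in zip(deltas, weights):
        if len(delta) != len(neutral):
            raise ValueError("blendshape and neutral vertex counts differ")
        for v_out, v_delta in zip(result, delta):
            for axis in range(3):
                v_out[axis] += w * v_delta[axis]
    return result


# Toy data (hypothetical): a two-vertex "face" with two expression offsets.
neutral = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]]
smile = [[0.0, 0.5, 0.0], [0.0, 0.5, 0.0]]
jaw_open = [[0.0, -1.0, 0.0], [0.0, 0.0, 0.0]]

posed = evaluate_blendshapes(neutral, [smile, jaw_open], [1.0, 0.5])
```

In this formulation, personalizing the avatar amounts to estimating the neutral shape and the offsets for a given user, while animation only changes the weights from frame to frame.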

Overview Page:

SIGGRAPH 2015: Technical Papers

ACM SIGGRAPH HISTORY ARCHIVES

“Dynamic 3D avatar creation from hand-held video input” by Ichim, Bouaziz and Pauly

Session/Category Title: Face Reality
