“Dynamic 3D avatar creation from hand-held video input” by Ichim, Bouaziz and Pauly

  • ©Alexandru Eugen Ichim, Sofien Bouaziz, and Mark Pauly



Session Title:

    Face Reality






    We present a complete pipeline for creating fully rigged, personalized 3D facial avatars from hand-held video. Our system faithfully recovers facial expression dynamics of the user by adapting a blendshape template to an image sequence of recorded expressions using an optimization that integrates feature tracking, optical flow, and shape from shading. Fine-scale details such as wrinkles are captured separately in normal maps and ambient occlusion maps. From this user- and expression-specific data, we learn a regressor for on-the-fly detail synthesis during animation to enhance the perceptual realism of the avatars. Our system demonstrates that the use of appropriate reconstruction priors yields compelling face rigs even with a minimalistic acquisition system and limited user assistance. This facilitates a range of new applications in computer animation and consumer-level online communication based on personalized avatars. We present realtime application demos to validate our method.
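    The abstract refers to a blendshape template without spelling out how such a model is evaluated. As a rough illustration only, the sketch below uses the common delta formulation, in which an expression is the neutral face plus a weighted sum of per-target offsets. The function name, the list-of-triples mesh layout, and the toy numbers are assumptions made for this example; they are not taken from the paper, and the sketch does not cover the fitting optimization or the detail regressor.

```python
from typing import List, Sequence

Vertex = List[float]  # one coordinate triple [x, y, z]; hypothetical mesh layout


def blend(neutral: Sequence[Vertex],
          targets: Sequence[Sequence[Vertex]],
          weights: Sequence[float]) -> List[Vertex]:
    """Evaluate a delta blendshape model: neutral + sum_i w_i * (target_i - neutral)."""
    if len(targets) != len(weights):
        raise ValueError("need exactly one weight per blendshape target")
    # Start from a copy of the neutral face so the input is not modified.
    result = [list(v) for v in neutral]
    for target, w in zip(targets, weights):
        for out_v, base_v, tgt_v in zip(result, neutral, target):
            for k in range(3):
                # Add this target's weighted offset from the neutral pose.
                out_v[k] += w * (tgt_v[k] - base_v[k])
    return result


# Toy two-vertex "face" with two expression targets (made-up numbers).
neutral = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]]
smile = [[0.0, 1.0, 0.0], [1.0, 1.0, 0.0]]
jaw_open = [[0.0, 0.0, 0.0], [1.0, -2.0, 0.0]]

half_smile = blend(neutral, [smile, jaw_open], [0.5, 0.0])
```

    In a rig of this kind, animating the avatar amounts to changing the weight vector per frame, which is why a personalized set of targets is the central output of such a pipeline.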



ACM Digital Library Publication: