“Performance capture from sparse multi-view video” by de Aguiar, Stoll, Theobalt, Ahmed, Seidel, et al. …

  • ©Edilson de Aguiar, Carsten Stoll, Christian Theobalt, Naveed Ahmed, Hans-Peter Seidel, and Sebastian Thrun




    Performance capture from sparse multi-view video



    This paper proposes a new marker-less approach to capturing human performances from multi-view video. Our algorithm jointly reconstructs spatio-temporally coherent geometry, motion, and textural surface appearance of actors performing complex and rapid moves. Furthermore, since our algorithm is purely mesh-based and makes as few prior assumptions as possible about the type of subject being tracked, it can even capture performances of people wearing wide apparel, such as a dancer wearing a skirt. To this end, our method efficiently and effectively combines the power of surface- and volume-based shape deformation techniques with a new mesh-based analysis-through-synthesis framework. This framework extracts motion constraints from video and makes a laser scan of the tracked subject mimic the recorded performance. Small-scale time-varying shape detail is also recovered by applying model-guided multi-view stereo to refine the model surface. Our method delivers captured performance data at a high level of detail, is highly versatile, and is applicable to many complex types of scenes that could not be handled by alternative marker-based or marker-free recording techniques.


