Performance capture from sparse multi-view video

This paper proposes a new marker-less approach to capturing human performances from multi-view video. Our algorithm can jointly reconstruct spatio-temporally coherent geometry, motion and textural surface appearance of actors that perform complex and rapid moves. Furthermore, since our algorithm is purely meshbased and makes as few as possible prior assumptions about the type of subject being tracked, it can even capture performances of people wearing wide apparel, such as a dancer wearing a skirt. To serve this purpose our method efficiently and effectively combines the power of surface- and volume-based shape deformation techniques with a new mesh-based analysis-through-synthesis framework. This framework extracts motion constraints from video and makes the laser-scan of the tracked subject mimic the recorded performance. Also small-scale time-varying shape detail is recovered by applying model-guided multi-view stereo to refine the model surface. Our method delivers captured performance data at high level of detail, is highly versatile, and is applicable to many complex types of scenes that could not be handled by alternative marker-based or marker-free recording techniques.

References:

1. Allen, B., Curless, B., and Popović, Z. 2002. Articulated body deformation from range scan data. ACM Trans. Graph. 21, 3, 612–619. Google ScholarDigital Library
2. Balan, A. O., Sigal, L., Black, M. J., Davis, J. E., and Haussecker, H. W. 2007. Detailed human shape and pose from images. In Proc. CVPR.Google Scholar
3. Bickel, B., Botsch, M., Angst, R., Matusik, W., Otaduy, M., Pfister, H., and Gross, M. 2007. Multi-scale capture of facial geometry and motion. In Proc. of SIGGRAPH, 33. Google ScholarDigital Library
4. Botsch, M., and Sorkine, O. 2008. On linear variational surface deformation methods. IEEE TVCG 14, 1, 213–230. Google ScholarDigital Library
5. Botsch, M., Pauly, M., Wicke, M., and Gross, M. 2007. Adaptive space deformations based on rigid cells. Computer Graphics Forum 26, 3, 339–347.Google ScholarCross Ref
6. Byrd, R., Lu, P., Nocedal, J., and Zhu, C. 1995. A limited memory algorithm for bound constrained optimization. SIAM J. Sci. Comp. 16, 5, 1190–1208. Google ScholarDigital Library
7. Carranza, J., Theobalt, C., Magnor, M., and Seidel, H.-P. 2003. Free-viewpoint video of human actors. In Proc. SIGGRAPH, 569–577. Google ScholarDigital Library
8. de Aguiar, E., Theobalt, C., Stoll, C., and Seidel, H.-P. 2007. Marker-less deformable mesh tracking for human shape and motion capture. In Proc. CVPR, IEEE, 1–8.Google Scholar
9. de Aguiar, E., Theobalt, C., Stoll, C., and Seidel, H. 2007. Marker-less 3d feature tracking for mesh-based human motion capture. In Proc. ICCV HUMO07, 1–15. Google ScholarDigital Library
10. de Aguiar, E., Theobalt, C., Thrun, S., and Seidel, H.-P. 2008. Automatic conversion of mesh animations into skeleton-based animations. Computer Graphics Forum (Proc. Eurographics EG’08) 27, 2 (4), 389–397.Google Scholar
11. Einarsson, P., Chabert, C.-F., Jones, A., Ma, W.-C., Lamond, B., im Hawkins, Bolas, M., Sylwan, S., and Debevec, P. 2006. Relighting human locomotion with flowed reflectance fields. In Proc. EGSR, 183–194. Google ScholarCross Ref
12. Goesele, M., Curless, B., and Seitz, S. M. 2006. Multiview stereo revisited. In Proc. CVPR, 2402–2409. Google ScholarDigital Library
13. Gross, M., Würmlin, S., Näf, M., Lamboray, E., Spagno, C., Kunz, A., Koller-Meier, E., Svoboda, T., Gool, L. V., Lang, S., Strehlke, K., Moere, A. V., and Staadt, O. 2003. blue-c: a spatially immersive display and 3d video portal for telepresence. ACM TOG 22, 3, 819–827. Google ScholarDigital Library
14. Kanade, T., Rander, P., and Narayanan, P. J. 1997. Virtualized reality: Constructing virtual worlds from real scenes. IEEE MultiMedia 4, 1, 34–47. Google ScholarDigital Library
15. Kazhdan, M., Bolitho, M., and Hoppe, H. 2006. Poisson surface reconstruction. In Proc. SGP, 61–70. Google ScholarDigital Library
16. Leordeanu, M., and Hebert, M. 2005. A spectral technique for correspondence problems using pairwise constraints. In Proc. ICCV. Google ScholarDigital Library
17. Lowe, D. G. 1999. Object recognition from local scale-invariant features. In Proc. ICCV, vol. 2, 1150ff. Google ScholarDigital Library
18. Matusik, W., Buehler, C., Raskar, R., Gortler, S., and McMillan, L. 2000. Image-based visual hulls. In Proc. SIGGRAPH, 369–374. Google ScholarDigital Library
19. Menache, A., and Manache, A. 1999. Understanding Motion Capture for Computer Animation and Video Games. Morgan Kaufmann. Google ScholarDigital Library
20. Mitra, N. J., Flory, S., Ovsjanikov, M., Gelfand, N., as, L. G., and Pottmann, H. 2007. Dynamic geometry registration. In Proc. SGP, 173–182. Google ScholarDigital Library
21. Moeslund, T. B., Hilton, A., and Krüger, V. 2006. A survey of advances in vision-based human motion capture and analysis. Comput. Vis. Image Underst. 104, 2, 90–126. Google ScholarDigital Library
22. Müller, M., Dorsey, J., McMillan, L., Jagnow, R., and Cutler, B. 2002. Stable real-time deformations. In Proc. of SCA, ACM, 49–54. Google ScholarDigital Library
23. Paramount, 2007. Beowulf movie page. http://www.beowulfmovie.com/.Google Scholar
24. Park, S. I., and Hodgins, J. K. 2006. Capturing and animating skin deformation in human motion. ACM TOG (SIGGRAPH 2006) 25, 3 (Aug.). Google ScholarDigital Library
25. Poppe, R. 2007. Vision-based human motion analysis: An overview. CVIU 108, 1. Google ScholarDigital Library
26. Rosenhahn, B., Kersting, U., Powel, K., and Seidel, H.-P. 2006. Cloth x-ray: Mocap of people wearing textiles. In LNCS 4174: Proc. DAGM, 495–504. Google ScholarDigital Library
27. Sand, P., McMillan, L., and Popović, J. 2003. Continuous capture of skin deformation. ACM TOG 22, 3. Google ScholarDigital Library
28. Scholz, V., Stich, T., Keckeisen, M., Wacker, M., and Magnor, M. 2005. Garment motion capture using colorcoded patterns. Computer Graphics Forum (Proc. Eurographics EG’05) 24, 3 (Aug.), 439–448. Google ScholarDigital Library
29. Shinya, M. 2004. Unifying measured point sequences of deforming objects. In Proc. of 3DPVT, 904–911. Google ScholarCross Ref
30. Sorkine, O., and Alexa, M. 2007. As-rigid-as-possible surface modeling. In Proc. SGP, 109–116. Google ScholarDigital Library
31. Starck, J., and Hilton, A. 2007. Surface capture for performance based animation. IEEE CGAA 27(3), 21–31. Google ScholarDigital Library
32. Stoll, C., Karni, Z., Rössl, C., Yamauchi, H., and Seidel, H.-P. 2006. Template deformation for point cloud fitting. In Proc. SGP, 27–35. Google ScholarCross Ref
33. Sumner, R. W., and Popović, J. 2004. Deformation transfer for triangle meshes. In SIGGRAPH ’04, 399–405. Google ScholarDigital Library
34. Vedula, S., Baker, S., and Kanade, T. 2005. Image-based spatio-temporal modeling and view interpolation of dynamic events. ACM Trans. Graph. 24, 2, 240–261. Google ScholarDigital Library
35. Wand, M., Jenke, P., Huang, Q., Bokeloh, M., Guibas, L., and Schilling, A. 2007. Reconstruction of deforming geometry from time-varying point clouds. In Proc. SGP, 49–58. Google ScholarDigital Library
36. Waschbüsch, M., Würmlin, S., Cotting, D., Sadlo, F., and Gross, M. 2005. Scalable 3D video of dynamic scenes. In Proc. Pacific Graphics, 629–638.Google Scholar
37. White, R., Crane, K., and Forsyth, D. 2007. Capturing and animating occluded cloth. In ACM TOG (Proc. SIGGRAPH). Google ScholarDigital Library
38. Wilburn, B., Joshi, N., Vaish, V., Talvala, E., Antunez, E., Barth, A., Adams, A., Horowitz, M., and Levoy, M. 2005. High performance imaging using large camera arrays. ACM TOG 24, 3, 765–776. Google ScholarDigital Library
39. Xu, W., Zhou, K., Yu, Y., Tan, Q., Peng, Q., and Guo, B. 2007. Gradient domain editing of deforming mesh sequences. In Proc. SIGGRAPH, ACM, 84ff. Google ScholarDigital Library
40. Yamauchi, H., Gumhold, S., Zayer, R., and Seidel, H.-P. 2005. Mesh segmentation driven by gaussian curvature. Visual Computer 21, 8–10, 649–658.Google ScholarCross Ref
41. Zitnick, C. L., Kang, S. B., Uyttendaele, M., Winder, S., and Szeliski, R. 2004. High-quality video view interpolation using a layered representation. ACM TOG 23, 3, 600–608. Google ScholarDigital Library

ACM Digital Library Publication:

Overview Page:

SIGGRAPH 2008: Technical Papers

“Performance capture from sparse multi-view video” by de Aguiar, Stoll, Theobalt, Ahmed, Seidel, et al. …

Conference:

Type(s):

Title:

Presenter(s)/Author(s):

Abstract:

References:

ACM Digital Library Publication:

Overview Page:

Sponsored by: