“Articulated mesh animation from multi-view silhouettes” by Vlasic, Baran, Matusik and Popović

  • ©Daniel Vlasic, Ilya Baran, Wojciech Matusik, and Jovan Popović




    Articulated mesh animation from multi-view silhouettes



    Details in mesh animations are difficult to generate, but they have a great impact on visual quality. In this work, we demonstrate a practical software system for capturing such details from multi-view video recordings. Given a stream of synchronized video images that record a human performance from multiple viewpoints and an articulated template of the performer, our system captures the motion of both the skeleton and the shape. The output mesh animation is enhanced with the details observed in the image silhouettes. For example, a performance in casual loose-fitting clothes will generate mesh animations with flowing garment motions. We accomplish this with a fast pose tracking method followed by nonrigid deformation of the template to fit the silhouettes. The entire process takes less than sixteen seconds per frame and requires no markers or texture cues. Captured meshes are in full correspondence, making them readily usable for editing operations, including texturing, deformation transfer, and deformation model learning.
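As a rough illustration of what "fitting the silhouettes" can mean, the sketch below scores how well a set of template vertices agrees with binary silhouette masks from several calibrated cameras. It is not the authors' implementation: the function names, the 3x4 projection-matrix camera model, the boolean mask representation, and the simple "fraction of vertices inside the silhouette" score are all assumptions made for illustration.

```python
import numpy as np

def project(P, X):
    """Project Nx3 world points X to Nx2 pixel coordinates with a 3x4 camera matrix P.

    Assumes every point lies in front of the camera (positive depth).
    """
    Xh = np.hstack([X, np.ones((X.shape[0], 1))])  # homogeneous coordinates
    x = Xh @ P.T                                   # Nx3 homogeneous image points
    return x[:, :2] / x[:, 2:3]                    # perspective divide

def inside_fraction(P, mask, X):
    """Fraction of points whose projection lands on a foreground pixel of `mask`.

    `mask` is a boolean image indexed as mask[row, column]; points that
    project outside the image count as misses.
    """
    uv = np.rint(project(P, X)).astype(int)
    u, v = uv[:, 0], uv[:, 1]
    h, w = mask.shape
    in_image = (u >= 0) & (u < w) & (v >= 0) & (v < h)
    hit = np.zeros(len(X), dtype=bool)
    hit[in_image] = mask[v[in_image], u[in_image]]
    return float(hit.mean())

def silhouette_score(cameras, masks, X):
    """Average inside-fraction over all views (hypothetical fit measure)."""
    return float(np.mean([inside_fraction(P, m, X) for P, m in zip(cameras, masks)]))
```

A score of 1.0 means every vertex projects inside every silhouette. That is only a necessary condition for a good fit: a mesh that is too small also scores 1.0, so a practical system would additionally need to pull the mesh outline toward the silhouette contours.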


