“Free-viewpoint video of human actors” by Carranza, Theobalt, Magnor and Seidel

  • ©Joel Carranza, Christian Theobalt, Marcus Magnor, and Hans-Peter Seidel




    Free-viewpoint video of human actors



    In free-viewpoint video, the viewer can interactively choose a viewpoint in 3D space to observe the action of a dynamic real-world scene from arbitrary perspectives. The human body and its motion play a central role in most visual media, and the body's structure can be exploited for robust motion estimation and efficient visualization. This paper describes a system that uses multi-view synchronized video footage of an actor’s performance to estimate motion parameters and to interactively re-render the actor’s appearance from any viewpoint. The actor’s silhouettes are extracted from synchronized video frames via background segmentation and then used to determine a sequence of poses for a 3D human body model. By employing multi-view texturing during rendering, time-dependent changes in the body surface are reproduced in high detail. The motion capture subsystem runs offline, is non-intrusive, yields robust motion parameter estimates, and can cope with a broad range of motion. The rendering subsystem runs at real-time frame rates using ubiquitous graphics hardware, yielding a highly naturalistic impression of the actor. The actor can be placed in virtual environments to create composite dynamic scenes. Free-viewpoint video makes it possible to create camera fly-throughs or to view the action interactively from arbitrary perspectives.
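
    The silhouette step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes grayscale frames stored as nested lists, a single static background image per camera, and a fixed intensity threshold. The function names, the threshold value, and the use of a pixel mismatch count to compare an observed silhouette with a rendered model silhouette are all hypothetical choices made for illustration.

```python
from typing import List

Image = List[List[int]]   # grayscale intensities, 0-255 (assumed representation)
Mask = List[List[bool]]   # True marks a foreground (actor) pixel


def extract_silhouette(frame: Image, background: Image, threshold: int = 30) -> Mask:
    """Mark pixels whose intensity differs from the background by more than threshold.

    The threshold of 30 is an arbitrary illustrative value, not one from the paper.
    """
    if len(frame) != len(background):
        raise ValueError("frame and background must have the same height")
    mask: Mask = []
    for frame_row, bg_row in zip(frame, background):
        if len(frame_row) != len(bg_row):
            raise ValueError("frame and background must have the same width")
        mask.append([abs(f - b) > threshold for f, b in zip(frame_row, bg_row)])
    return mask


def silhouette_mismatch(observed: Mask, rendered: Mask) -> int:
    """Count pixels where the two masks disagree; 0 means the silhouettes coincide."""
    return sum(
        o != r
        for o_row, r_row in zip(observed, rendered)
        for o, r in zip(o_row, r_row)
    )


# Toy 3x4 example: a static background and one frame containing a bright "actor".
background = [[10, 10, 10, 10],
              [10, 10, 10, 10],
              [10, 10, 10, 10]]
frame = [[10, 200, 200, 10],
         [12, 180, 190, 11],
         [10, 10, 9, 10]]

silhouette = extract_silhouette(frame, background)

# Hypothetical silhouette of a body model rendered in some candidate pose.
candidate = [[False, True, True, False],
             [False, False, True, True],
             [False, False, False, False]]

mismatch = silhouette_mismatch(silhouette, candidate)
```

    In a multi-view setting, a score of this kind could be summed over all cameras and minimized over the body model's pose parameters, so that the pose whose rendered silhouettes best cover the observed ones is selected for each time step.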


