VideoMocap: modeling physically realistic human motion from monocular video sequences

Xiaolin Wei; Jinxiang Chai

Conference:

SIGGRAPH 2010

Type(s):

Technical Papers

Abstract:

This paper presents a video-based motion modeling technique for capturing physically realistic human motion from monocular video sequences. We formulate the video-based motion modeling process in an image-based keyframe animation framework. The system first computes camera parameters, human skeletal size, and a small number of 3D key poses from video and then uses 2D image measurements at intermediate frames to automatically calculate the “in between” poses. During reconstruction, we leverage Newtonian physics, contact constraints, and 2D image measurements to simultaneously reconstruct full-body poses, joint torques, and contact forces. We have demonstrated the power and effectiveness of our system by generating a wide variety of physically realistic human actions from uncalibrated monocular video sequences such as sports video footage.
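The idea of computing "in between" poses from key poses, image measurements, and Newtonian physics can be illustrated on a much smaller problem. The sketch below is not the authors' formulation: it assumes a single point mass in vertical ballistic flight, two known key heights, and noisy height measurements at the intermediate frames. It recovers the intermediate heights by least squares, combining a measurement term with a penalty on deviation of the finite-difference acceleration from gravity. The function names, the weight `w`, and all numeric values are hypothetical choices made for illustration.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def inbetween(y_start, y_end, measured, dt, g=9.81, w=1e-3):
    """Estimate heights at the frames between two key heights.

    measured: noisy heights at the interior frames (assumed input).
    w: weight of the physics term relative to the measurement term
       (an illustrative choice, not a value from the paper).
    """
    n = len(measured)
    rows, rhs = [], []
    # Measurement term: each unknown height should match its observation.
    for i in range(n):
        row = [0.0] * n
        row[i] = 1.0
        rows.append(row)
        rhs.append(measured[i])
    # Physics term: (y[i-1] - 2 y[i] + y[i+1]) / dt^2 should equal -g.
    s = (w ** 0.5) / dt ** 2
    for i in range(n):
        row = [0.0] * n
        row[i] = -2.0 * s
        target = -(w ** 0.5) * g
        if i > 0:
            row[i - 1] = s
        else:
            target -= s * y_start  # known key height moves to the right side
        if i < n - 1:
            row[i + 1] = s
        else:
            target -= s * y_end
        rows.append(row)
        rhs.append(target)
    # Normal equations (A^T A) x = A^T b of the stacked least-squares system.
    ata = [[sum(r[j] * r[k] for r in rows) for k in range(n)] for j in range(n)]
    atb = [sum(r[j] * t for r, t in zip(rows, rhs)) for j in range(n)]
    return solve_linear(ata, atb)


def rms(a, b):
    """Root-mean-square difference between two equal-length sequences."""
    return (sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)) ** 0.5


# Synthetic test data: a ballistic arc sampled at 30 frames per second.
dt, g, v0, N = 1.0 / 30.0, 9.81, 3.0, 12
truth = [v0 * k * dt - 0.5 * g * (k * dt) ** 2 for k in range(N + 1)]
noise = [0.05 if k % 2 else -0.05 for k in range(1, N)]  # deterministic jitter
measured = [truth[k] + noise[k - 1] for k in range(1, N)]

fit = inbetween(truth[0], truth[N], measured, dt, g=g)
raw_err = rms(measured, truth[1:N])
fit_err = rms(fit, truth[1:N])
print("raw measurement RMS error:", raw_err)
print("physics-regularized RMS error:", fit_err)
```

On this synthetic data the physics penalty pulls the estimate toward a trajectory consistent with gravity, so the reconstructed heights lie closer to the true arc than the raw measurements do. A full-body system as described in the abstract would additionally involve articulated dynamics, joint torques, contact forces, and camera projection, none of which this sketch models.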

