“Video-based hand manipulation capture through composite motion control” by Wang, Min, Zhang, Liu, Dai, et al. …

  • ©Yangang Wang, Jianyuan Min, Jianjie Zhang, Yebin Liu, Qionghai Dai, and Jinxiang Chai






Session Title: Faces & Hands



    This paper describes a new method for acquiring physically realistic hand manipulation data from multiple video streams. The key idea of our approach is to introduce a composite motion control that simultaneously models hand articulation, object movement, and the subtle interaction between the hand and the object. We formulate video-based hand manipulation capture as an optimization problem that maximizes the consistency between the simulated motion and the observed image data: we search for an optimal motion control that drives the simulation to best match the observations. We demonstrate the effectiveness of our approach by capturing a wide range of high-fidelity dexterous manipulation data. We show the power of the recovered motion controllers by adapting the captured motion data to new objects with different properties. The system achieves superior performance compared with alternative methods such as marker-based motion capture and kinematic hand motion tracking.
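    The idea of searching for a control that makes a simulation match observations can be illustrated with a toy sketch. Everything in it is an assumption made for illustration, not the method of the paper: the "hand" is reduced to a single unit-mass 1-D joint, the control is a pair of hypothetical PD gains, the "observed image data" is replaced by an observed trajectory, and the search is plain random hill climbing. The function names `simulate`, `mismatch`, and `search_control` are invented here.

```python
import random


def simulate(gains, target, steps=50, dt=0.02):
    """Simulate a unit-mass 1-D joint driven toward `target` by a PD controller.

    `gains` is a hypothetical (kp, kd) pair standing in for a motion control.
    """
    kp, kd = gains
    x, v = 0.0, 0.0
    trajectory = []
    for _ in range(steps):
        force = kp * (target - x) - kd * v
        v += force * dt  # semi-implicit Euler: update velocity first
        x += v * dt      # then position, using the new velocity
        trajectory.append(x)
    return trajectory


def mismatch(simulated, observed):
    """Sum of squared differences between simulated and observed samples."""
    return sum((s - o) ** 2 for s, o in zip(simulated, observed))


def search_control(observed, target, iters=300, seed=0):
    """Random hill climbing over the gains; keeps a candidate only if it matches better."""
    rng = random.Random(seed)
    best = (10.0, 1.0)  # arbitrary initial guess (assumption)
    best_cost = mismatch(simulate(best, target), observed)
    for _ in range(iters):
        candidate = (
            max(0.0, best[0] + rng.gauss(0.0, 5.0)),
            max(0.0, best[1] + rng.gauss(0.0, 1.0)),
        )
        cost = mismatch(simulate(candidate, target), observed)
        if cost < best_cost:
            best, best_cost = candidate, cost
    return best, best_cost


if __name__ == "__main__":
    # Synthetic "observation" generated with gains the search does not know.
    observed = simulate((80.0, 12.0), target=1.0)
    gains, cost = search_control(observed, target=1.0)
    print("recovered gains:", gains, "mismatch:", cost)
```

    Because a candidate is accepted only when it lowers the mismatch, the returned cost can never exceed that of the initial guess. A real system along the lines of the abstract would replace the 1-D joint with a simulation of the hand and object in contact, and the trajectory comparison with a comparison against multi-view image features.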


