“Real-time hand-tracking with a color glove” by Wang and Popović

  • ©Robert Y. Wang and Jovan Popović







    Articulated hand-tracking systems have been widely used in virtual reality but are rarely deployed in consumer applications due to their price and complexity. In this paper, we propose an easy-to-use and inexpensive system that facilitates 3-D articulated user-input using the hands. Our approach uses a single camera to track a hand wearing an ordinary cloth glove that is imprinted with a custom pattern. The pattern is designed to simplify the pose estimation problem, allowing us to employ a nearest-neighbor approach to track hands at interactive rates. We describe several proof-of-concept applications enabled by our system that we hope will provide a foundation for new interactions in modeling, animation control and augmented reality.
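The nearest-neighbor idea mentioned in the abstract can be illustrated with a minimal sketch. Everything concrete below is an assumption made for illustration, not the authors' implementation: the representation of a glove image as a flat list of color-patch labels, the Hamming-style distance, the brute-force search, and the names `label_distance`, `nearest_pose` and `DATABASE` are all hypothetical.

```python
from typing import List, Sequence, Tuple

# Hypothetical types: a "tiny image" is a flat list of color-patch labels
# (0 = background), and a pose is a list of joint angles.
TinyImage = Sequence[int]
Pose = Sequence[float]


def label_distance(a: TinyImage, b: TinyImage) -> int:
    """Count pixels whose color-patch labels differ (a simple Hamming distance)."""
    if len(a) != len(b):
        raise ValueError("images must have the same number of pixels")
    return sum(1 for x, y in zip(a, b) if x != y)


def nearest_pose(query: TinyImage, database: List[Tuple[TinyImage, Pose]]) -> Pose:
    """Return the pose stored with the database image closest to the query."""
    if not database:
        raise ValueError("database is empty")
    # Brute-force scan; on ties, min() keeps the earliest database entry.
    _, best_pose = min(database, key=lambda entry: label_distance(query, entry[0]))
    return best_pose


# Toy 2x2 "images" with made-up poses, purely for illustration.
DATABASE = [
    ([1, 1, 0, 0], [0.0, 0.0]),  # e.g. open hand
    ([1, 2, 2, 0], [0.5, 1.0]),  # e.g. partially closed hand
    ([2, 2, 2, 2], [1.5, 1.5]),  # e.g. fist
]

estimated = nearest_pose([1, 2, 2, 1], DATABASE)
print(estimated)
```

In this sketch the query differs from the second database image in a single pixel, so its stored pose is returned. A distinctive glove pattern helps such a lookup because different poses tend to produce clearly different label images; a practical system would likely need a far larger database and a faster search than this linear scan.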


