“NIMBLE: a non-rigid hand model with bones and muscles” by Li, Zhang, Qiu, Jiang, Li, et al. …

  • ©Yuwei Li, Longwen Zhang, Zesong Qiu, Yingwenqi Jiang, Nianyi Li, Yuexin Ma, Yuyao Zhang, Lan Xu, and Jingyi Yu




    NIMBLE: a non-rigid hand model with bones and muscles



    Emerging Metaverse applications demand reliable, accurate, and photorealistic reproductions of human hands to perform sophisticated operations as if in the physical world. While the real human hand represents one of the most intricate coordinations among bones, muscles, tendons, and skin, state-of-the-art techniques unanimously focus on modeling only the skeleton of the hand. In this paper, we present NIMBLE, a novel parametric hand model that includes the missing key components, bringing 3D hand models to a new level of realism. We first annotate muscles, bones, and skin on the recent Magnetic Resonance Imaging hand (MRI-Hand) dataset [Li et al. 2021] and then register a volumetric template hand onto individual poses and subjects within the dataset. NIMBLE consists of 20 bones as triangular meshes, 7 muscle groups as tetrahedral meshes, and a skin mesh. Via iterative shape registration and parameter learning, it further produces shape blend shapes, pose blend shapes, and a joint regressor. We demonstrate applying NIMBLE to modeling, rendering, and visual inference tasks. By enforcing the inner bones and muscles to match anatomic and kinematic rules, NIMBLE can animate 3D hands to new poses with unprecedented realism. To model the appearance of skin, we further construct a photometric HandStage to acquire high-quality textures and normal maps to model wrinkles and palm prints. Finally, NIMBLE also benefits learning-based hand pose and shape estimation by either synthesizing rich data or acting directly as a differentiable layer in the inference network.
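    The abstract names three learned components: shape blend shapes, pose blend shapes, and a joint regressor. As a rough illustration of how such components typically fit together in SMPL-style linear parametric models, the sketch below adds weighted per-vertex offsets to a template and then regresses joint locations from the result. It is a minimal sketch under stated assumptions, not NIMBLE's implementation: the function names, the toy three-vertex mesh, and all numbers are hypothetical.

```python
# Illustrative sketch only: a generic SMPL-style linear parametric model.
# This is NOT the NIMBLE implementation; all names and toy values are hypothetical.

def add_blend_shapes(template, blend_shapes, coeffs):
    """Return template + sum_k coeffs[k] * blend_shapes[k], vertex by vertex."""
    shaped = [list(vertex) for vertex in template]  # copy so the template is untouched
    for coeff, offsets in zip(coeffs, blend_shapes):
        for vertex, offset in zip(shaped, offsets):
            for axis in range(3):
                vertex[axis] += coeff * offset[axis]
    return shaped


def regress_joints(regressor, vertices):
    """Compute each joint as a weighted sum of vertices (one weight row per joint)."""
    joints = []
    for weights in regressor:
        joint = [0.0, 0.0, 0.0]
        for weight, vertex in zip(weights, vertices):
            for axis in range(3):
                joint[axis] += weight * vertex[axis]
        joints.append(joint)
    return joints


# Toy data: 3 vertices, 1 shape blend shape, 1 joint.
template = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
shape_blend_shapes = [[[0.0, 0.0, 0.0], [0.5, 0.0, 0.0], [0.0, 0.5, 0.0]]]
joint_regressor = [[0.5, 0.25, 0.25]]

beta = [2.0]  # hypothetical shape coefficient
shaped = add_blend_shapes(template, shape_blend_shapes, beta)
joints = regress_joints(joint_regressor, shaped)
print(shaped)
print(joints)
```

    In SMPL-style models, pose blend shapes are applied the same way, with coefficients derived from the joint rotations rather than from a shape vector; whether NIMBLE follows exactly this formulation is not stated in the abstract.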


    1. 3DSCANSTORE. 2022. 3D Scan Store: Captured Assets for Digital Artists. https://www.3dscanstore.com/
    2. Rinat Abdrashitov, Seungbae Bang, David Levin, Karan Singh, and Alec Jacobson. 2021. Interactive Modelling of Volumetric Musculoskeletal Anatomy. ACM Trans. Graph. 40, 4, Article 122 (jul 2021), 13 pages.
    3. Irene Albrecht, Jörg Haber, and Hans-Peter Seidel. 2003. Construction and animation of anatomically based human hand models. In Proceedings of the 2003 ACM SIGGRAPH/Eurographics symposium on Computer animation. Citeseer, 98–109.
    4. Brett Allen, Brian Curless, and Zoran Popović. 2003. The Space of Human Body Shapes: Reconstruction and Parameterization from Range Scans. ACM Trans. Graph. 22, 3 (jul 2003), 587–594.
    5. Brett Allen, Brian Curless, Zoran Popović, and Aaron Hertzmann. 2006. Learning a Correlated Model of Identity and Pose-Dependent Body Shape Variation for Real-Time Synthesis. In Proceedings of the 2006 ACM SIGGRAPH/Eurographics Symposium on Computer Animation (Vienna, Austria) (SCA ’06). Eurographics Association, Goslar, DEU, 147–156.
    6. Pierre Alliez, Éric Colin de Verdière, Olivier Devillers, and Martin Isenburg. 2003. Isotropic surface remeshing. In 2003 Shape Modeling International. IEEE, 49–58.
    7. Amira. 2022. Amira Software for biomedical and life science research. https://www.thermofisher.com/hk/en/home/electron-microscopy/products/software-em-3d-vis/amira-software.html
    8. E. M. A. Anas, A. Rasoulian, A. Seitel, K. Darras, D. Wilson, P. S. John, D. Pichora, P. Mousavi, R. Rohling, and P. Abolmaesumi. 2016. Automatic Segmentation of Wrist Bones in CT Using a Statistical Wrist Shape + Pose Model. IEEE Transactions on Medical Imaging 35, 8 (2016), 1789–1801.
    9. Dragomir Anguelov, Praveen Srinivasan, Daphne Koller, Sebastian Thrun, Jim Rodgers, and James Davis. 2005. SCAPE: shape completion and animation of people. In ACM SIGGRAPH 2005 Papers. 408–416.
    10. Seungryul Baek, Kwang In Kim, and Tae-Kyun Kim. 2019. Pushing the Envelope for RGB-Based Dense 3D Hand Pose Estimation via Neural Rendering. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
    11. Luca Ballan, Aparna Taneja, Jürgen Gall, Luc Van Gool, and Marc Pollefeys. 2012. Motion capture of hands in action using discriminative salient points. In European Conference on Computer Vision. Springer, 640–653.
    12. Volker Blanz and Thomas Vetter. 1999. A morphable model for the synthesis of 3D faces. In Proceedings of the 26th annual conference on Computer graphics and interactive techniques. 187–194.
    13. Blender. 2021. Cycles renderer.
    14. Gunilla Borgefors. 1983. Chamfering: A fast method for obtaining approximations of the Euclidean distance in N dimensions. In Proc. 3rd Scand. Conf. on Image Analysis (SCIA3). 250–255.
    15. Steve Capell, Matthew Burkhart, Brian Curless, Tom Duchamp, and Zoran Popović. 2005. Physically based rigging for deformable characters. In Proceedings of the 2005 ACM SIGGRAPH/Eurographics symposium on Computer animation. 301–310.
    16. Martin de La Gorce, David Fleet, and Nikos Paragios. 2011. Model-Based 3D Hand Pose Estimation from Monocular Video. Pattern Analysis and Machine Intelligence, IEEE Transactions on 33 (10 2011), 1793–1805.
    17. Paul Debevec. 2012. The light stages and their applications to photoreal digital actors. SIGGRAPH Asia 2, 4 (2012), 1–6.
    18. Paul Debevec, Tim Hawkins, Chris Tchou, Haarm-Pieter Duiker, Westley Sarokin, and Mark Sagar. 2000. Acquiring the reflectance field of a human face. In Proceedings of the 27th annual conference on Computer graphics and interactive techniques. 145–156.
    19. Caroline Erolin, Clare Lamb, Roger Soames, and Caroline Wilkinson. 2016. Does Virtual Haptic Dissection Improve Student Learning? A Multi-Year Comparative Study. In MMVR. 110–117.
    20. Yao Feng, Haiwen Feng, Michael J Black, and Timo Bolkart. 2021. Learning an animatable detailed 3D face model from in-the-wild images. ACM Transactions on Graphics (TOG) 40, 4 (2021), 1–13.
    21. Nils Hasler, Thorsten Thormählen, Bodo Rosenhahn, and Hans-Peter Seidel. 2010. Learning Skeletons for Shape and Pose. In Proceedings of the 2010 ACM SIGGRAPH Symposium on Interactive 3D Graphics and Games (Washington, D.C.) (I3D ’10). Association for Computing Machinery, New York, NY, USA, 23–30.
    22. Yana Hasson, Gul Varol, Dimitrios Tzionas, Igor Kalevatykh, Michael J Black, Ivan Laptev, and Cordelia Schmid. 2019. Learning joint reconstruction of hands and manipulated objects. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 11807–11816.
    23. Gentaro Hirota, Susan Fisher, A State, Chris Lee, and Henry Fuchs. 2001. An implicit finite element method for elastic solids in contact. In Proceedings Computer Animation 2001. Fourteenth Conference on Computer Animation (Cat. No. 01TH8596). IEEE, 136–254.
    24. David A Hirshberg, Matthew Loper, Eric Rachlin, and Michael J Black. 2012. Coregistration: Simultaneous alignment and modeling of articulated 3D shape. In European conference on computer vision. Springer, 242–255.
    25. Justin Johnson, Nikhila Ravi, Jeremy Reizenstein, David Novotny, Shubham Tulsiani, Christoph Lassner, and Steve Branson. 2020. Accelerating 3D Deep Learning with PyTorch3D. In SIGGRAPH Asia 2020 Courses (Virtual Event) (SA ’20). Association for Computing Machinery, New York, NY, USA, Article 10, 1 pages.
    26. Petr Kadleček, Alexandru-Eugen Ichim, Tiantian Liu, Jaroslav Křivánek, and Ladislav Kavan. 2016. Reconstructing personalized anatomical models for physics-based body animation. ACM Transactions on Graphics (TOG) 35, 6 (2016), 1–13.
    27. S. Khamis, Jonathan Taylor, Jamie Shotton, Cem Keskin, Shahram Izadi, and Andrew W. Fitzgibbon. 2015. Learning an efficient model of hand shape variation from depth images. 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2015), 2540–2548.
    28. Junggon Kim and Nancy S. Pollard. 2011. Fast Simulation of Skeleton-Driven Deformable Body Characters. ACM Trans. Graph. 30, 5, Article 121 (oct 2011), 19 pages.
    29. Paul G Kry, Doug L James, and Dinesh K Pai. 2002. Eigenskin: real time large deformation character skinning in hardware. In Proceedings of the 2002 ACM SIGGRAPH/Eurographics symposium on Computer animation. 153–159.
    30. Seunghwan Lee, Ri Yu, Jungnam Park, Mridul Aanjaneya, Eftychios Sifakis, and Jehee Lee. 2018. Dexterous manipulation and control with volumetric muscles. ACM Transactions on Graphics (TOG) 37, 4 (2018), 1–13.
    31. J. P. Lewis, Matt Cordner, and Nickson Fong. 2000. Pose Space Deformation: A Unified Approach to Shape Interpolation and Skeleton-Driven Deformation. In Proceedings of the 27th Annual Conference on Computer Graphics and Interactive Techniques (SIGGRAPH ’00). ACM Press/Addison-Wesley Publishing Co., USA, 165–172.
    32. Duo Li, Shinjiro Sueda, Debanga R. Neog, and Dinesh K. Pai. 2013. Thin Skin Elastodynamics. ACM Trans. Graph. 32, 4, Article 49 (jul 2013), 10 pages.
    33. Tianye Li, Timo Bolkart, Michael J. Black, Hao Li, and Javier Romero. 2017. Learning a Model of Facial Shape and Expression from 4D Scans. ACM Trans. Graph. 36, 6, Article 194 (nov 2017), 17 pages.
    34. Yuwei Li, Minye Wu, Yuyao Zhang, Lan Xu, and Jingyi Yu. 2021. PIANO: A Parametric Hand Bone Model from Magnetic Resonance Imaging. In Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21. 816–822.
    35. Libin Liu, KangKang Yin, Bin Wang, and Baining Guo. 2013. Simulation and Control of Skeleton-Driven Soft Body Characters. ACM Trans. Graph. 32, 6, Article 215 (nov 2013), 8 pages.
    36. Matthew Loper, Naureen Mahmood, Javier Romero, Gerard Pons-Moll, and Michael J Black. 2015. SMPL: A skinned multi-person linear model. ACM transactions on graphics (TOG) 34, 6 (2015), 1–16.
    37. William E Lorensen and Harvey E Cline. 1987. Marching cubes: A high resolution 3D surface construction algorithm. ACM siggraph computer graphics 21, 4 (1987), 163–169.
    38. N. Magnenat-Thalmann, R. Laperrière, and D. Thalmann. 1989. Joint-Dependent Local Deformations for Hand Animation and Object Grasping. In Proceedings on Graphics Interface ’88 (Edmonton, Alberta, Canada). Canadian Information Processing Society, CAN, 26–33.
    39. Stan Melax, Leonid Keselman, and Sterling Orsten. 2013. Dynamics based 3D skeletal hand tracking. In Proceedings of the ACM SIGGRAPH Symposium on Interactive 3D Graphics and Games. 184–184.
    40. M Mirakhorlo, N Van Beek, M Wesseling, H Maas, HEJ Veeger, and I Jonkers. 2018. A musculoskeletal model of the hand and wrist: model definition and evaluation. Computer methods in biomechanics and biomedical engineering 21, 9 (2018), 548–557.
    41. Gyeongsik Moon and Kyoung Mu Lee. 2020. I2l-meshnet: Image-to-lixel prediction network for accurate 3d human pose and mesh estimation from a single rgb image. In European Conference on Computer Vision. Springer, 752–768.
    42. Gyeongsik Moon, Takaaki Shiratori, and Kyoung Mu Lee. 2020a. DeepHandMesh: A Weakly-Supervised Deep Encoder-Decoder Framework for High-Fidelity Hand Mesh Modeling. 440–455.
    43. Gyeongsik Moon, Shoou-I Yu, He Wen, Takaaki Shiratori, and Kyoung Mu Lee. 2020b. InterHand2.6M: A Dataset and Baseline for 3D Interacting Hand Pose Estimation from a Single RGB Image. In European Conference on Computer Vision (ECCV).
    44. Franziska Mueller, Micah Davis, Florian Bernard, Oleksandr Sotnychenko, Mickeal Verschoor, Miguel A. Otaduy, Dan Casas, and Christian Theobalt. 2019. Real-time Pose and Shape Reconstruction of Two Interacting Hands With a Single Depth Camera. ACM Transactions on Graphics (TOG) 38, 4 (2019).
    45. Richard A Newcombe, Dieter Fox, and Steven M Seitz. 2015. Dynamicfusion: Reconstruction and tracking of non-rigid scenes in real-time. In Proceedings of the IEEE conference on computer vision and pattern recognition. 343–352.
    46. Iasonas Oikonomidis, Nikolaos Kyriazis, and Antonis A. Argyros. 2011. Efficient model-based 3D tracking of hand articulations using Kinect. In BMVC.
    47. Nobuyuki Otsu. 1979. A threshold selection method from gray level histograms. IEEE Transactions on Systems, Man, and Cybernetics 9 (1979), 62–66.
    48. Surbhi Panchal-Kildare and Kevin Malone. 2013. Skeletal anatomy of the hand. Hand clinics 29, 4 (2013), 459–471.
    49. Gerard Pons-Moll, Javier Romero, Naureen Mahmood, and Michael J Black. 2015. Dyna: A model of dynamic human shape in motion. ACM Transactions on Graphics (TOG) 34, 4 (2015), 1–14.
    50. Neng Qian, Jiayi Wang, Franziska Mueller, Florian Bernard, Vladislav Golyanik, and Christian Theobalt. 2020. HTML: A Parametric Hand Texture Model for 3D Hand Reconstruction and Personalization. In European Conference on Computer Vision. Springer, 54–71.
    51. R3DS. 2022. WRAP3D. https://www.russian3dscanner.com/
    52. Taehyun Rhee, John P Lewis, Ulrich Neumann, and Krishna Nayak. 2007. Soft-tissue deformation for in vivo volume animation. In 15th Pacific Conference on Computer Graphics and Applications (PG’07). IEEE, 435–438.
    53. Javier Romero, Dimitrios Tzionas, and Michael J Black. 2017. Embodied hands: Modeling and capturing hands and bodies together. ACM Transactions on Graphics (ToG) 36, 6 (2017), 245.
    54. Prashant Sachdeva, Shinjiro Sueda, Susanne Bradley, Mikhail Fain, and Dinesh K. Pai. 2015. Biomechanical Simulation and Control of Hands and Tendinous Systems. ACM Trans. Graph. 34, 4, Article 42 (jul 2015), 10 pages.
    55. Tanner Schmidt, Richard A. Newcombe, and Dieter Fox. 2014. DART: Dense Articulated Real-Time Tracking. In Robotics: Science and Systems.
    56. Robert J Schwarz and C Taylor. 1955. The anatomy and mechanics of the human hand. Artificial limbs 2, 2 (1955), 22–35.
    57. Hang Si. 2015. TetGen, a Delaunay-based quality tetrahedral mesh generator. ACM Transactions on Mathematical Software (TOMS) 41, 2 (2015), 1–36.
    58. Breannan Smith, Fernando De Goes, and Theodore Kim. 2018. Stable neo-hookean flesh simulation. ACM Transactions on Graphics (TOG) 37, 2 (2018), 1–15.
    59. Breannan Smith, Chenglei Wu, He Wen, Patrick Peluse, Yaser Sheikh, Jessica K Hodgins, and Takaaki Shiratori. 2020. Constraining dense hand surface tracking with elasticity. ACM Transactions on Graphics (TOG) 39, 6 (2020), 1–14.
    60. Shinjiro Sueda, Andrew Kaufman, and Dinesh K. Pai. 2008. Musculotendon Simulation for Hand Animation. ACM Trans. Graph. 27, 3 (aug 2008), 1–8.
    61. Anastasia Tkach, Mark Pauly, and Andrea Tagliasacchi. 2016. Sphere-Meshes for Real-Time Hand Modeling and Tracking. ACM Trans. Graph. 35, 6, Article 222 (nov 2016), 11 pages.
    62. Du Tran, Heng Wang, Lorenzo Torresani, Jamie Ray, Yann LeCun, and Manohar Paluri. 2018. A closer look at spatiotemporal convolutions for action recognition. In Proceedings of the IEEE conference on Computer Vision and Pattern Recognition. 6450–6459.
    63. Aggeliki Tsoli, Naureen Mahmood, and Michael J Black. 2014. Breathing life into shape: Capturing, modeling and animating 3D human breathing. ACM Transactions on graphics (TOG) 33, 4 (2014), 1–11.
    64. Dimitrios Tzionas, Luca Ballan, Abhilash Srikantha, Pablo Aponte, Marc Pollefeys, and Juergen Gall. 2016. Capturing hands in action using discriminative salient points and physics simulation. International Journal of Computer Vision 118, 2 (2016), 172–193.
    65. Bohan Wang, George Matcuk, and Jernej Barbič. 2019. Hand modeling and simulation using stabilized magnetic resonance imaging. ACM Transactions on Graphics (TOG) 38, 4 (2019), 1–14.
    66. Bohan Wang, George Matcuk, and Jernej Barbič. 2021. Modeling of Personalized Anatomy using Plastic Strains. ACM Transactions on Graphics (TOG) 40, 2 (2021), 1–21.
    67. Nanyang Wang, Yinda Zhang, Zhuwen Li, Yanwei Fu, Wei Liu, and Yu-Gang Jiang. 2018. Pixel2mesh: Generating 3d mesh models from single rgb images. In Proceedings of the European Conference on Computer Vision (ECCV). 52–67.
    68. Lan Xu, Wei Cheng, Kaiwen Guo, Lei Han, Yebin Liu, and Lu Fang. 2019. Flyfusion: Real-time dynamic scene reconstruction using a flying depth camera. IEEE Transactions on Visualization and Computer Graphics (2019).
    69. Weipeng Xu, Avishek Chatterjee, Michael Zollhöfer, Helge Rhodin, Dushyant Mehta, Hans-Peter Seidel, and Christian Theobalt. 2018. MonoPerfCap: Human Performance Capture From Monocular Video. ACM Trans. Graph. 37, 2, Article 27 (May 2018), 15 pages.
    70. Christian Zimmermann, Duygu Ceylan, Jimei Yang, Bryan Russell, Max Argus, and Thomas Brox. 2019. Freihand: A dataset for markerless capture of hand pose and shape from single rgb images. In Proceedings of the IEEE International Conference on Computer Vision. 813–822.
    71. Michael Zollhöfer, Justus Thies, Pablo Garrido, Derek Bradley, Thabo Beeler, Patrick Pérez, Marc Stamminger, Matthias Nießner, and Christian Theobalt. 2018. State of the Art on Monocular 3D Face Reconstruction, Tracking, and Applications. Computer Graphics Forum 37 (2018).
