“ClothCap: seamless 4D clothing capture and retargeting” by Pons-Moll, Pujades, Hu and Black

  • ©Gerard Pons-Moll, Sergi Pujades, Sonny Hu, and Michael J. Black







    Designing and simulating realistic clothing is challenging. Previous methods addressing the capture of clothing from 3D scans have been limited to single garments and simple motions, lack detail, or require specialized texture patterns. Here we address the problem of capturing regular clothing on fully dressed people in motion. People typically wear multiple pieces of clothing at a time. To estimate the shape of such clothing, track it over time, and render it believably, each garment must be segmented from the others and the body. Our ClothCap approach uses a new multi-part 3D model of clothed bodies, automatically segments each piece of clothing, estimates the minimally clothed body shape and pose under the clothing, and tracks the 3D deformations of the clothing over time. We estimate the garments and their motion from 4D scans; that is, high-resolution 3D scans of the subject in motion at 60 fps. ClothCap is able to capture a clothed person in motion, extract their clothing, and retarget the clothing to new body shapes; this provides a step towards virtual try-on.



ACM Digital Library Publication: