“MonoPerfCap: Human Performance Capture From Monocular Video” by Xu, Chatterjee, Zollhöfer, Rhodin, Mehta, et al. …


Title:

    MonoPerfCap: Human Performance Capture From Monocular Video

Session/Category Title:   Bodies in Motion Human Performance Capture


Abstract:


    We present the first marker-less approach for temporally coherent 3D performance capture of a human with general clothing from monocular video. Our approach reconstructs articulated human skeleton motion as well as medium-scale non-rigid surface deformations in general scenes. Human performance capture is a challenging problem due to the large range of articulation, potentially fast motion, and considerable non-rigid deformations, even from multi-view data. Reconstruction from monocular video alone is drastically more challenging, since strong occlusions and the inherent depth ambiguity lead to a highly ill-posed reconstruction problem. We tackle these challenges by a novel approach that employs sparse 2D and 3D human pose detections from a convolutional neural network using a batch-based pose estimation strategy. Joint recovery of per-batch motion allows us to resolve the ambiguities of the monocular reconstruction problem based on a low-dimensional trajectory subspace. In addition, we propose refinement of the surface geometry based on fully automatically extracted silhouettes to enable medium-scale non-rigid alignment. We demonstrate state-of-the-art performance capture results that enable exciting applications such as video editing and free viewpoint video, previously infeasible from monocular video. Our qualitative and quantitative evaluation demonstrates that our approach significantly outperforms previous monocular methods in terms of accuracy, robustness, and scene complexity that can be handled.
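    The abstract states that per-batch motion is recovered jointly in a low-dimensional trajectory subspace, but it does not specify the basis. As a minimal sketch, assuming a truncated discrete cosine transform (DCT) basis, the snippet below projects one joint-coordinate trajectory over a batch of frames onto a few smooth basis vectors. The function names `dct_basis` and `project_trajectory`, the batch length, and the number of coefficients are hypothetical choices for illustration and are not taken from the paper.

```python
import math


def dct_basis(num_frames, num_coeffs):
    """Return the first num_coeffs orthonormal DCT-II basis vectors.

    Assumption: a DCT basis stands in for the "low-dimensional
    trajectory subspace"; the source does not name the basis.
    """
    basis = []
    for k in range(num_coeffs):
        # Normalization that makes the basis vectors orthonormal.
        scale = math.sqrt((1.0 if k == 0 else 2.0) / num_frames)
        basis.append([
            scale * math.cos(math.pi * (2 * t + 1) * k / (2 * num_frames))
            for t in range(num_frames)
        ])
    return basis


def project_trajectory(trajectory, basis):
    """Project a 1D trajectory onto the basis and reconstruct it.

    Returns (coefficients, reconstructed trajectory). Because the basis
    is orthonormal, the least-squares coefficients are plain dot products.
    """
    coeffs = [sum(b_t * x_t for b_t, x_t in zip(b, trajectory)) for b in basis]
    smoothed = [
        sum(c * b[t] for c, b in zip(coeffs, basis))
        for t in range(len(trajectory))
    ]
    return coeffs, smoothed


if __name__ == "__main__":
    # Hypothetical batch: one joint coordinate over 8 frames with jitter.
    noisy = [0.0, 0.3, 0.1, 0.5, 0.4, 0.8, 0.6, 1.0]
    coeffs, smoothed = project_trajectory(noisy, dct_basis(len(noisy), 3))
    print(len(coeffs), [round(v, 3) for v in smoothed])
```

    Describing each batch with a handful of coefficients rather than one free value per frame reduces the number of unknowns, which is one plausible way a trajectory subspace helps constrain the ill-posed monocular problem described above.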

