3D hair synthesis using volumetric variational autoencoders
Session/Category Title: Modeling things on (and in) your head
Abstract:
Recent advances in single-view 3D hair digitization have made the creation of high-quality CG characters scalable and accessible to end-users, enabling new forms of personalized VR and gaming experiences. To handle the complexity and variety of hair structures, most cutting-edge techniques rely on the successful retrieval of a particular hair model from a comprehensive hair database. Not only are these data-driven methods storage intensive, but they are also prone to failure for highly unconstrained input images, complicated hairstyles, and failed face detection. Instead of using a large collection of 3D hair models directly, we propose to represent the manifold of 3D hairstyles implicitly through a compact latent space of a volumetric variational autoencoder (VAE). This deep neural network is trained with volumetric orientation field representations of 3D hair models and can synthesize new hairstyles from a compressed code. To enable end-to-end 3D hair inference, we train an additional embedding network to predict the code in the VAE latent space from any input image. Strand-level hairstyles can then be generated from the predicted volumetric representation. Our fully automatic framework does not require any ad hoc face fitting, intermediate classification and segmentation, or hairstyle database retrieval. Our hair synthesis approach is significantly more robust and can handle a much wider variation of hairstyles than state-of-the-art data-driven hair modeling techniques with challenging inputs, including photos that are low-resolution, overexposed, or contain extreme head poses. The storage requirements are minimal, and a 3D hair model can be produced from an image within one second. Our evaluations also show that successful reconstructions are possible from highly stylized cartoon images, non-human subjects, and pictures taken from behind a person.
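The compressed code mentioned above comes from the standard VAE machinery: an encoder predicts the mean and log-variance of a diagonal Gaussian posterior over the latent code, a sample is drawn via the reparameterization trick, and a KL term regularizes the posterior toward a unit Gaussian. The sketch below shows only this generic arithmetic (function names and the toy 2-D code are illustrative, not from the paper; a real implementation would run inside an autodiff framework over 3D orientation-field volumes):

```python
import math
import random

def reparameterize(mu, log_var, rng=random.Random(0)):
    """VAE reparameterization trick: z = mu + sigma * eps, eps ~ N(0, I).

    In an autodiff framework this keeps sampling differentiable
    w.r.t. (mu, log_var); here we just show the arithmetic.
    """
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mu, log_var)]

def kl_divergence(mu, log_var):
    """KL(q(z|x) || N(0, I)) for a diagonal Gaussian posterior."""
    return -0.5 * sum(1.0 + lv - m * m - math.exp(lv)
                      for m, lv in zip(mu, log_var))

# A posterior that exactly matches the unit-Gaussian prior has zero KL cost.
print(kl_divergence([0.0, 0.0], [0.0, 0.0]))  # 0.0
```

The KL term is what makes the latent space compact and well-behaved enough to decode novel codes into plausible hairstyles, rather than only reconstructing training examples.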
Our approach is particularly well suited for continuous and plausible hair interpolation between very different hairstyles.
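The interpolation property follows from the latent representation: blending two hairstyle codes and decoding the intermediate code yields an in-between hairstyle. A minimal sketch of the blending step, assuming codes are plain vectors and a trained decoder exists elsewhere (the function name and 4-D toy codes are hypothetical):

```python
def lerp_codes(z_a, z_b, t):
    """Linearly interpolate two latent codes; per the paper, decoding
    intermediate codes produces continuous, plausible hairstyles."""
    return [(1.0 - t) * a + t * b for a, b in zip(z_a, z_b)]

# Midpoint between two toy 4-D hairstyle codes.
print(lerp_codes([0.0, 2.0, -1.0, 4.0], [2.0, 0.0, 1.0, 0.0], 0.5))
# [1.0, 1.0, 0.0, 2.0]
```

Because the KL regularization keeps the latent space smooth, even codes far from any training example tend to decode to valid orientation fields, which is what enables interpolation between very different hairstyles.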


