“SP-GAN: sphere-guided 3D shape generation and manipulation” by Li, Li, Hui, and Fu

  • Ruihui Li, Xianzhi Li, Ka-Hei Hui, and Chi-Wing Fu

Conference:

    SIGGRAPH 2021

Type:

    Technical Paper

Title:

    SP-GAN: sphere-guided 3D shape generation and manipulation

Presenter(s)/Author(s):

    Ruihui Li, Xianzhi Li, Ka-Hei Hui, and Chi-Wing Fu


Abstract:


    We present SP-GAN, a new unsupervised sphere-guided generative model for the direct synthesis of 3D shapes in the form of point clouds. Compared with existing models, SP-GAN synthesizes diverse, high-quality shapes with fine details and promotes controllability for part-aware shape generation and manipulation, yet it is trainable without any part annotations. In SP-GAN, we incorporate a global prior (uniform points on a sphere) to spatially guide the generative process and attach a local prior (a random latent code) to each sphere point to provide local details. The key insight in our design is to disentangle the complex 3D shape generation task into global shape modeling and local structure adjustment, to ease the learning process and enhance the quality of the generated shapes. Also, our model forms an implicit dense correspondence between the sphere points and the points in every generated shape, enabling various structure-aware shape manipulations, such as part editing, part-wise shape interpolation, and multi-shape part composition, that go beyond what existing generative models offer. Experimental results, including both visual and quantitative evaluations, demonstrate that our model synthesizes diverse point clouds with finer details and less noise than the state-of-the-art models.
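    To make the two priors concrete, the sketch below builds the kind of generator input the abstract describes: near-uniform points on the unit sphere as the global prior, a random latent code attached to each sphere point as the local prior, and a part-wise edit expressed through the fixed sphere-to-shape correspondence. This is a minimal illustrative sketch, not the paper's implementation: fibonacci_sphere is one common way to sample near-uniform sphere points (the paper only requires uniform sphere points, not this exact scheme), the generator network itself is omitted, and the "part" mask is an arbitrary example region.

    import numpy as np

    def fibonacci_sphere(n: int) -> np.ndarray:
        """Near-uniform points on the unit sphere via the Fibonacci spiral.

        Hypothetical stand-in for the sphere prior; any roughly uniform
        sampling of the sphere would serve the same role.
        """
        i = np.arange(n)
        golden_angle = np.pi * (3.0 - np.sqrt(5.0))
        y = 1.0 - 2.0 * (i + 0.5) / n            # heights in (-1, 1)
        r = np.sqrt(1.0 - y * y)                 # radius of each horizontal circle
        theta = golden_angle * i
        return np.stack([r * np.cos(theta), y, r * np.sin(theta)], axis=1)

    n_points, code_dim = 2048, 128
    sphere = fibonacci_sphere(n_points)          # global prior, shape (N, 3)

    # Local prior: a random latent code attached to every sphere point.
    # (Assumption: one shape-level code replicated per point; per-point
    # variation would be handled the same way mechanically.)
    z = np.random.randn(code_dim)
    codes = np.tile(z, (n_points, 1))            # shape (N, code_dim)

    # A (hypothetical) generator would consume sphere points plus codes and
    # emit one output point per sphere point, so index i on the sphere
    # always corresponds to index i in the generated cloud.
    gen_input = np.concatenate([sphere, codes], axis=1)   # shape (N, 3 + code_dim)

    # Part-aware manipulation via the dense correspondence: pick a region
    # on the sphere (an arbitrary example mask here) and blend in a second
    # shape's code only at those indices, leaving the rest untouched.
    z_other = np.random.randn(code_dim)
    part = sphere[:, 1] > 0.5                    # example "part": top cap of the sphere
    codes[part] = 0.5 * codes[part] + 0.5 * z_other

    Because the mask is defined on the sphere, the same indices select the same region of every generated shape, which is what lets part editing, part-wise interpolation, and part composition work without any part annotations.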
