“GRASS: generative recursive autoencoders for shape structures” by Li, Xu, Chaudhuri, Yumer, Zhang, et al. …

  • © Jun Li, Kai Xu, Siddhartha Chaudhuri, Ersin Yumer, Hao (Richard) Zhang, and Leonidas (Leo) J. Guibas

Title:

    GRASS: generative recursive autoencoders for shape structures

Session/Category Title: Comparing 3D Shapes and Part


Abstract:


    We introduce a novel neural network architecture for encoding and synthesis of 3D shapes, particularly their structures. Our key insight is that 3D shapes are effectively characterized by their hierarchical organization of parts, which reflects fundamental intra-shape relationships such as adjacency and symmetry. We develop a recursive neural net (RvNN) based autoencoder to map a flat, unlabeled, arbitrary part layout to a compact code. The code effectively captures hierarchical structures of man-made 3D objects of varying structural complexities despite being fixed-dimensional: an associated decoder maps a code back to a full hierarchy. The learned bidirectional mapping is further tuned using an adversarial setup to yield a generative model of plausible structures, from which novel structures can be sampled. Finally, our structure synthesis framework is augmented by a second trained module that produces fine-grained part geometry, conditioned on global and local structural context, leading to a full generative pipeline for 3D shapes. We demonstrate that without supervision, our network learns meaningful structural hierarchies adhering to perceptual grouping principles, produces compact codes which enable applications such as shape classification and partial matching, and supports shape synthesis and interpolation with significant variations in topology and geometry.
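    The central idea in the abstract, folding a hierarchy of parts into one fixed-dimensional code and unfolding it again, can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a binary hierarchy supplied by hand (the abstract describes a network that starts from a flat layout), untrained random weights, and hypothetical sizes `PART_DIM` and `CODE_DIM`. The decoder here is also told the tree shape rather than predicting it.

```python
import numpy as np

rng = np.random.default_rng(0)
PART_DIM = 12  # hypothetical size of a raw part descriptor
CODE_DIM = 8   # hypothetical size of the fixed-dimensional code

# Untrained, randomly initialised weights; a real model would learn these.
W_leaf = rng.normal(scale=0.1, size=(CODE_DIM, PART_DIM))
W_merge = rng.normal(scale=0.1, size=(CODE_DIM, 2 * CODE_DIM))
W_split = rng.normal(scale=0.1, size=(2 * CODE_DIM, CODE_DIM))
W_part = rng.normal(scale=0.1, size=(PART_DIM, CODE_DIM))


def encode(node):
    """Recursively fold a binary tree of parts into one code vector.

    A leaf is a 1-D array of PART_DIM values; an internal node is a
    (left, right) tuple of sub-trees.
    """
    if isinstance(node, tuple):
        left, right = node
        children = np.concatenate([encode(left), encode(right)])
        return np.tanh(W_merge @ children)
    return np.tanh(W_leaf @ node)


def decode(code, shape):
    """Unfold a code into a tree that mirrors a given tree shape.

    `shape` is None for a leaf, or a (left_shape, right_shape) tuple.
    """
    if shape is None:
        return W_part @ code
    halves = np.tanh(W_split @ code)
    return (decode(halves[:CODE_DIM], shape[0]),
            decode(halves[CODE_DIM:], shape[1]))


# Three made-up parts grouped as ((a, b), c).
a, b, c = (rng.normal(size=PART_DIM) for _ in range(3))
code = encode(((a, b), c))
rebuilt = decode(code, ((None, None), None))
print(code.shape, rebuilt[1].shape)
```

    Whatever the number of parts or the depth of the tree, `encode` returns a vector of length `CODE_DIM`, which is the property the abstract highlights. Training the weights so that `decode(encode(tree))` reproduces the input, and the adversarial tuning step, are left out of this sketch.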


