“Deep convolutional priors for indoor scene synthesis” by Wang, Savva, Chang and Ritchie

  • ©Kai Wang, Manolis Savva, Angel X. Chang, and Daniel Ritchie



Entry Number: 70



Session/Category Title: Image & Shape Analysis With CNNs




    We present a convolutional neural network-based approach to indoor scene synthesis. We represent 3D scenes as semantically enriched images derived from orthographic top-down views, which lets us learn convolutional object placement priors from the entire context of a room. Our approach iteratively generates rooms from scratch, given only the room architecture as input. Through a series of perceptual studies, we compare the plausibility of scenes generated by our method against baselines for object selection and object arrangement, as well as against scenes modeled by people. We find that scenes generated by our method are preferred over the baselines and, in some cases, are equally preferred to human-created scenes.
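    The top-down image representation mentioned in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the box format, the channel names, the grid resolution, and the function `rasterize_top_down` are all assumptions made for illustration. It rasterizes axis-aligned object footprints into one occupancy mask per category plus a height map, the kind of multi-channel image a convolutional network could take as input.

```python
from typing import Dict, List, Tuple

# Hypothetical object record: (category, x_min, y_min, x_max, y_max, height), in metres.
Box = Tuple[str, float, float, float, float, float]
Grid = List[List[float]]


def rasterize_top_down(
    room_size: Tuple[float, float],
    objects: List[Box],
    categories: List[str],
    resolution: int = 8,
) -> Dict[str, Grid]:
    """Render axis-aligned footprints into per-category masks plus a height map."""
    width, depth = room_size

    def blank() -> Grid:
        return [[0.0] * resolution for _ in range(resolution)]

    channels = {name: blank() for name in categories}
    channels["height"] = blank()

    for category, x0, y0, x1, y1, height in objects:
        for row in range(resolution):
            for col in range(resolution):
                # Centre of this pixel in room coordinates.
                cx = (col + 0.5) * width / resolution
                cy = (row + 0.5) * depth / resolution
                if x0 <= cx < x1 and y0 <= cy < y1:
                    channels[category][row][col] = 1.0
                    # Keep the tallest object seen from above at this pixel.
                    current = channels["height"][row][col]
                    channels["height"][row][col] = max(current, height)
    return channels


# Assumed toy scene: a 4 m x 4 m room with a bed and a nightstand.
room = (4.0, 4.0)
scene = [("bed", 0.0, 0.0, 2.0, 2.0, 0.5), ("nightstand", 2.0, 0.0, 2.5, 0.5, 0.6)]
image = rasterize_top_down(room, scene, ["bed", "nightstand"])
```

    In an iterative generator of the kind the abstract describes, an image like this would be re-rendered after each object is added, so that the next placement decision sees the whole room so far.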


