HAISOR: Human-Aware Indoor Scene Optimization via Deep Reinforcement Learning

HAISOR proposes a pipeline that combines deep reinforcement learning with Monte Carlo tree search to solve an indoor scene optimization problem that accounts for human behavior, including human-furniture interaction and free space for activities, criteria that are not differentiable.

ACM Digital Library Publication:

HAISOR: Human-Aware Indoor Scene Optimization via Deep Reinforcement Learning

Overview Page:

SIGGRAPH 2024: Technical Papers
