“Object-aware guidance for autonomous scene reconstruction” by Liu, Xia, Sun, Shen, Xu, et al. …

  • ©Ligang Liu, Xi Xia, Han Sun, Qi Shen, Juzhan Xu, Bin Chen, Hui Huang, and Kai Xu



Entry Number: 104


    Object-aware guidance for autonomous scene reconstruction


Session Title: 3D Capture



    To carry out autonomous 3D scanning and online reconstruction of unknown indoor scenes, one has to balance global exploration of the entire scene against local scanning of the objects within it. In this work, we propose a novel approach that provides object-aware guidance for autoscanning, so that an unknown scene can be explored, reconstructed, and understood within one navigation pass. Our approach interleaves object analysis, which identifies the next best object (NBO) for global exploration, with object-aware information gain analysis, which plans the next best view (NBV) for local scanning. First, an objectness-based segmentation method extracts semantic objects from the current scene surface via multi-class graph-cut minimization. Then, an object of interest (OOI) is identified as the NBO, which the robot aims to visit and scan. The robot then conducts fine scanning of the OOI from views determined by the NBV strategy. When the OOI is recognized as a full object, it can be replaced by its most similar 3D model in a shape database. The algorithm iterates until all objects in the scene are recognized and reconstructed. Various experiments and comparisons have shown the feasibility of our proposed approach.
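
    The interleaving of NBO selection and NBV planning described in the abstract can be sketched as two nested loops. The Python below is only a toy illustration under stated assumptions, not the authors' implementation: `SceneObject`, `next_best_object`, `next_best_view`, and `autoscan` are hypothetical names, the "nearest unfinished object" rule stands in for the paper's object analysis, and counting unseen surface patches stands in for its information gain measure. Segmentation, recognition, and database retrieval are omitted.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set, Tuple


@dataclass
class SceneObject:
    """Hypothetical stand-in for a segmented object in the partial scene."""
    name: str
    position: float                 # toy 1D location of the object
    views: Dict[str, Set[int]]      # candidate view -> surface patches it observes
    seen: Set[int] = field(default_factory=set)
    done: bool = False              # True once no view adds new information


def next_best_object(objects: List[SceneObject], robot_pos: float) -> Optional[SceneObject]:
    """Toy NBO rule (an assumption): the nearest object not yet finished."""
    pending = [o for o in objects if not o.done]
    if not pending:
        return None
    return min(pending, key=lambda o: abs(o.position - robot_pos))


def next_best_view(obj: SceneObject) -> Optional[str]:
    """Toy NBV rule (an assumption): the view exposing the most unseen patches."""
    gains = {view: len(patches - obj.seen) for view, patches in obj.views.items()}
    if not gains:
        return None
    best = max(gains, key=gains.get)
    return best if gains[best] > 0 else None


def autoscan(objects: List[SceneObject], robot_pos: float = 0.0) -> List[Tuple[str, str]]:
    """Alternate global NBO selection with local NBV scanning until all objects are done."""
    visits: List[Tuple[str, str]] = []
    while True:
        ooi = next_best_object(objects, robot_pos)   # global exploration step
        if ooi is None:
            break
        robot_pos = ooi.position                     # move to the object of interest
        while True:
            view = next_best_view(ooi)               # local scanning step
            if view is None:
                break
            ooi.seen |= ooi.views[view]              # integrate the new scan
            visits.append((ooi.name, view))
        ooi.done = True                              # stands in for "recognized as a full object"
    return visits


chair = SceneObject("chair", 2.0, {"front": {0, 1, 2}, "side": {2, 3}, "back": {3, 4, 5}})
table = SceneObject("table", 5.0, {"top": {0, 1}, "low": {1}})
print(autoscan([chair, table]))
# prints [('chair', 'front'), ('chair', 'back'), ('table', 'top')]
```

    In this toy run the "side" view of the chair is never taken, because after "front" and "back" it adds no unseen patches. This illustrates why a gain-driven view choice can finish an object in fewer scans than a fixed sweep.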


