“RTG-SLAM: Real-time 3D Reconstruction at Scale Using Gaussian Splatting”
Abstract:
RTG-SLAM is a real-time 3D reconstruction system based on Gaussian splatting. It is also memory-efficient, which enables reconstruction of large-scale environments. Comparisons show that RTG-SLAM runs at around twice the speed of state-of-the-art NeRF-based SLAM, with around half the memory cost (e.g., 17.9 fps and 8.8 GB versus 8.65 fps and 17.3 GB).
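The "around twice" and "around half" figures can be checked with quick arithmetic on the quoted numbers. The snippet below is only a sanity check; the variable names are illustrative, and nothing beyond the four values comes from the abstract.

```python
# Figures quoted in the abstract: RTG-SLAM vs. the NeRF-based baseline.
rtg_fps, rtg_mem_gb = 17.9, 8.8
base_fps, base_mem_gb = 8.65, 17.3

speedup = rtg_fps / base_fps          # how many times faster RTG-SLAM runs
mem_ratio = rtg_mem_gb / base_mem_gb  # fraction of the baseline's memory used

print(f"speedup: {speedup:.2f}x, memory: {mem_ratio:.2f} of baseline")
```

The ratios come out to roughly 2.07 and 0.51, consistent with the rounded claims in the abstract.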