Dense scene reconstruction with points of interest

We present an approach to detailed reconstruction of complex real-world scenes with a handheld commodity range sensor. The user moves the sensor freely through the environment and images the scene. An offline registration and integration pipeline produces a detailed scene model. To deal with the complex sensor trajectories required to produce detailed reconstructions with a consumer-grade sensor, our pipeline detects points of interest in the scene and preserves detailed geometry around them while a global optimization distributes residual registration errors through the environment. Our results demonstrate that detailed reconstructions of complex scenes can be obtained with a consumer-grade camera.
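The idea of a global optimization that distributes residual registration error can be illustrated with a toy example. The sketch below is not the paper's pipeline: it assumes a hypothetical one-dimensional chain of sensor positions, equally weighted constraints, and illustrative names (`optimize_poses`, `odometry`, `end_to_end`) that do not come from the source. Consecutive measurements and one direct first-to-last measurement disagree, and a linear least-squares solve spreads the disagreement across all constraints instead of leaving it at one place.

```python
# Toy 1D pose graph. Illustrative assumption only, not the paper's method.

def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def optimize_poses(odometry, end_to_end):
    """Least-squares positions x_0..x_n with x_0 fixed at 0.

    odometry[i] measures x_{i+1} - x_i; end_to_end measures x_n - x_0.
    All constraints are weighted equally (an assumption of this sketch).
    """
    n = len(odometry)
    # Each constraint (i, j, d) says that x_j - x_i should equal d.
    constraints = [(i, i + 1, d) for i, d in enumerate(odometry)]
    constraints.append((0, n, end_to_end))
    # Normal equations H x = g over the unknowns x_1..x_n.
    h = [[0.0] * n for _ in range(n)]
    g = [0.0] * n
    for i, j, d in constraints:
        for p, sp in ((i, -1.0), (j, 1.0)):
            if p == 0:
                continue  # x_0 is fixed, so it is not an unknown
            g[p - 1] += sp * d
            for q, sq in ((i, -1.0), (j, 1.0)):
                if q != 0:
                    h[p - 1][q - 1] += sp * sq
    return [0.0] + solve_linear(h, g)


# Three steps of 1.0 each, but a direct measurement says the total is 2.6.
poses = optimize_poses([1.0, 1.0, 1.0], 2.6)
print([round(p, 3) for p in poses])
```

In this example the 0.4 disagreement is shared equally by the four constraints: each step shrinks to about 0.9 and the end-to-end constraint is left with a residual of about 0.1. A real registration pipeline would optimize full 3D poses with a nonlinear solver; the linear case only shows the error-spreading behaviour.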

SIGGRAPH 2013: Technical Papers

“Dense scene reconstruction with points of interest” by Zhou and Koltun

Session/Category Title: Surface Reconstruction
