“Dense scene reconstruction with points of interest” by Zhou and Koltun

  • ©Qian-Yi Zhou and Vladlen Koltun






Session Title: Surface Reconstruction



    We present an approach to detailed reconstruction of complex real-world scenes with a handheld commodity range sensor. The user moves the sensor freely through the environment and images the scene. An offline registration and integration pipeline produces a detailed scene model. To deal with the complex sensor trajectories required to produce detailed reconstructions with a consumer-grade sensor, our pipeline detects points of interest in the scene and preserves detailed geometry around them while a global optimization distributes residual registration errors through the environment. Our results demonstrate that detailed reconstructions of complex scenes can be obtained with a consumer-grade camera.


