“Plan3D: Viewpoint and Trajectory Optimization for Aerial Multi-View Stereo Reconstruction” by Hepp, Niessner and Hilliges

  • ©Benjamin Hepp, Matthias Niessner, and Otmar Hilliges



Session Title:

    Scene and Object Reconstruction





    We introduce a new method that efficiently computes a set of viewpoints and trajectories for high-quality 3D reconstructions in outdoor environments. Our goal is to automatically explore an unknown area and obtain a complete 3D scan of a region of interest (e.g., a large building). Images from a commodity RGB camera, mounted on an autonomously navigated quadcopter, are fed into a multi-view stereo reconstruction pipeline that produces high-quality results but is computationally expensive. In this setting, the scanning result is constrained by the restricted flight time of quadcopters. To this end, we introduce a novel optimization strategy that respects these constraints by maximizing the information gain from sparsely sampled viewpoints while limiting the total travel distance of the quadcopter. At the core of our method lies a hierarchical volumetric representation that allows the algorithm to distinguish between unknown, free, and occupied space. Furthermore, our information gain-based formulation leverages this representation to handle occlusions in an efficient manner. In addition to the surface geometry, we utilize free-space information to avoid obstacles and determine collision-free flight paths. Our tool can be used to specify the region of interest and to plan trajectories. We demonstrate our method by obtaining a number of compelling 3D reconstructions, and we provide a thorough quantitative evaluation showing improvement over previous state-of-the-art and regular patterns.
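The trade-off the abstract describes, gaining as much new information as possible from a few viewpoints while keeping the total travel distance bounded, can be illustrated with a small greedy sketch. This is not the authors' optimization. It is a simplified stand-in that assumes each candidate viewpoint comes with a precomputed set of voxel ids it would observe, that information gain is just the count of voxels not yet observed, and that travel cost is straight-line distance. The function name `greedy_viewpoints` and the data layout are hypothetical.

```python
from math import dist


def greedy_viewpoints(candidates, start, budget):
    """Pick viewpoints greedily by gain-per-distance under a travel budget.

    candidates: dict mapping a viewpoint name to (position, visible_voxels),
                where position is an (x, y, z) tuple and visible_voxels is a
                set of voxel ids (hypothetical layout, for illustration only).
    start:      (x, y, z) starting position of the vehicle.
    budget:     maximum total straight-line travel distance.

    Returns (ordered viewpoint names, set of observed voxels, distance used).
    """
    path = []
    seen = set()
    pos = start
    remaining = budget
    unused = dict(candidates)

    while unused:
        best_name = None
        best_score = 0.0
        for name, (p, visible) in unused.items():
            cost = dist(pos, p)
            gain = len(visible - seen)  # voxels that are still unobserved
            if gain == 0 or cost > remaining:
                continue  # nothing new to see, or too far for the budget
            score = gain / max(cost, 1e-9)
            if score > best_score:
                best_name, best_score = name, score

        if best_name is None:
            break  # no affordable viewpoint adds information

        p, visible = unused.pop(best_name)
        remaining -= dist(pos, p)
        seen |= visible
        pos = p
        path.append(best_name)

    return path, seen, budget - remaining


if __name__ == "__main__":
    demo = {
        "a": ((1.0, 0.0, 0.0), {1, 2, 3}),
        "b": ((2.0, 0.0, 0.0), {3, 4}),
        "c": ((10.0, 0.0, 0.0), {5, 6, 7, 8}),
    }
    order, observed, used = greedy_viewpoints(demo, (0.0, 0.0, 0.0), 5.0)
    print(order, sorted(observed), used)
```

In the demo, viewpoint "c" would reveal the most voxels but lies beyond the travel budget, so the sketch settles for the two nearby viewpoints. A real planner would also need occlusion-aware visibility and collision-free paths through known free space, both of which this sketch ignores.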


