“Fast depth densification for occlusion-aware augmented reality” – ACM SIGGRAPH HISTORY ARCHIVES

  • 2018 SA Technical Papers_Holynski_Fast depth densification for occlusion-aware augmented reality

Title:

    Fast depth densification for occlusion-aware augmented reality

Session/Category Title: Mixed reality


Abstract:


    Current AR systems only track sparse geometric features but do not compute depth for all pixels. For this reason, most AR effects are pure overlays that can never be occluded by real objects. We present a novel algorithm that propagates sparse depth to every pixel in near real time. The produced depth maps are spatio-temporally smooth but exhibit sharp discontinuities at depth edges. This enables AR effects that can fully interact with and be occluded by the real scene. Our algorithm uses a video and a sparse SLAM reconstruction as input. It starts by estimating soft depth edges from the gradient of optical flow fields. Because optical flow is unreliable near occlusions, we compute forward and backward flow fields and fuse the resulting depth edges using a novel reliability measure. We then localize the depth edges by thinning and aligning them with image edges. Finally, we optimize the propagated depth to be smooth while encouraging discontinuities at the recovered depth edges. We present results for numerous real-world examples and demonstrate the algorithm's effectiveness for several occlusion-aware AR video effects. To quantitatively evaluate our algorithm, we characterize the properties that make depth maps desirable for AR applications and present novel evaluation metrics that capture how well these are satisfied. Our results compare favorably to a set of competitive baseline algorithms in this context.
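The final step of the abstract, propagating sparse depth smoothly while allowing discontinuities at depth edges, can be illustrated with a small edge-aware least-squares sketch. This is not the authors' implementation. The function name `densify_depth`, the weight formula `1 - max(edge) + eps`, the parameters `data_weight`, `eps`, and `iters`, and the Gauss-Seidel solver are all assumptions chosen for illustration. The sketch assumes a soft depth-edge map is already available and minimizes a data term at the sparse samples plus a smoothness term between neighboring pixels that is weakened wherever an edge is present.

```python
def densify_depth(sparse, edges, data_weight=10.0, eps=1e-3, iters=1000):
    """Propagate sparse depth samples to every pixel of a small grid.

    sparse: 2D list holding a float where depth is known, None elsewhere.
    edges:  2D list of soft depth-edge strengths in [0, 1] (assumed given).
    Returns a dense 2D list of depths.
    """
    rows, cols = len(sparse), len(sparse[0])
    known = [v for row in sparse for v in row if v is not None]
    if not known:
        raise ValueError("need at least one sparse depth sample")

    # Start every unknown pixel at the mean of the known samples.
    init = sum(known) / len(known)
    depth = [[init if v is None else v for v in row] for row in sparse]

    def smooth_weight(r0, c0, r1, c1):
        # Hypothetical weighting: smoothness is weakened when either
        # pixel lies on a depth edge; eps keeps the system solvable.
        return 1.0 - max(edges[r0][c0], edges[r1][c1]) + eps

    # Gauss-Seidel sweeps: each pixel becomes the weighted average of
    # its own sample (if any) and its four neighbors.
    for _ in range(iters):
        for r in range(rows):
            for c in range(cols):
                num, den = 0.0, 0.0
                if sparse[r][c] is not None:
                    num += data_weight * sparse[r][c]
                    den += data_weight
                for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                    rr, cc = r + dr, c + dc
                    if 0 <= rr < rows and 0 <= cc < cols:
                        w = smooth_weight(r, c, rr, cc)
                        num += w * depth[rr][cc]
                        den += w
                depth[r][c] = num / den
    return depth


# Two sparse samples separated by a vertical depth edge in column 3.
sparse = [[None] * 7 for _ in range(3)]
sparse[1][0], sparse[1][6] = 1.0, 5.0
edges = [[1.0 if c == 3 else 0.0 for c in range(7)] for _ in range(3)]
dense = densify_depth(sparse, edges)
```

In this toy setup the pixels left of the edge settle near the left sample and the pixels right of it settle near the right sample, while an all-zero edge map yields a gradual ramp between the two. A real-time system would presumably need a much faster solver than these plain sweeps; the sketch only shows the structure of the objective.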

