“DeepLens: shallow depth of field from a single image” – ACM SIGGRAPH HISTORY ARCHIVES

Conference:

    SIGGRAPH Asia 2018

Type(s):

    Technical Paper

Title:

    DeepLens: shallow depth of field from a single image

Session/Category Title:

    Image processing

Presenter(s)/Author(s):

    Wang

Abstract:


    We aim to generate high-resolution shallow depth-of-field (DoF) images from a single all-in-focus image with controllable focal distance and aperture size. To achieve this, we propose a novel neural network model comprising a depth prediction module, a lens blur module, and a guided upsampling module. All modules are differentiable and are learned from data. To train our depth prediction module, we collect a dataset of 2462 RGB-D images captured by mobile phones with a dual-lens camera, and use existing segmentation datasets to improve border prediction. We further leverage a synthetic dataset with known depth to supervise the lens blur and guided upsampling modules. The effectiveness of our system and training strategies is verified in our experiments. Our method can generate high-quality shallow DoF images at high resolution, and it produces significantly fewer artifacts than the baselines and existing solutions for single-image shallow DoF synthesis. Compared with the iPhone portrait mode, a state-of-the-art shallow DoF solution based on a dual-lens depth camera, our method generates comparable results while allowing greater flexibility in choosing focal points and aperture size, and it is not limited to one capture setup.
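    The lens blur the abstract describes follows the classic thin-lens intuition: each pixel's blur radius (circle of confusion) grows with the mismatch between its depth and the chosen focal distance, scaled by the aperture. The paper's lens blur module is a learned, differentiable network; the sketch below is only a naive, non-learned illustration of that idea, and the function names and simplified radius formula are our own assumptions, not the paper's.

```python
import numpy as np

def coc_radius(depth, focal_dist, aperture, max_radius=8.0):
    # Simplified circle-of-confusion radius in pixels: blur grows with
    # |1/focal_dist - 1/depth|, scaled by aperture, clipped to a maximum.
    r = aperture * np.abs(1.0 / focal_dist - 1.0 / depth)
    return np.clip(r, 0.0, max_radius)

def shallow_dof(image, depth, focal_dist, aperture):
    # Naive gather-style blur: each output pixel averages input pixels
    # within a square window sized by its circle-of-confusion radius.
    h, w = image.shape[:2]
    radii = coc_radius(depth, focal_dist, aperture)
    out = np.empty_like(image, dtype=np.float64)
    for y in range(h):
        for x in range(w):
            r = int(round(radii[y, x]))
            y0, y1 = max(0, y - r), min(h, y + r + 1)
            x0, x1 = max(0, x - r), min(w, x + r + 1)
            out[y, x] = image[y0:y1, x0:x1].mean(axis=(0, 1))
    return out
```

    With a depth map equal to the focal distance everywhere, the radius is zero and the image passes through unchanged; pixels far from the focal plane are averaged over progressively larger neighborhoods. The paper replaces this hand-coded gather with a trained network, which avoids the boundary bleeding such naive filtering produces.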


