“Seeing through obstructions with diffractive cloaking” by Shi, Bahat, Baek, Fu, Amata, et al. …

  • ©Zheng Shi, Yuval Bahat, Seung-Hwan Baek, Qiang Fu, Hadi Amata, Xiao Li, Praneeth Chakravarthula, Wolfgang Heidrich, and Felix Heide




    Seeing through obstructions with diffractive cloaking



    Unwanted camera obstruction can severely degrade captured images, including both scene occluders near the camera and partial occlusions of the camera cover glass. Such occlusions can cause catastrophic failures for various scene understanding tasks such as semantic segmentation, object detection, and depth estimation. Existing camera arrays capture multiple redundant views of a scene to see around thin occlusions. Such multi-camera systems effectively form a large synthetic aperture, which can suppress nearby occluders with a large defocus blur, but significantly increase the overall form factor of the imaging setup. In this work, we propose a monocular single-shot imaging approach that optically cloaks obstructions by emulating a large array. Instead of relying on different camera views, we learn a diffractive optical element (DOE) that performs depth-dependent optical encoding, scattering nearby occlusions while allowing paraxial wavefronts to be focused. We computationally reconstruct unobstructed images from these superposed measurements with a neural network that is trained jointly with the optical layer of the proposed imaging system. We assess the proposed method in simulation and with an experimental prototype, validating that the proposed computational camera is capable of recovering occluded scene information in the presence of severe camera obstruction.


    1. Seung-Hwan Baek and Felix Heide. 2021. Polka Lines: Learning Structured Illumination and Reconstruction for Active Stereo. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). 5757–5767.Google ScholarCross Ref
    2. Seung-Hwan Baek, Hayato Ikoma, Daniel S Jeon, Yuqi Li, Wolfgang Heidrich, Gordon Wetzstein, and Min H Kim. 2021. Single-shot Hyperspectral-Depth Imaging with Learned Diffractive Optics. In Proceedings of the IEEE/CVF International Conference on Computer Vision. 2651–2660.Google ScholarCross Ref
    3. Holger Caesar, Varun Bankiti, Alex H Lang, Sourabh Vora, Venice Erin Liong, Qiang Xu, Anush Krishnan, Yu Pan, Giancarlo Baldan, and Oscar Beijbom. 2020. nuscenes: A multimodal dataset for autonomous driving. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition. 11621–11631.Google ScholarCross Ref
    4. Wenshan Cai, Uday K Chettiar, Alexander V Kildishev, and Vladimir M Shalaev. 2007. Optical cloaking with metamaterials. Nature photonics 1, 4 (2007), 224–227.Google Scholar
    5. Ayan Chakrabarti. 2016. Learning sensor multiplexing design through back-propagation. Advances in Neural Information Processing Systems 29 (2016).Google Scholar
    6. Julie Chang and Gordon Wetzstein. 2019. Deep Optics for Monocular Depth Estimation and 3D Object Detection. ArXiv abs/1904.08601 (2019).Google Scholar
    7. Hongsheng Chen, Bin Zheng, Lian Shen, Huaping Wang, Xianmin Zhang, Nikolay I Zheludev, and Baile Zhang. 2013. Ray-optics cloaking devices for large objects in incoherent natural light. Nature communications 4, 1 (2013), 1–6.Google Scholar
    8. Xianzhong Chen, Yu Luo, Jingjing Zhang, Kyle Jiang, John B Pendry, and Shuang Zhang. 2011. Macroscopic invisibility cloaking of visible light. Nature Communications 2, 1 (2011), 1–6.Google ScholarCross Ref
    9. Ilya Chugunov, Seung-Hwan Baek, Qiang Fu, Wolfgang Heidrich, and Felix Heide. 2021. Mask-ToF: Learning Microlens Masks for Flying Pixel Correction in Time-of-Flight Imaging. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 9116–9126.Google ScholarCross Ref
    10. Marius Cordts, Mohamed Omran, Sebastian Ramos, Timo Rehfeld, Markus Enzweiler, Rodrigo Benenson, Uwe Franke, Stefan Roth, and Bernt Schiele. 2016. The Cityscapes Dataset for Semantic Urban Scene Understanding. In Proc. of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR).Google ScholarCross Ref
    11. Don M Cottrell, Jeffrey A Davis, Theodore R Hedman, and Rodger A Lilly. 1990. Multiple imaging phase-encoded optical elements written as programmable spatial light modulators. Applied optics 29, 17 (1990), 2505–2509.Google Scholar
    12. Chen Du, Byeongkeun Kang, Zheng Xu, Ji Dai, and Truong Nguyen. 2018. Accurate and efficient video de-fencing using convolutional neural networks and temporal information. In 2018 IEEE International Conference on Multimedia and Expo (ICME). IEEE, 1–6.Google ScholarCross Ref
    13. Tolga Ergin, Nicolas Stenger, Patrice Brenner, John B Pendry, and Martin Wegener. 2010. Three-dimensional invisibility cloak at optical wavelengths. science 328, 5976 (2010), 337–339.Google Scholar
    14. Muhammad Shahid Farid, Arif Mahmood, and Marco Grangetto. 2016. Image de-fencing framework with hybrid inpainting algorithm. Signal, Image and Video Processing 10, 7 (2016), 1193–1201.Google ScholarCross Ref
    15. Ficosa. 2017. Ficosa Sensor and Camera Cleaning System. Accessed Oct 18, 2021. (2017). https://www.ficosa.com/products/underhood/sensor-and-camera-cleaning/Google Scholar
    16. Adrian P Gaylard, Kerry Kirwan, and Duncan A Lockerby. 2017. Surface contamination of cars: A review. Proceedings of the Institution of Mechanical Engineers, Part D: Journal of Automobile Engineering 231, 9 (2017), 1160–1176.Google ScholarCross Ref
    17. J.W. Goodman. 2005. Introduction to Fourier Optics. W. H. Freeman. https://books.google.com/books?id=ow5xs_Rtt9ACGoogle Scholar
    18. Xiefan Guo, Hongyu Yang, and Di Huang. 2021. Image Inpainting via Conditional Texture and Structure Dual Generation. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV). 14134–14143.Google ScholarCross Ref
    19. Divyanshu Gupta, Shorya Jain, Utkarsh Tripathi, Pratik Chattopadhyay, and Lipo Wang. 2021. A robust and efficient image de-fencing approach using conditional generative adversarial networks. Signal, Image and Video Processing 15, 2 (2021), 297–305.Google ScholarCross Ref
    20. Anna-Karin Gustavsson, Petar N Petrov, Maurice Y Lee, Yoav Shechtman, and WE Moerner. 2018. 3D single-molecule super-resolution microscopy with a tilted light sheet. Nature communications 9, 1 (2018), 1–8.Google Scholar
    21. Harel Haim, Shay Elmalem, Raja Giryes, Alex Bronstein, and Emanuel Marom. 2018. Depth Estimation From a Single Image Using Deep Learned Phase Coded Mask. IEEE Transactions on Computational Imaging 4 (2018), 298–310.Google ScholarCross Ref
    22. Zhixiang Hao, Shaodi You, Yu Li, Kunming Li, and Feng Lu. 2019. Learning from synthetic photorealistic raindrop for single image raindrop removal. In Proceedings of the IEEE/CVF International Conference on Computer Vision Workshops. 0–0.Google ScholarCross Ref
    23. Lei He, Guanghui Wang, and Zhanyi Hu. 2018. Learning Depth From Single Images With Deep Neural Network Embedding Focal Length. IEEE Transactions on Image Processing 27 (2018), 4676–4689.Google ScholarCross Ref
    24. Mazin Hnewa and Hayder Radha. 2020. Object Detection Under Rainy Conditions for Autonomous Vehicles: A Review of State-of-the-Art and Emerging Techniques. IEEE Signal Processing Magazine 38, 1 (2020), 53–67.Google ScholarCross Ref
    25. Roarke Horstmeyer, Richard Y. Chen, Barbara Kappes, and Benjamin Judkewitz. 2017. Convolutional neural networks that teach microscopes how to image. ArXiv abs/1709.07223 (2017).Google Scholar
    26. John C. Howell, J. Benjamin Howell, and Joseph S. Choi. 2014. Amplitude-only, passive, broadband, optical spatial cloaking of very large objects. Applied Optics 53, 9 (Mar 2014), 1958. Google ScholarCross Ref
    27. Aaron Isaksen, Leonard McMillan, and Steven J Gortler. 2000. Dynamically reparameterized light fields. In Proceedings of the 27th annual conference on Computer graphics and interactive techniques. 297–306.Google ScholarDigital Library
    28. Julian Iseringhausen, Bastian Goldlücke, Nina Pesheva, Stanimir Iliev, Alexander Wender, Martin Fuchs, and Matthias B Hullin. 2017. 4D imaging through spray-on optics. ACM Transactions on Graphics (TOG) 36, 4 (2017), 1–11.Google ScholarDigital Library
    29. Zijie Jiang, Qingxuan Liang, Zhaohui Li, Tianning Chen, Dichen Li, and Yang Hao. 2020. A 3D Carpet Cloak with Non-Euclidean Metasurfaces. Advanced Optical Materials 8, 19 (2020), 2000827.Google ScholarCross Ref
    30. Sankaraganesh Jonna, Krishna K Nakka, and Rajiv R Sahay. 2015. My camera can see through fences: A deep learning approach for image de-fencing. In 2015 3rd IAPR Asian Conference on Pattern Recognition (ACPR). IEEE, 261–265.Google ScholarCross Ref
    31. Sankaraganesh Jonna, Krishna K Nakka, and Rajiv R Sahay. 2016. Deep learning based fence segmentation and removal from an image using a video sequence. In European Conference on Computer Vision. Springer, 836–851.Google ScholarCross Ref
    32. Michael Kellman, Emrah Bostan, Michael Chen, and Laura Waller. 2019. Data-Driven Design for Fourier Ptychographic Microscopy. In 2019 IEEE International Conference on Computational Photography (ICCP). IEEE, 1–8.Google Scholar
    33. Orest Kupyn, Tetiana Martyniuk, Junru Wu, and Zhangyang Wang. 2019. DeblurGANv2: Deblurring (Orders-of-Magnitude) Faster and Better. In The IEEE International Conference on Computer Vision (ICCV).Google Scholar
    34. Xiaoyu Li, Bo Zhang, Jing Liao, and Pedro V. Sander. 2021. Let’s See Clearly: Contaminant Artifact Removal for Moving Cameras. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV). 2011–2020.Google Scholar
    35. Yanxi Liu, Tamara Belkina, James Hays, and Roberto Lublinerman. 2008. Image defencing. 2008 IEEE Conference on Computer Vision and Pattern Recognition (2008), 1–8.Google Scholar
    36. Yu-Lun Liu, Wei-Sheng Lai, Ming-Hsuan Yang, Yung-Yu Chuang, and Jia-Bin Huang. 2020. Learning to see through obstructions. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 14215–14224.Google ScholarCross Ref
    37. Julio Marco, Quercus Hernandez, Adolfo Muñoz, Yue Dong, Adrián Jarabo, Min H. Kim, Xin Tong, and Diego Gutierrez. 2017. DeepToF: off-the-shelf real-time correction of multipath interference in time-of-flight imaging. ACM Trans. Graph. 36 (2017), 219:1–219:12.Google ScholarDigital Library
    38. Takuro Matsui and Masaaki Ikehara. 2019. Single-image fence removal using deep convolutional neural network. IEEE Access 8 (2019), 38846–38854.Google ScholarCross Ref
    39. T. Matsui and M. Ikehara. 2020. Single-Image Fence Removal Using Deep Convolutional Neural Network. IEEE Access 8 (2020), 38846–38854.Google ScholarCross Ref
    40. Christopher A Metzler, Hayato Ikoma, Yifan Peng, and Gordon Wetzstein. 2019. Deep Optics for Single-shot High-dynamic-range Imaging. arXiv preprint arXiv:1908.00620 (2019).Google Scholar
    41. Rolf Monrad. 2017. Mostad Mekaniske – Apparatus for Cleaning Object Surface. Accessed Oct 18, 2021. (2017). https://uspto.report/patent/app/20200188965Google Scholar
    42. Ignacio Moreno, Don M Cottrell, Jeffrey A Davis, María M Sánchez-López, and Benjamin K Gutierrez. 2020. In-phase sub-Nyquist lenslet arrays encoded onto spatial light modulators. JOSA A 37, 9 (2020), 1417–1422.Google ScholarCross Ref
    43. Elias Nehme, Daniel Freedman, Racheli Gordon, Boris Ferdman, Lucien E Weiss, Onit Alalouf, Tal Naor, Reut Orange, Tomer Michaeli, and Yoav Shechtman. 2020. Deep-STORM3D: dense 3D localization microscopy and PSF design by deep learning. Nature methods 17, 7 (2020), 734–740.Google Scholar
    44. Orlaco. 2013. Orlaco All Time Vision Camera. Accessed Oct 18, 2021. (2013). https://rmtequip.com/en/products/orlaco-all-time-vision-cameraGoogle Scholar
    45. Sri Rama Prasanna Pavani and Rafael Piestun. 2008. Three dimensional tracking of fluorescent microparticles using a photon-limited double-helix response system. Optics express 16, 26 (2008), 22048–22057.Google Scholar
    46. Zhao Pei, Yanning Zhang, Xida Chen, and Yee-Hong Yang. 2013. Synthetic aperture imaging using pixel labeling via energy minimization. Pattern Recognition 46, 1 (2013), 174–187.Google ScholarDigital Library
    47. Yifan Peng, Qilin Sun, Xiong Dun, Gordon Wetzstein, Wolfgang Heidrich, and Felix Heide. 2019. Learned large field-of-view imaging with thin-plate optics. ACM Trans. Graph. (Proc. Siggraph Asia) 38, 6 (2019), 219–1.Google ScholarDigital Library
    48. Ken Perlin. 1985. An image synthesizer. ACM Siggraph Computer Graphics 19, 3 (1985), 287–296.Google ScholarDigital Library
    49. Rui Qian, Robby T Tan, Wenhan Yang, Jiajun Su, and Jiaying Liu. 2018. Attentive generative adversarial network for raindrop removal from a single image. In Proceedings of the IEEE conference on computer vision and pattern recognition. 2482–2491.Google ScholarCross Ref
    50. Olaf Ronneberger, Philipp Fischer, and Thomas Brox. 2015. U-net: Convolutional networks for biomedical image segmentation. In International Conference on Medical image computing and computer-assisted intervention. Springer, 234–241.Google ScholarCross Ref
    51. Rotoclear S3. 2016. Rotoclear S3 Spin Window. Accessed Oct 18, 2021. (2016). https://cromar.co.uk/products/accessories/roto-clear-spin-window/Google Scholar
    52. Yoav Shechtman, Lucien E Weiss, Adam S. Backer, Maurice Y. Lee, and W E Moerner. 2016. Multicolour localization microscopy by point-spread-function engineering. Nature photonics 10 (2016), 590–594.Google Scholar
    53. Vincent Sitzmann, Steven Diamond, Yifan Peng, Xiong Dun, Stephen Boyd, Wolfgang Heidrich, Felix Heide, and Gordon Wetzstein. 2018. End-to-end optimization of optics and image processing for achromatic extended depth of field and super-resolution imaging. ACM Transactions on Graphics (TOG) 37, 4 (2018), 114.Google ScholarDigital Library
    54. Shuochen Su, Felix Heide, Gordon Wetzstein, and Wolfgang Heidrich. 2018. Deep End-to-End Time-of-Flight Imaging. 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition (2018), 6383–6392.Google Scholar
    55. Qilin Sun, Ethan Tseng, Qiang Fu, Wolfgang Heidrich, and Felix Heide. 2020. Learning Rank-1 Diffractive Optics for Single-shot High Dynamic Range Imaging. In The IEEE Conference on Computer Vision and Pattern Recognition (CVPR).Google ScholarCross Ref
    56. Qilin Sun, Congli Wang, Fu Qiang, Dun Xiong, and Heidrich Wolfgang. 2021. End-to-end complex lens design with differentiable ray tracing. ACM Transactions on Graphics (Proc. Siggraph) 40, 4 (2021), 1–13.Google ScholarDigital Library
    57. Roman Suvorov, Elizaveta Logacheva, Anton Mashikhin, Anastasia Remizova, Arsenii Ashukha, Aleksei Silvestrov, Naejin Kong, Harshith Goka, Kiwoong Park, and Victor Lempitsky. 2021. Resolution-robust Large Mask Inpainting with Fourier Convolutions. arXiv preprint arXiv:2109.07161 (2021).Google Scholar
    58. Ethan Tseng, Shane Colburn, James Whitehead, Luocheng Huang, Seung-Hwan Baek, Arka Majumdar, and Felix Heide. 2021a. Neural Nano-Optics for High-quality Thin Lens Imaging. Nature Communications 12, 1 (2021), 6493.Google ScholarCross Ref
    59. Ethan Tseng, Ali Mosleh, Fahim Mannan, Karl St-Arnaud, Avinash Sharma, Yifan Peng, Alexander Braun, Derek Nowrouzezahrai, Jean-Francois Lalonde, and Felix Heide. 2021b. Differentiable Compound Optics and Processing Pipeline Optimization for End-to-end Camera Design. ACM Transactions on Graphics (TOG) 40, 2, Article 18 (2021).Google ScholarDigital Library
    60. Michal Uricar, Ganesh Sistu, Hazem Rashed, Antonin Vobecky, Varun Ravi Kumar, Pavel Krizek, Fabian Burger, and Senthil Yogamani. 2021. Let’s Get Dirty: GAN Based Data Augmentation for Camera Lens Soiling Detection in Autonomous Driving. In Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision. 766–775.Google ScholarCross Ref
    61. Vaibhav Vaish, Gaurav Garg, Eino-Ville Talvala, Emilio Antunez, Bennett Wilburn, Mark Horowitz, and Marc Levoy. 2005. Synthetic aperture focusing using a shear-warp factorization of the viewing transform. In 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR’05)-Workshops. IEEE, 129–129.Google ScholarDigital Library
    62. Vaibhav Vaish, Marc Levoy, Richard Szeliski, C Lawrence Zitnick, and Sing Bing Kang. 2006. Reconstructing occluded surfaces using synthetic apertures: Stereo, focus and robust measures. In 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR’06), Vol. 2. IEEE, 2331–2338.Google Scholar
    63. Vaibhav Vaish, Bennett Wilburn, Neel Joshi, and Marc Levoy. 2004. Using plane+ parallax for calibrating dense camera arrays. In Proceedings of the 2004 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2004. CVPR 2004., Vol. 1. IEEE, I–I.Google ScholarCross Ref
    64. Yingqian Wang, Tianhao Wu, Jungang Yang, Longguang Wang, Wei An, and Yulan Guo. 2020. DeOccNet: Learning to see through foreground occlusions in light fields. In Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision. 118–127.Google ScholarCross Ref
    65. Waymo. 2019. Waymo Open Dataset: An autonomous driving dataset. (2019).Google Scholar
    66. Bennett Wilburn, Neel Joshi, Vaibhav Vaish, Eino-Ville Talvala, Emilio Antunez, Adam Barth, Andrew Adams, Mark Horowitz, and Marc Levoy. 2005. High Performance Imaging Using Large Camera Arrays. ACM Trans. Graph. 24, 3 (Jul 2005), 765–776. Google ScholarDigital Library
    67. Yicheng Wu, Vivek Boominathan, Huaijin Chen, Aswin Sankaranarayanan, and Ashok Veeraraghavan. 2019. PhaseCam3D — Learning Phase Masks for Passive Single View Depth Estimation. 2019 IEEE International Conference on Computational Photography (ICCP) (2019), 1–12.Google ScholarCross Ref
    68. Zhaolin Xiao, Lipeng Si, and Guoqing Zhou. 2017. Seeing beyond foreground occlusion: a joint framework for sap-based scene depth and appearance reconstruction. IEEE Journal of Selected Topics in Signal Processing 11, 7 (2017), 979–991.Google ScholarCross Ref
    69. Tianfan Xue, Michael Rubinstein, Ce Liu, and William T. Freeman. 2015. A Computational Approach for Obstruction-Free Photography. ACM Transactions on Graphics (Proc. SIGGRAPH) 34, 4 (2015).Google Scholar
    70. Atsushi Yamashita, Akiyoshi Matsui, and Toru Kaneko. 2010. Fence removal from multi-focus images. In 2010 20th International Conference on Pattern Recognition. IEEE, 4532–4535.Google ScholarDigital Library
    71. Zhaoyi Yan, Xiaoming Li, Mu Li, Wangmeng Zuo, and Shiguang Shan. 2018. Shift-net: Image inpainting via deep feature rearrangement. In Proceedings of the European conference on computer vision (ECCV). 1–17.Google ScholarDigital Library
    72. Tao Yang, Yanning Zhang, Jingyi Yu, Jing Li, Wenguang Ma, Xiaomin Tong, Rui Yu, and Lingyan Ran. 2014. All-in-focus synthetic aperture imaging. In European Conference on Computer Vision. Springer, 1–15.Google ScholarCross Ref
    73. Wenxia Yang, Xin Li, and Liang Zhang. 2021. Toward semantic image inpainting: where global context meets local geometry. Journal of Electronic Imaging 30, 2 (2021), 023028.Google ScholarCross Ref
    74. Senthil Yogamani, Ciarán Hughes, Jonathan Horgan, Ganesh Sistu, Padraig Varley, Derek O’Dea, Michal Uricár, Stefan Milz, Martin Simon, Karl Amende, et al. 2019. Woodscape: A multi-task, multi-camera fisheye dataset for autonomous driving. In Proceedings of the IEEE/CVF International Conference on Computer Vision. 9308–9318.Google ScholarCross Ref
    75. Shaodi You, Robby T Tan, Rei Kawakami, and Katsushi Ikeuchi. 2013. Adherent raindrop detection and removal in video. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 1035–1042.Google ScholarDigital Library
    76. Jiahui Yu, Zhe Lin, Jimei Yang, Xiaohui Shen, Xin Lu, and Thomas S Huang. 2019. Free-form image inpainting with gated convolution. In Proceedings of the IEEE/CVF International Conference on Computer Vision. 4471–4480.Google ScholarCross Ref
    77. Bolei Zhou, Agata Lapedriza, Aditya Khosla, Aude Oliva, and Antonio Torralba. 2017. Places: A 10 million Image Database for Scene Recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence (2017).Google Scholar

ACM Digital Library Publication: