Light field video capture using a learning-based hybrid imaging system

Light field cameras have many advantages over traditional cameras, as they allow the user to change various camera settings after capture. However, capturing light fields requires a huge bandwidth to record the data: a modern light field camera can only take three images per second. This prevents current consumer light field cameras from capturing light field videos. Temporal interpolation at such an extreme scale (10x, from 3 fps to 30 fps) is infeasible, as too much information will be entirely missing between adjacent frames. Instead, we develop a hybrid imaging system, adding another standard video camera to capture the temporal information. Given a 3 fps light field sequence and a standard 30 fps 2D video, our system can then generate a full light field video at 30 fps. We adopt a learning-based approach, which can be decomposed into two steps: spatio-temporal flow estimation and appearance estimation. The flow estimation propagates the angular information from the light field sequence to the 2D video, so we can warp input images to the target view. The appearance estimation then combines these warped images to output the final pixels. The whole process is trained end-to-end using convolutional neural networks. Experimental results demonstrate that our algorithm outperforms current video interpolation methods, enabling consumer light field videography, and making applications such as refocusing and parallax view generation achievable on videos for the first time.
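
The warp-then-combine structure described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the paper learns both steps with convolutional neural networks, whereas this sketch assumes a flow field is already given and substitutes a fixed weighted average for the learned appearance estimation. The names `backward_warp` and `blend` are hypothetical and chosen only for illustration.

```python
import numpy as np

def backward_warp(image, flow):
    """Warp a grayscale image (H, W) to a target view.

    Each output pixel (y, x) samples the input at (y + flow_y, x + flow_x),
    where flow[..., 0] is the horizontal and flow[..., 1] the vertical
    displacement in pixels. Sampling is bilinear; coordinates that fall
    outside the image are clamped to the border (an assumption of this sketch).
    """
    h, w = image.shape
    ys, xs = np.mgrid[0:h, 0:w].astype(np.float64)
    sx = np.clip(xs + flow[..., 0], 0, w - 1)
    sy = np.clip(ys + flow[..., 1], 0, h - 1)

    # Integer corners surrounding each sample position.
    x0 = np.floor(sx).astype(int)
    y0 = np.floor(sy).astype(int)
    x1 = np.minimum(x0 + 1, w - 1)
    y1 = np.minimum(y0 + 1, h - 1)
    fx = sx - x0
    fy = sy - y0

    # Bilinear interpolation between the four corners.
    top = (1 - fx) * image[y0, x0] + fx * image[y0, x1]
    bottom = (1 - fx) * image[y1, x0] + fx * image[y1, x1]
    return (1 - fy) * top + fy * bottom

def blend(warped_images, weights):
    """Combine warped images with normalized fixed weights.

    Stand-in for the learned appearance estimation, which in the paper
    is a CNN rather than a fixed average.
    """
    weights = np.asarray(weights, dtype=np.float64)
    weights = weights / weights.sum()
    return np.tensordot(weights, np.stack(warped_images), axes=1)
```

In such a pipeline, each input image (for example, views from the neighboring 3 fps light field frames and the current 30 fps video frame) would be warped to the target view with its own flow field and the results passed to `blend`. With a 10x frame rate gap, each pair of adjacent light field frames brackets nine video frames for which the light field has to be synthesized.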

SIGGRAPH 2017: Technical Papers

Session/Category Title: Video
