“Learning-based view synthesis for light field cameras”
Session/Category Title: Computational Photography
Abstract:
With the introduction of consumer light field cameras, light field imaging has recently become widespread. However, there is an inherent trade-off between angular and spatial resolution, and thus these cameras often sample sparsely in either the spatial or the angular domain. In this paper, we use machine learning to mitigate this trade-off. Specifically, we propose a novel learning-based approach to synthesize new views from a sparse set of input views. We build upon existing view synthesis techniques and break down the process into disparity and color estimation components. We use two sequential convolutional neural networks to model these two components and train both networks simultaneously by minimizing the error between the synthesized and ground truth images. We demonstrate the performance of our approach using only the four corner sub-aperture views of light fields captured by the Lytro Illum camera. Experimental results show that our approach synthesizes high-quality images that are superior to those of state-of-the-art techniques on a variety of challenging real-world scenes. We believe our method could potentially decrease the required angular resolution of consumer light field cameras, allowing their spatial resolution to increase.
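The abstract describes a two-stage pipeline: a disparity-estimation CNN followed by a color-estimation CNN, trained jointly by back-propagating the error of the synthesized view. The sketch below illustrates that structure in PyTorch; the layer counts, channel widths, input features, loss, and warping routine are illustrative assumptions and do not reproduce the paper's exact architecture or training details.

import torch
import torch.nn as nn
import torch.nn.functional as F

def conv_relu(in_ch, out_ch):
    # Basic convolution + ReLU unit used by both networks in this sketch.
    return nn.Sequential(nn.Conv2d(in_ch, out_ch, 3, padding=1), nn.ReLU(inplace=True))

class DisparityCNN(nn.Module):
    # Predicts a disparity map for the novel view from features derived
    # from the four corner sub-aperture views (feature construction assumed).
    def __init__(self, in_ch=8):
        super().__init__()
        self.net = nn.Sequential(conv_relu(in_ch, 64), conv_relu(64, 64),
                                 conv_relu(64, 32), nn.Conv2d(32, 1, 3, padding=1))

    def forward(self, features):
        return self.net(features)

class ColorCNN(nn.Module):
    # Predicts the final RGB view from the four corner views warped by the
    # estimated disparity, concatenated with the disparity map itself.
    def __init__(self, in_ch=13):  # assumed: 4 warped RGB views (12) + 1 disparity channel
        super().__init__()
        self.net = nn.Sequential(conv_relu(in_ch, 64), conv_relu(64, 32),
                                 nn.Conv2d(32, 3, 3, padding=1))

    def forward(self, warped_with_disparity):
        return self.net(warped_with_disparity)

disparity_net, color_net = DisparityCNN(), ColorCNN()
optimizer = torch.optim.Adam(list(disparity_net.parameters()) +
                             list(color_net.parameters()), lr=1e-4)

def training_step(features, corner_views, warp_fn, ground_truth):
    # warp_fn is an assumed differentiable warping routine that shifts the
    # four corner views to the novel viewpoint using the estimated disparity.
    disparity = disparity_net(features)               # (B, 1, H, W)
    warped = warp_fn(corner_views, disparity)         # (B, 12, H, W) assumed
    prediction = color_net(torch.cat([warped, disparity], dim=1))
    loss = F.mse_loss(prediction, ground_truth)       # L2 reconstruction error as an assumed loss
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

Because both networks sit in one computation graph, the gradient of the color loss flows back through the warping into the disparity network, which is what allows the two components to be trained simultaneously as described in the abstract.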
References:
1. Adelson, E. H., and Wang, J. Y. A. 1992. Single lens stereo with a plenoptic camera. IEEE PAMI 14, 2, 99–106.
2. Bishop, T. E., Zanetti, S., and Favaro, P. 2009. Light field superresolution. In IEEE ICCP, 1–9.
3. Burger, H. C., Schuler, C. J., and Harmeling, S. 2012. Image denoising: Can plain neural networks compete with BM3D? In IEEE CVPR, 2392–2399.
4. Chaurasia, G., Sorkine, O., and Drettakis, G. 2011. Silhouette-aware warping for image-based rendering. In EGSR, 1223–1232.
5. Chaurasia, G., Duchene, S., Sorkine-Hornung, O., and Drettakis, G. 2013. Depth synthesis and local warps for plausible image-based navigation. ACM TOG 32, 3, 30:1–30:12.
6. Cho, D., Lee, M., Kim, S., and Tai, Y.-W. 2013. Modeling the calibration pipeline of the lytro camera for high quality light-field image reconstruction. In IEEE ICCV, 3280–3287.
7. Dong, C., Loy, C. C., He, K., and Tang, X. 2014. Learning a deep convolutional network for image super-resolution. In ECCV, 184–199.
8. Dosovitskiy, A., Springenberg, J. T., and Brox, T. 2015. Learning to generate chairs with convolutional neural networks. In IEEE CVPR, 1538–1546.
9. Eisemann, M., De Decker, B., Magnor, M., Bekaert, P., De Aguiar, E., Ahmed, N., Theobalt, C., and Sellent, A. 2008. Floating textures. CGF 27, 2, 409–418.
10. Fitzgibbon, A., Wexler, Y., and Zisserman, A. 2003. Image-based rendering using image-based priors. In IEEE ICCV, vol. 2, 1176–1183.
11. Flynn, J., Neulander, I., Philbin, J., and Snavely, N. 2016. DeepStereo: Learning to predict new views from the world's imagery. In IEEE CVPR, 5515–5524.
12. Furukawa, Y., and Ponce, J. 2010. Accurate, dense, and robust multiview stereopsis. IEEE PAMI 32, 8, 1362–1376.
13. Georgiev, T., Zheng, K. C., Curless, B., Salesin, D., Nayar, S., and Intwala, C. 2006. Spatio-angular resolution tradeoffs in integral photography. In EGSR, 263–272.
14. Girod, B., Chang, C.-L., Ramanathan, P., and Zhu, X. 2003. Light field compression using disparity-compensated lifting. In IEEE ICME, vol. 1, I-373–I-376.
15. Glorot, X., and Bengio, Y. 2010. Understanding the difficulty of training deep feedforward neural networks. In AISTATS, vol. 9, 249–256.
16. Goesele, M., Ackermann, J., Fuhrmann, S., Haubold, C., Klowsky, R., Steedly, D., and Szeliski, R. 2010. Ambient point clouds for view interpolation. ACM TOG 29, 4, 95.
17. Heber, S., and Pock, T. 2016. Convolutional networks for shape from light field. In IEEE CVPR.
18. Jeon, H. G., Park, J., Choe, G., Park, J., Bok, Y., Tai, Y. W., and Kweon, I. S. 2015. Accurate depth map estimation from a lenslet light field camera. In IEEE CVPR, 1547–1555.
19. Kholgade, N., Simon, T., Efros, A., and Sheikh, Y. 2014. 3D object manipulation in a single photograph using stock 3D models. ACM TOG 33, 4, 127.
20. Kingma, D., and Ba, J. 2014. Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980.
21. Levin, A., and Durand, F. 2010. Linear view synthesis using a dimensionality gap light field prior. In IEEE CVPR, 1831–1838.
22. Levoy, M., and Hanrahan, P. 1996. Light field rendering. In ACM SIGGRAPH, 31–42.
23. Lytro, 2016. https://www.lytro.com/.
24. Mahajan, D., Huang, F.-C., Matusik, W., Ramamoorthi, R., and Belhumeur, P. 2009. Moving gradients: a path-based method for plausible image interpolation. ACM TOG 28, 3, 42.
25. Marwah, K., Wetzstein, G., Bando, Y., and Raskar, R. 2013. Compressive light field photography using overcomplete dictionaries and optimized projections. ACM TOG 32, 4, 46:1–46:12.
26. Mitra, K., and Veeraraghavan, A. 2012. Light field denoising, light field superresolution and stereo camera based refocussing using a GMM light field patch prior. In IEEE CVPRW, 22–28.
27. Ng, R., Levoy, M., Brédif, M., Duval, G., Horowitz, M., and Hanrahan, P. 2005. Light field photography with a hand-held plenoptic camera. Computer Science Technical Report CSTR 2, 11, 1–11.
28. Pelican Imaging, 2016. Capture life in 3D. http://www.pelicanimaging.com/.
29. Raj, A., Lowney, M., Shah, R., and Wetzstein, G., 2016. Stanford lytro light field archive. http://lightfields.stanford.edu/.
30. RayTrix, 2016. 3D light field camera technology. https://www.raytrix.de/.
31. Rumelhart, D. E., Hinton, G. E., and Williams, R. J. 1986. Learning representations by back-propagating errors. Nature 323, 533–536.
32. Schedl, D. C., Birklbauer, C., and Bimber, O. 2015. Directional super-resolution by means of coded sampling and guided upsampling. In IEEE ICCP, 1–10.
33. Shechtman, E., Rav-Acha, A., Irani, M., and Seitz, S. 2010. Regenerative morphing. In IEEE CVPR, 615–622.
34. Shi, L., Hassanieh, H., Davis, A., Katabi, D., and Durand, F. 2014. Light field reconstruction using sparsity in the continuous Fourier domain. ACM TOG 34, 1, 12:1–12:13.
35. Su, H., Wang, F., Yi, L., and Guibas, L. 2014. 3D-assisted image feature synthesis for novel views of an object. arXiv preprint arXiv:1412.0003.
36. Sun, J., Cao, W., Xu, Z., and Ponce, J. 2015. Learning a convolutional neural network for non-uniform motion blur removal. In IEEE CVPR, 769–777.
37. Tao, M. W., Hadap, S., Malik, J., and Ramamoorthi, R. 2013. Depth from combining defocus and correspondence using light-field cameras. In IEEE ICCV, 673–680.
38. Tao, M. W., Srinivasan, P. P., Malik, J., Rusinkiewicz, S., and Ramamoorthi, R. 2015. Depth from shading, defocus, and correspondence using light-field angular coherence. In IEEE CVPR, 1940–1948.
39. Tatarchenko, M., Dosovitskiy, A., and Brox, T. 2015. Single-view to multi-view: Reconstructing unseen views with a convolutional network. CoRR abs/1511.06702.
40. Tong, X., and Gray, R. M. 2003. Interactive rendering from compressed light fields. IEEE TCSVT 13, 11 (Nov), 1080–1091.
41. Vedaldi, A., and Lenc, K. 2015. MatConvNet: Convolutional neural networks for Matlab. In ACMMM, 689–692.
42. Wang, Z., Bovik, A., Sheikh, H., and Simoncelli, E. 2004. Image quality assessment: from error visibility to structural similarity. IEEE TIP 13, 4 (April), 600–612.
43. Wang, T. C., Efros, A. A., and Ramamoorthi, R. 2015. Occlusion-aware depth estimation using light-field cameras. In IEEE ICCV, 3487–3495.
44. Wanner, S., and Goldluecke, B. 2012. Globally consistent depth labeling of 4D light fields. In IEEE CVPR, 41–48.
45. Wanner, S., and Goldluecke, B. 2014. Variational light field analysis for disparity estimation and super-resolution. IEEE PAMI 36, 3, 606–619.
46. Wilburn, B., Joshi, N., Vaish, V., Talvala, E.-V., Antunez, E., Barth, A., Adams, A., Horowitz, M., and Levoy, M. 2005. High performance imaging using large camera arrays. ACM TOG 24, 3, 765–776.
47. Yang, J., Reed, S. E., Yang, M.-H., and Lee, H. 2015. Weakly-supervised disentangling with recurrent transformations for 3D view synthesis. In NIPS, 1099–1107.
48. Yoon, Y., Jeon, H. G., Yoo, D., Lee, J. Y., and Kweon, I. S. 2015. Learning a deep convolutional network for light-field image super-resolution. In IEEE ICCV Workshop, 57–65.
49. Zhang, Z., Liu, Y., and Dai, Q. 2015. Light field from micro-baseline image pair. In IEEE CVPR, 3800–3809.
50. Zhang, F. L., Wang, J., Shechtman, E., Zhou, Z. Y., Shi, J. X., and Hu, S. M. 2016. PlenoPatch: Patch-based plenoptic image manipulation. IEEE TVCG PP, 99, 1–1.
51. Zhou, T., Tulsiani, S., Sun, W., Malik, J., and Efros, A. A. 2016. View synthesis by appearance flow. CoRR abs/1605.03557.


