A Unified Framework for Compression and Compressed Sensing of Light Fields and Light Field Videos

Ehsan Miandji; Saghi Hajisharif; Jonas Unger

Conference:

SIGGRAPH 2019

Type:

Technical Papers

Title:

A Unified Framework for Compression and Compressed Sensing of Light Fields and Light Field Videos

Session/Category Title: Image Science

Presenter(s)/Author(s):

Ehsan Miandji

Saghi Hajisharif

Jonas Unger

Abstract:

In this article, we present a novel dictionary learning framework designed for compression and sampling of light fields and light field videos. Unlike previous methods, where a single dictionary with one-dimensional atoms is learned, we propose to train a Multidimensional Dictionary Ensemble (MDE). We show that learning an ensemble in the native dimensionality of the data promotes sparsity, hence increasing the compression ratio and sampling efficiency. To make maximum use of correlations within the light field data sets, we also introduce a novel nonlocal pre-clustering approach that constructs an Aggregate MDE (AMDE). The pre-clustering not only improves the image quality but also reduces the training time by an order of magnitude in most cases. The decoding algorithm supports efficient local reconstruction of the compressed data, which enables real-time playback of high-resolution light field videos. Moreover, we discuss the application of AMDE to compressed sensing. We present a theoretical analysis that indicates the conditions required for exact recovery of point-sampled light fields that are sparse under AMDE. The analysis provides guidelines for designing efficient compressive light field cameras. We use various synthetic and natural light field and light field video data sets to demonstrate the utility of our approach in comparison with state-of-the-art learning-based dictionaries, as well as established analytical dictionaries.
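
To make the idea of an ensemble of multidimensional dictionaries concrete, the following is a minimal sketch and not the authors' algorithm. It assumes 2D patches for brevity (light field data has more dimensions), uses random orthonormal matrices as stand-ins for learned per-mode dictionaries, and picks for each patch the ensemble member with the lowest reconstruction error at a fixed sparsity. All function names here are hypothetical.

```python
import numpy as np

def random_orthonormal(n, rng):
    """Return an n x n orthonormal matrix (stand-in for a learned per-mode dictionary)."""
    q, _ = np.linalg.qr(rng.standard_normal((n, n)))
    return q

def encode(patch, dicts, k):
    """Transform a 2D patch along each mode and keep the k largest coefficients."""
    u1, u2 = dicts
    coeffs = u1.T @ patch @ u2              # separable transform, no vectorization
    keep = np.argsort(np.abs(coeffs).ravel())[-k:]
    sparse = np.zeros_like(coeffs)
    sparse.flat[keep] = coeffs.flat[keep]   # zero out everything except the top k
    return sparse

def decode(sparse, dicts):
    """Reconstruct a patch from its sparse coefficients."""
    u1, u2 = dicts
    return u1 @ sparse @ u2.T

def best_in_ensemble(patch, ensemble, k):
    """Return (index, coefficients, error) of the dictionary pair that fits best."""
    best = None
    for idx, dicts in enumerate(ensemble):
        sparse = encode(patch, dicts, k)
        err = np.linalg.norm(patch - decode(sparse, dicts))
        if best is None or err < best[2]:
            best = (idx, sparse, err)
    return best

# Assumed toy setup: three dictionary pairs for 8 x 8 patches.
rng = np.random.default_rng(0)
ensemble = [(random_orthonormal(8, rng), random_orthonormal(8, rng)) for _ in range(3)]
patch = rng.standard_normal((8, 8))
index, coefficients, error = best_in_ensemble(patch, ensemble, k=16)
print(index, np.count_nonzero(coefficients), error)
```

Because each mode has its own small dictionary, the patch is never flattened into a one-dimensional vector, which is the contrast with one-dimensional atoms that the abstract draws. Only the index of the chosen pair and the nonzero coefficients would need to be stored per patch in a scheme like this.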

