“Live intrinsic video” by Meka, Zollhoefer and Richardt

  • ©Abhimitra Meka, Michael Zollhoefer, Christian Richardt, and Christian Theobalt



Session Title:



    Live intrinsic video




    Intrinsic video decomposition refers to the fundamentally ambiguous task of separating a video stream into its constituent layers, in particular reflectance and shading layers. Such a decomposition is the basis for a variety of video manipulation applications, such as realistic recoloring or retexturing of objects. We present a novel variational approach to tackle this underconstrained inverse problem at real-time frame rates, which enables on-line processing of live video footage. The problem of finding the intrinsic decomposition is formulated as a mixed variational ℓ2-ℓp-optimization problem based on an objective function that is specifically tailored for fast optimization. To this end, we propose a novel combination of sophisticated local spatial and global spatio-temporal priors resulting in temporally coherent decompositions at real-time frame rates without the need for explicit correspondence search. We tackle the resulting high-dimensional, non-convex optimization problem via a novel data-parallel iteratively reweighted least squares solver that runs on commodity graphics hardware. Real-time performance is obtained by combining a local-global solution strategy with hierarchical coarse-to-fine optimization. Compelling real-time augmented reality applications, such as recoloring, material editing and retexturing, are demonstrated in a live setup. Our qualitative and quantitative evaluation shows that we obtain high-quality real-time decompositions even for challenging sequences. Our method is able to outperform state-of-the-art approaches in terms of runtime and result quality — even without user guidance such as scribbles.
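The abstract names the solver family (iteratively reweighted least squares on a mixed ℓ2-ℓp objective) but gives neither the energy terms nor the GPU implementation. As a rough illustration of the general technique only, the sketch below runs IRLS on a 1D toy problem: a quadratic data term plus an ℓp penalty on forward differences. The function names `irls_l2_lp` and `objective`, the parameters `lam`, `p`, `iters` and `eps`, and the toy energy itself are assumptions made for this example; they are not the authors' formulation, priors, or data-parallel solver.

```python
import numpy as np


def objective(x, b, lam, p, eps=1e-4):
    """Smoothed toy energy: ||x - b||_2^2 + lam * sum_i ((D x)_i^2 + eps^2)^(p/2)."""
    d = np.diff(x)
    return np.sum((x - b) ** 2) + lam * np.sum((d * d + eps * eps) ** (p / 2.0))


def irls_l2_lp(b, lam=1.0, p=0.8, iters=30, eps=1e-4):
    """Minimise the toy energy above with iteratively reweighted least squares.

    Hypothetical 1D example: b is a noisy signal, D takes forward differences.
    """
    b = np.asarray(b, dtype=float)
    n = b.size
    # Forward-difference operator D with shape (n - 1, n).
    D = np.diff(np.eye(n), axis=0)
    x = b.copy()
    for _ in range(iters):
        # Reweighting step: the lp term is replaced by a weighted quadratic
        # with w_i = (p / 2) * (d_i^2 + eps^2)^(p/2 - 1), evaluated at the current x.
        d = D @ x
        w = 0.5 * p * (d * d + eps * eps) ** (p / 2.0 - 1.0)
        # Least-squares step: solve (I + lam * D^T W D) x = b.
        A = np.eye(n) + lam * (D.T * w) @ D
        x = np.linalg.solve(A, b)
    return x


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    step = np.concatenate([np.zeros(20), np.ones(20)])
    noisy = step + 0.05 * rng.standard_normal(step.size)
    smooth = irls_l2_lp(noisy, lam=1.0, p=0.8)
    print("energy before:", objective(noisy, noisy, 1.0, 0.8))
    print("energy after: ", objective(smooth, noisy, 1.0, 0.8))
```

Each pass solves one linear system whose weights come from the previous estimate, which is the structure that makes IRLS amenable to the kind of parallel linear solvers the abstract alludes to. This toy version uses a dense solve for readability and makes no attempt at real-time performance.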



ACM Digital Library Publication: