“Video matting of complex scenes”

  • ©Yung-Yu Chuang, Aseem Agarwala, Brian Curless, David H. Salesin, and Richard Szeliski





    This paper describes a new framework for video matting, the process of pulling a high-quality alpha matte and foreground from a video sequence. The framework builds upon techniques in natural image matting, optical flow computation, and background estimation. User interaction consists of specifying a garbage matte, if background estimation is needed, and hand-drawing keyframe segmentations into “foreground,” “background,” and “unknown.” The segmentations, called trimaps, are interpolated across the video volume using forward and backward optical flow. Competing flow estimates are combined based on information about where flow is likely to be accurate. A Bayesian matting technique uses the flowed trimaps to yield high-quality mattes of moving foreground elements with complex boundaries filmed by a moving camera. A novel technique for smoke matte extraction is also demonstrated.
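As a rough illustration of two ideas the abstract mentions, the sketch below merges a forward-flowed and a backward-flowed trimap per pixel and then composites a pixel with the standard matting equation C = αF + (1 − α)B. This is not the authors' implementation: the function names, the label encoding, and the "higher confidence wins" rule are assumptions made for the example, and the confidence maps are taken as given rather than derived from optical flow.

```python
# Hypothetical label encoding for trimap pixels (an assumption for this sketch).
FOREGROUND, BACKGROUND, UNKNOWN = "F", "B", "U"


def combine_trimaps(fwd, bwd, fwd_conf, bwd_conf):
    """Merge two flowed trimaps pixel by pixel.

    fwd, bwd: 2D lists of labels flowed from the previous and next keyframe.
    fwd_conf, bwd_conf: 2D lists of confidence scores for each flow estimate.
    Where the labels agree, keep the label. Where they disagree, keep the
    label with the higher confidence, and fall back to UNKNOWN on a tie.
    """
    combined = []
    for f_row, b_row, fc_row, bc_row in zip(fwd, bwd, fwd_conf, bwd_conf):
        row = []
        for f, b, fc, bc in zip(f_row, b_row, fc_row, bc_row):
            if f == b:
                row.append(f)
            elif fc > bc:
                row.append(f)
            elif bc > fc:
                row.append(b)
            else:
                row.append(UNKNOWN)
        combined.append(row)
    return combined


def composite(alpha, foreground, background):
    """Compositing equation for one channel value: C = alpha*F + (1-alpha)*B."""
    return alpha * foreground + (1.0 - alpha) * background
```

Falling back to “unknown” on a tie is a conservative choice for this sketch: it leaves ambiguous pixels for the matting step to resolve instead of committing them to foreground or background.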


    1. BARRON, J. L., FLEET, D. J., AND BEAUCHEMIN, S. S. 1994. Performance of optical flow techniques. International Journal of Computer Vision 12, 1, 43-77.
    2. BERMAN, A., DADOURIAN, A., AND VLAHOS, P. 2000. Method for removing from an image the background surrounding a selected object. U.S. Patent 6,134,346.
    3. BLACK, M. J., AND ANANDAN, P. 1996. The robust estimation of multiple motions: Parametric and piecewise-smooth flow fields. Computer Vision and Image Understanding 63, 1, 75-104.
    4. BLAKE, A., AND ISARD, M. 1998. Active Contours. Springer Verlag, London.
    5. CHUANG, Y.-Y., CURLESS, B., SALESIN, D., AND SZELISKI, R. 2001. A Bayesian approach to digital matting. In Proceedings of Computer Vision and Pattern Recognition (CVPR 2001), vol. II, 264-271.
    6. GLEICHER, M. 1995. Image snapping. In Proceedings of ACM SIGGRAPH 95, 183-190.
    7. HILLMAN, P., HANNAH, J., AND RENSHAW, D. 2001. Alpha channel estimation in high resolution images and image sequences. In Proceedings of Computer Vision and Pattern Recognition (CVPR 2001), vol. I, 1063-1068.
    8. KELLY, D. 2000. Digital Composition. The Coriolis Group.
    9. LEE, M.-C., ET AL. 1997. A layered video object coding system using sprite and affine motion model. IEEE Transactions on Circuits and Systems for Video Technology 7, 1, 130-145.
    10. MITSUNAGA, T., YOKOYAMA, T., AND TOTSUKA, T. 1995. Autokey: Human assisted key extraction. In Proceedings of ACM SIGGRAPH 95, 265-272.
    11. MORTENSEN, E. N., AND BARRETT, W. A. 1995. Intelligent scissors for image composition. In Proceedings of ACM SIGGRAPH 95, 191-198.
    12. PORTER, T., AND DUFF, T. 1984. Compositing digital images. In Computer Graphics (Proceedings of ACM SIGGRAPH 84), vol. 18, 253-259.
    13. RUZON, M. A., AND TOMASI, C. 2000. Alpha estimation in natural images. In Proceedings of Computer Vision and Pattern Recognition (CVPR 2000), 18-25.
    14. SMITH, A. R., AND BLINN, J. F. 1996. Blue screen matting. In Proceedings of ACM SIGGRAPH 96, 259-268.
    15. SUN, S., HAYNOR, D., AND KIM, Y. 2000. Motion estimation based on optical flow with adaptive gradients. In Proceedings of International Conference on Image Processing (ICIP 2000), vol. I, 852-855.
    16. SZELISKI, R., AND SHUM, H.-Y. 1997. Creating full view panoramic mosaics and environment maps. In Proceedings of ACM SIGGRAPH 97, 251-258.
    17. WANG, J. Y. A., AND ADELSON, E. H. 1994. Representing moving images with layers. IEEE Transactions on Image Processing 3, 5, 625-638.
