“JumpCut: non-successive mask transfer and interpolation for video cutout” by Fan, Zhong, Lischinski, Cohen-Or and Chen
Session/Category Title: Video Processing
Abstract:
We introduce JumpCut, a new mask transfer and interpolation method for interactive video cutout. Given a source frame for which a foreground mask is already available, we compute an estimate of the foreground mask at another, typically non-successive, target frame. Observing that the background and foreground regions typically exhibit different motions, we leverage these differences by computing two separate nearest-neighbor fields (split-NNF) from the target to the source frame. These NNFs are then used to jointly predict a coherent labeling of the pixels in the target frame. The same split-NNF is also used to aid a novel edge classifier in detecting silhouette edges (S-edges) that separate the foreground from the background. A modified level set method is then applied to produce a clean mask, based on the pixel labels and the S-edges computed by the previous two steps. The resulting mask transfer method may also be used for coherently interpolating the foreground masks between two distant source frames. Our results demonstrate that the proposed method is significantly more accurate than the existing state-of-the-art on a wide variety of video sequences. Thus, it reduces the required amount of user effort, and provides a basis for an effective interactive video object cutout tool.
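The split-NNF idea described above can be illustrated with a toy sketch (this is not the paper's implementation, which uses a randomized PatchMatch-style search, an edge classifier, and a level set refinement). Here, two brute-force nearest-neighbor fields are computed from the target frame back to the source frame, one restricted to foreground-centered source patches and one to background-centered ones, and each target pixel is labeled by whichever field matches it better. The images, patch size, and search strategy are all simplifications chosen for clarity.

```python
# Toy sketch of mask transfer via a "split" nearest-neighbor field:
# one NNF over source foreground patches, one over source background
# patches, with per-pixel labels decided by the smaller match distance.
# Images are small 2D lists of grayscale values; masks are 0/1 lists.

def patch(img, y, x, r=1):
    """Flatten the (2r+1) x (2r+1) patch centered at (y, x)."""
    return [img[y + dy][x + dx]
            for dy in range(-r, r + 1)
            for dx in range(-r, r + 1)]

def ssd(p, q):
    """Sum of squared differences between two equal-length patches."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def split_nnf_transfer(src, src_mask, tgt, r=1):
    """Estimate a target mask: for each target patch, compare its best
    match among foreground-centered source patches against its best
    match among background-centered ones (the 'split' in split-NNF)."""
    h, w = len(src), len(src[0])
    centers = [(y, x) for y in range(r, h - r) for x in range(r, w - r)]
    fg = [(y, x) for (y, x) in centers if src_mask[y][x]]
    bg = [(y, x) for (y, x) in centers if not src_mask[y][x]]
    est = [[0] * w for _ in range(h)]
    for y in range(r, h - r):
        for x in range(r, w - r):
            p = patch(tgt, y, x, r)
            d_fg = min(ssd(p, patch(src, sy, sx, r)) for sy, sx in fg)
            d_bg = min(ssd(p, patch(src, sy, sx, r)) for sy, sx in bg)
            est[y][x] = 1 if d_fg < d_bg else 0
    return est
```

On a synthetic pair where the foreground (a bright square) translates between frames while the background stays still, the two fields lock onto the two different motions and the transferred mask follows the object, which is the intuition the abstract appeals to.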


