Motion-based video retargeting with optimized crop-and-warp

Yu-Shuen Wang; Hui-Chih Lin; Olga Sorkine-Hornung; Tong-Yee Lee

Conference:

SIGGRAPH 2010

Type:

Technical Papers

Title:

Motion-based video retargeting with optimized crop-and-warp

Presenter(s)/Author(s):

Yu-Shuen Wang

Hui-Chih Lin

Olga Sorkine-Hornung

Tong-Yee Lee

Abstract:

We introduce a video retargeting method that achieves high-quality resizing to arbitrary aspect ratios for complex videos containing diverse camera and dynamic motions. Previous content-aware retargeting methods mostly concentrated on spatial considerations, attempting to preserve the shape of salient objects in each frame by removing or distorting homogeneous background content. However, sacrificeable space is fundamentally limited in video, since object motion makes foreground and background regions correlated, causing waving and squeezing artifacts. We solve the retargeting problem by explicitly employing motion information and by distributing distortion in both spatial and temporal dimensions. We combine novel cropping and warping operators, where the cropping removes temporally recurring contents and the warping utilizes available homogeneous regions to mask deformations while preserving motion. Variational optimization allows us to find the best balance between the two operations, enabling retargeting of challenging videos with complex motions, numerous prominent objects and arbitrary depth variability. Our method compares favorably with state-of-the-art retargeting systems, as demonstrated in the examples and widely supported by the conducted user study.

References:

1. Avidan, S., and Shamir, A. 2007. Seam carving for content-aware image resizing. ACM Trans. Graph. 26, 3, 10.
2. Barnes, C., Shechtman, E., Finkelstein, A., and Goldman, D. B. 2009. PatchMatch: A randomized correspondence algorithm for structural image editing. ACM Trans. Graph. 28, 3.
3. Buatois, L., Caumon, G., and Lévy, B. 2009. Concurrent number cruncher: a GPU implementation of a general sparse linear solver. Int. J. Parallel Emerg. Distrib. Syst. 24, 3, 205–223.
4. Chen, L. Q., Xie, X., Fan, X., Ma, W. Y., Zhang, H. J., and Zhou, H. Q. 2003. A visual attention model for adapting images on small displays. ACM Multimedia Systems Journal 9, 4, 353–364.
5. Cho, T. S., Butman, M., Avidan, S., and Freeman, W. T. 2008. The patch transform and its applications to image editing. In CVPR ’08.
6. David, H. A. 1963. The Method of Paired Comparisons. Charles Griffin & Company.
7. Deselaers, T., Dreuw, P., and Ney, H. 2008. Pan, zoom, scan: Time-coherent, trained automatic video cropping. In CVPR.
8. Dong, W., Zhou, N., Paul, J.-C., and Zhang, X. 2009. Optimized image resizing using seam carving and scaling. ACM Trans. Graph. 28, 5, 1–10.
9. Fan, X., Xie, X., Zhou, H.-Q., and Ma, W.-Y. 2003. Looking into video frames on small displays. In Multimedia ’03, 247–250.
10. Gal, R., Sorkine, O., and Cohen-Or, D. 2006. Feature-aware texturing. In EGSR ’06, 297–303.
11. Itti, L., Koch, C., and Niebur, E. 1998. A model of saliency-based visual attention for rapid scene analysis. IEEE Trans. Pattern Anal. Mach. Intell. 20, 11, 1254–1259.
12. Karni, Z., Freedman, D., and Gotsman, C. 2009. Energy-based image deformation. Comput. Graph. Forum 28, 5, 1257–1268.
13. Krähenbühl, P., Lang, M., Hornung, A., and Gross, M. 2009. A system for retargeting of streaming video. ACM Trans. Graph. 28, 5.
14. Liu, F., and Gleicher, M. 2006. Video retargeting: automating pan and scan. In Multimedia ’06, 241–250.
15. Liu, H., Xie, X., Ma, W.-Y., and Zhang, H.-J. 2003. Automatic browsing of large pictures on mobile devices. In Proceedings of ACM International Conference on Multimedia, 148–155.
16. Pritch, Y., Kav-Venaki, E., and Peleg, S. 2009. Shift-map image editing. In ICCV ’09.
17. Rasheed, Z., and Shah, M. 2003. Scene detection in Hollywood movies and TV shows. In CVPR ’03, vol. 2, II–343–8.
18. Rubinstein, M., Shamir, A., and Avidan, S. 2008. Improved seam carving for video retargeting. ACM Trans. Graph. 27, 3.
19. Rubinstein, M., Shamir, A., and Avidan, S. 2009. Multi-operator media retargeting. ACM Trans. Graph. 28, 3, 23.
20. Santella, A., Agrawala, M., DeCarlo, D., Salesin, D., and Cohen, M. 2006. Gaze-based interaction for semiautomatic photo cropping. In Proceedings of CHI, 771–780.
21. Shamir, A., and Sorkine, O. 2009. Visual media retargeting. In ACM SIGGRAPH Asia Courses.
22. Simakov, D., Caspi, Y., Shechtman, E., and Irani, M. 2008. Summarizing visual data using bidirectional similarity. In CVPR ’08.
23. Suh, B., Ling, H., Bederson, B. B., and Jacobs, D. W. 2003. Automatic thumbnail cropping and its effectiveness. In Proceedings of UIST, 95–104.
24. Viola, P., and Jones, M. J. 2004. Robust real-time face detection. Int. J. Comput. Vision 57, 2, 137–154.
25. Wang, Y.-S., Tai, C.-L., Sorkine, O., and Lee, T.-Y. 2008. Optimized scale-and-stretch for image resizing. ACM Trans. Graph. 27, 5, 118.
26. Wang, Y.-S., Fu, H., Sorkine, O., Lee, T.-Y., and Seidel, H.-P. 2009. Motion-aware temporal coherence for video resizing. ACM Trans. Graph. 28, 5.
27. Werlberger, M., Trobin, W., Pock, T., Wedel, A., Cremers, D., and Bischof, H. 2009. Anisotropic Huber-L1 optical flow. In Proceedings of the British Machine Vision Conference (BMVC).
28. Wolf, L., Guttmann, M., and Cohen-Or, D. 2007. Nonhomogeneous content-driven video-retargeting. In ICCV ’07.
29. Zhang, Y.-F., Hu, S.-M., and Martin, R. R. 2008. Shrinkability maps for content-aware video resizing. In PG ’08.
30. Zhang, G.-X., Cheng, M.-M., Hu, S.-M., and Martin, R. R. 2009. A shape-preserving approach to image resizing. Computer Graphics Forum 28, 7, 1897–1906.
