Scalable and coherent video resizing with per-frame optimization

Yu-Shuen Wang; Jen-Hung Hsiao; Olga Sorkine-Hornung; Tong-Yee Lee

Conference:

SIGGRAPH 2011

Type(s):

Technical Papers

Title:

Scalable and coherent video resizing with per-frame optimization

Presenter(s)/Author(s):

Yu-Shuen Wang

Jen-Hung Hsiao

Olga Sorkine-Hornung

Tong-Yee Lee

Abstract:

The key to high-quality video resizing is preserving the shape and motion of visually salient objects while remaining temporally coherent. These spatial and temporal requirements are difficult to reconcile, typically leading existing video retargeting methods to sacrifice one of them and causing distortion or waving artifacts. Recent work enforces temporal coherence of content-aware video warping by solving a global optimization problem over the entire video cube. This significantly improves the results but does not scale well with the resolution and length of the input video and quickly becomes intractable. We propose a new method that solves the scalability problem without compromising the resizing quality. Our method factors the problem into spatial and time/motion components: we first resize each frame independently to preserve the shape of salient regions, and then we optimize their motion using a reduced model for each pathline of the optical flow. This factorization decomposes the optimization of the video cube into sets of sub-problems whose size is proportional to a single frame’s resolution and which can be solved in parallel. We also show how to incorporate cropping into our optimization, which is useful for scenes with numerous salient objects where warping alone would degenerate to linear scaling. Our results match the quality of state-of-the-art retargeting methods while dramatically reducing the computation time and memory consumption, making content-aware video resizing scalable and practical.
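The two-stage factorization described in the abstract can be illustrated with a toy sketch. This is not the authors' formulation: it assumes a one-dimensional warp (one width per pixel column), a hypothetical per-column saliency value, and zero optical flow, so that each pathline is simply the same column in every frame. The function names `resize_frame_columns`, `smooth_along_pathlines`, and `resize_video` are invented for illustration.

```python
from typing import List


def resize_frame_columns(saliency: List[float], target_width: float,
                         eps: float = 1e-6) -> List[float]:
    """Stage 1 (toy): resize one frame independently of all others.

    Minimizes sum_i s_i * (w_i - 1)^2 subject to sum_i w_i = target_width,
    where s_i is the (assumed) saliency of column i and w_i its new width.
    The closed-form solution is
        w_i = 1 - deficit * (1 / s_i) / sum_j (1 / s_j),
    so salient columns stay close to their original width of 1.
    Note: this toy form does not guard against negative widths.
    """
    inverse = [1.0 / (s + eps) for s in saliency]
    deficit = len(saliency) - target_width
    total = sum(inverse)
    return [1.0 - deficit * v / total for v in inverse]


def smooth_along_pathlines(per_frame: List[List[float]]) -> List[List[float]]:
    """Stage 2 (toy): enforce temporal coherence along pathlines.

    Assumes zero optical flow, so the pathline through column i is column i
    in every frame. Each width is replaced by its mean along that pathline.
    Every frame still sums to the target width, because a mean of vectors
    with the same sum has that sum too.
    """
    n_frames = len(per_frame)
    n_cols = len(per_frame[0])
    means = [sum(frame[i] for frame in per_frame) / n_frames
             for i in range(n_cols)]
    return [list(means) for _ in range(n_frames)]


def resize_video(saliency_per_frame: List[List[float]],
                 target_width: float) -> List[List[float]]:
    """Run stage 1 per frame (each call is independent), then stage 2."""
    per_frame = [resize_frame_columns(s, target_width)
                 for s in saliency_per_frame]
    return smooth_along_pathlines(per_frame)


if __name__ == "__main__":
    # Two 4-column frames whose saliency flickers slightly over time.
    video_saliency = [[0.1, 0.1, 1.0, 1.0],
                      [0.2, 0.1, 1.0, 0.9]]
    for widths in resize_video(video_saliency, target_width=3.0):
        print([round(w, 3) for w in widths])
```

In this sketch each stage-1 problem involves only one frame's columns, which loosely mirrors the abstract's claim that sub-problem size is proportional to a single frame's resolution and that the sub-problems can be solved in parallel. A real implementation would follow actual optical-flow pathlines rather than fixed columns.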

