“Motion-based video retargeting with optimized crop-and-warp” by Wang, Lin, Sorkine-Hornung and Lee

  • © Yu-Shuen Wang, Hui-Chih Lin, Olga Sorkine-Hornung, and Tong-Yee Lee







    We introduce a video retargeting method that achieves high-quality resizing to arbitrary aspect ratios for complex videos containing diverse camera and dynamic motions. Previous content-aware retargeting methods mostly concentrated on spatial considerations, attempting to preserve the shape of salient objects in each frame by removing or distorting homogeneous background content. However, sacrificeable space is fundamentally limited in video, since object motion makes foreground and background regions correlated, causing waving and squeezing artifacts. We solve the retargeting problem by explicitly employing motion information and by distributing distortion in both spatial and temporal dimensions. We combine novel cropping and warping operators, where the cropping removes temporally recurring content and the warping utilizes available homogeneous regions to mask deformations while preserving motion. Variational optimization finds the best balance between the two operations, enabling retargeting of challenging videos with complex motions, numerous prominent objects and arbitrary depth variability. Our method compares favorably with state-of-the-art retargeting systems, as demonstrated in the examples and widely supported by the conducted user study.


    1. Avidan, S., and Shamir, A. 2007. Seam carving for content-aware image resizing. ACM Trans. Graph. 26, 3, 10.
    2. Barnes, C., Shechtman, E., Finkelstein, A., and Goldman, D. B. 2009. PatchMatch: A randomized correspondence algorithm for structural image editing. ACM Trans. Graph. 28, 3.
    3. Buatois, L., Caumon, G., and Lévy, B. 2009. Concurrent number cruncher: a GPU implementation of a general sparse linear solver. Int. J. Parallel Emerg. Distrib. Syst. 24, 3, 205–223.
    4. Chen, L. Q., Xie, X., Fan, X., Ma, W. Y., Zhang, H. J., and Zhou, H. Q. 2003. A visual attention model for adapting images on small displays. ACM Multimedia Systems Journal 9, 4, 353–364.
    5. Cho, T. S., Butman, M., Avidan, S., and Freeman, W. T. 2008. The patch transform and its applications to image editing. In CVPR ’08.
    6. David, H. A. 1963. The Method of Paired Comparisons. Charles Griffin & Company.
    7. Deselaers, T., Dreuw, P., and Ney, H. 2008. Pan, zoom, scan: Time-coherent, trained automatic video cropping. In CVPR.
    8. Dong, W., Zhou, N., Paul, J.-C., and Zhang, X. 2009. Optimized image resizing using seam carving and scaling. ACM Trans. Graph. 28, 5, 1–10.
    9. Fan, X., Xie, X., Zhou, H.-Q., and Ma, W.-Y. 2003. Looking into video frames on small displays. In Multimedia ’03, 247–250.
    10. Gal, R., Sorkine, O., and Cohen-Or, D. 2006. Feature-aware texturing. In EGSR ’06, 297–303.
    11. Itti, L., Koch, C., and Niebur, E. 1998. A model of saliency-based visual attention for rapid scene analysis. IEEE Trans. Pattern Anal. Mach. Intell. 20, 11, 1254–1259.
    12. Karni, Z., Freedman, D., and Gotsman, C. 2009. Energy-based image deformation. Comput. Graph. Forum 28, 5, 1257–1268.
    13. Krähenbühl, P., Lang, M., Hornung, A., and Gross, M. 2009. A system for retargeting of streaming video. ACM Trans. Graph. 28, 5.
    14. Liu, F., and Gleicher, M. 2006. Video retargeting: automating pan and scan. In Multimedia ’06, 241–250.
    15. Liu, H., Xie, X., Ma, W.-Y., and Zhang, H.-J. 2003. Automatic browsing of large pictures on mobile devices. In Proceedings of ACM International Conference on Multimedia, 148–155.
    16. Pritch, Y., Kav-Venaki, E., and Peleg, S. 2009. Shift-map image editing. In ICCV ’09.
    17. Rasheed, Z., and Shah, M. 2003. Scene detection in Hollywood movies and TV shows. In CVPR ’03, vol. 2, II–343–8.
    18. Rubinstein, M., Shamir, A., and Avidan, S. 2008. Improved seam carving for video retargeting. ACM Trans. Graph. 27, 3.
    19. Rubinstein, M., Shamir, A., and Avidan, S. 2009. Multioperator media retargeting. ACM Trans. Graph. 28, 3, 23.
    20. Santella, A., Agrawala, M., DeCarlo, D., Salesin, D., and Cohen, M. 2006. Gaze-based interaction for semiautomatic photo cropping. In Proceedings of CHI, 771–780.
    21. Shamir, A., and Sorkine, O. 2009. Visual media retargeting. In ACM SIGGRAPH Asia Courses.
    22. Simakov, D., Caspi, Y., Shechtman, E., and Irani, M. 2008. Summarizing visual data using bidirectional similarity. In CVPR ’08.
    23. Suh, B., Ling, H., Bederson, B. B., and Jacobs, D. W. 2003. Automatic thumbnail cropping and its effectiveness. In Proceedings of UIST, 95–104.
    24. Viola, P., and Jones, M. J. 2004. Robust real-time face detection. Int. J. Comput. Vision 57, 2, 137–154.
    25. Wang, Y.-S., Tai, C.-L., Sorkine, O., and Lee, T.-Y. 2008. Optimized scale-and-stretch for image resizing. ACM Trans. Graph. 27, 5, 118.
    26. Wang, Y.-S., Fu, H., Sorkine, O., Lee, T.-Y., and Seidel, H.-P. 2009. Motion-aware temporal coherence for video resizing. ACM Trans. Graph. 28, 5.
    27. Werlberger, M., Trobin, W., Pock, T., Wedel, A., Cremers, D., and Bischof, H. 2009. Anisotropic Huber-L1 optical flow. In Proceedings of the British Machine Vision Conference (BMVC).
    28. Wolf, L., Guttmann, M., and Cohen-Or, D. 2007. Nonhomogeneous content-driven video-retargeting. In ICCV ’07.
    29. Zhang, Y.-F., Hu, S.-M., and Martin, R. R. 2008. Shrinkability maps for content-aware video resizing. In PG ’08.
    30. Zhang, G.-X., Cheng, M.-M., Hu, S.-M., and Martin, R. R. 2009. A shape-preserving approach to image resizing. Computer Graphics Forum 28, 7, 1897–1906.
