“Content-preserving warps for 3D video stabilization” by Liu, Gleicher, Jin and Agarwala

  • ©Feng Liu, Michael Gleicher, Hailin Jin, and Aseem Agarwala







    We describe a technique that transforms a video from a hand-held video camera so that it appears as if it were taken with a directed camera motion. Our method adjusts the video to appear as if it were taken from nearby viewpoints, allowing 3D camera movements to be simulated. By aiming only for perceptual plausibility, rather than accurate reconstruction, we are able to develop algorithms that can effectively recreate dynamic scenes from a single source video. Our technique first recovers the original 3D camera motion and a sparse set of 3D, static scene points using an off-the-shelf structure-from-motion system. Then, a desired camera path is computed either automatically (e.g., by fitting a linear or quadratic path) or interactively. Finally, our technique performs a least-squares optimization that computes a spatially-varying warp from each input video frame into an output frame. The warp is computed to both follow the sparse displacements suggested by the recovered 3D structure, and avoid deforming the content in the video frame. Our experiments on stabilizing challenging videos of dynamic scenes demonstrate the effectiveness of our technique.
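The automatic path-fitting step mentioned in the abstract can be illustrated with a small sketch. It assumes the recovered camera positions are available as a list of (x, y, z) tuples with one timestamp per frame, and it fits an independent linear or quadratic polynomial to each coordinate by ordinary least squares. The names `fit_polynomial` and `smooth_camera_path` are hypothetical, camera orientation is ignored, and this is not the authors' implementation.

```python
from typing import List, Sequence, Tuple

Point3 = Tuple[float, float, float]


def _solve(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a small dense system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system: need more distinct time samples")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_polynomial(ts: Sequence[float], ys: Sequence[float], degree: int) -> List[float]:
    """Least-squares coefficients c[0] + c[1]*t + ... + c[degree]*t**degree (normal equations)."""
    k = degree + 1
    ata = [[sum(t ** (i + j) for t in ts) for j in range(k)] for i in range(k)]
    atb = [sum((t ** i) * y for t, y in zip(ts, ys)) for i in range(k)]
    return _solve(ata, atb)


def smooth_camera_path(
    ts: Sequence[float], positions: Sequence[Point3], degree: int = 2
) -> List[Tuple[float, ...]]:
    """Replace each camera position by the fitted path evaluated at the same timestamp."""
    # Assumption: each coordinate axis is fitted independently; rotation is not handled.
    coeffs = [fit_polynomial(ts, [p[axis] for p in positions], degree) for axis in range(3)]
    return [tuple(sum(c * t ** i for i, c in enumerate(cs)) for cs in coeffs) for t in ts]
```

Calling `smooth_camera_path(ts, positions, degree=1)` gives a straight-line path, and `degree=2` gives a quadratic one. In the full method described above, the difference between the original and the desired camera would then drive the per-frame warp, which this sketch does not attempt.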


