“Unwrap mosaics: a new representation for video editing” by Rav-Acha, Kohli, Rother and Fitzgibbon

  • ©Alex Rav-Acha, Pushmeet Kohli, Carsten Rother, and Andrew Fitzgibbon







    We introduce a new representation for video which facilitates a number of common editing tasks. The representation has some of the power of a full reconstruction of 3D surface models from video, but is designed to be easy to recover from a priori unseen and uncalibrated footage. By modelling the image-formation process as a 2D-to-2D transformation from an object’s texture map to the image, modulated by an object-space occlusion mask, we can recover a representation which we term the “unwrap mosaic”. Many editing operations can be performed on the unwrap mosaic, and then re-composited into the original sequence, for example resizing objects, repainting textures, copying/cutting/pasting objects, and attaching effects layers to deforming objects.
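    The image-formation model described above can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `render_frame`, the array layout, and the nearest-pixel splatting are all assumptions made for illustration. The sketch takes a texture map, a per-texel 2D-to-2D mapping into the frame, and an object-space visibility mask, and composites one frame. Editing then amounts to repainting the texture and rendering again with the same mapping and mask.

```python
import numpy as np

def render_frame(texture, warp, visible, frame_shape, background=0.0):
    """Composite one frame from a texture map (illustrative only).

    texture : (H, W) array, the unwrapped texture.
    warp    : (H, W, 2) array; warp[v, u] = (row, col) in the frame where
              texture point (v, u) lands (the 2D-to-2D mapping).
    visible : (H, W) boolean array, the object-space occlusion mask.
    """
    frame = np.full(frame_shape, background, dtype=float)
    # Nearest-pixel splatting: an assumed simplification, not the paper's method.
    rows = np.rint(warp[..., 0]).astype(int)
    cols = np.rint(warp[..., 1]).astype(int)
    inside = (
        (rows >= 0) & (rows < frame_shape[0])
        & (cols >= 0) & (cols < frame_shape[1])
    )
    keep = visible & inside  # skip texels that are occluded or off-frame
    frame[rows[keep], cols[keep]] = texture[keep]
    return frame

# Hypothetical 2x2 texture shifted by one pixel, with one occluded texel.
texture = np.array([[1.0, 2.0], [3.0, 4.0]])
vv, uu = np.meshgrid(np.arange(2), np.arange(2), indexing="ij")
warp = np.stack([vv + 1, uu + 1], axis=-1).astype(float)
visible = np.array([[True, True], [True, False]])

frame = render_frame(texture, warp, visible, (4, 4))

# "Repaint" one texel in texture space and re-composite with the same mapping.
edited = texture.copy()
edited[0, 0] = 9.0
edited_frame = render_frame(edited, warp, visible, (4, 4))
```

    Because the edit is made in texture space, the same mapping and mask carry it into the frame; the occluded texel stays hidden in both renders.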


