Unwrap mosaics: a new representation for video editing

Alex Rav-Acha; Pushmeet Kohli; Carsten Rother; Andrew Fitzgibbon

Conference:

SIGGRAPH 2008

Type(s):

Technical Papers

Title:

Unwrap mosaics: a new representation for video editing

Presenter(s)/Author(s):

Alex Rav-Acha

Pushmeet Kohli

Carsten Rother

Andrew Fitzgibbon

Abstract:

We introduce a new representation for video which facilitates a number of common editing tasks. The representation has some of the power of a full reconstruction of 3D surface models from video, but is designed to be easy to recover from a priori unseen and uncalibrated footage. By modelling the image-formation process as a 2D-to-2D transformation from an object’s texture map to the image, modulated by an object-space occlusion mask, we can recover a representation which we term the “unwrap mosaic”. Many editing operations can be performed on the unwrap mosaic, and then re-composited into the original sequence, for example resizing objects, repainting textures, copying/cutting/pasting objects, and attaching effects layers to deforming objects.
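As a rough illustration of the image-formation model the abstract describes, the toy sketch below renders one frame from a texture map, a 2D-to-2D mapping, and an occlusion mask defined over texture coordinates. It is not the authors' recovery algorithm: it shows only the forward direction, with nearest-pixel splatting, and the names `render_frame`, `warp`, and `visible` are hypothetical choices made for this example.

```python
from typing import Callable, List, Tuple

Color = Tuple[int, int, int]


def render_frame(
    texture: List[List[Color]],
    warp: Callable[[int, int], Tuple[float, float]],
    visible: List[List[bool]],
    height: int,
    width: int,
    background: Color = (0, 0, 0),
) -> List[List[Color]]:
    """Toy forward model: copy each visible texture point into the image.

    texture -- colors indexed as texture[v][u] (the mosaic, in this sketch)
    warp    -- assumed mapping from texture coords (u, v) to image coords (x, y)
    visible -- occlusion mask in texture space; False means hidden in this frame
    """
    image = [[background for _ in range(width)] for _ in range(height)]
    for v, row in enumerate(texture):
        for u, color in enumerate(row):
            if not visible[v][u]:
                continue  # occluded texture points contribute nothing
            x, y = warp(u, v)
            xi, yi = round(x), round(y)  # nearest-pixel splat, no filtering
            if 0 <= xi < width and 0 <= yi < height:
                image[yi][xi] = color
    return image


# A 2x2 texture, shifted by one pixel in x and y, with one occluded point.
texture = [[(255, 0, 0), (0, 255, 0)],
           [(0, 0, 255), (255, 255, 0)]]
visible = [[True, True],
           [True, False]]
shift = lambda u, v: (u + 1.0, v + 1.0)

frame = render_frame(texture, shift, visible, height=4, width=4)

# "Repainting" in this sketch means editing the texture and rendering again
# with the same mapping and mask.
edited = [row[:] for row in texture]
edited[0][0] = (255, 255, 255)
edited_frame = render_frame(edited, shift, visible, height=4, width=4)
```

In this sketch an edit made once in texture space carries into the frame through the same mapping and mask, which is the sense in which edits are re-composited into the original sequence. A real system would need per-frame mappings and proper resampling.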

ACM Digital Library Publication: