“TrackCam: 3D-aware tracking shots from consumer video” by Liu, Wang, Cho and Tan
Title:
- TrackCam: 3D-aware tracking shots from consumer video
Session/Category Title: Moving Pictures
Abstract:
Panning and tracking shots are popular photography techniques in which the camera follows a moving object and keeps it at the same position in the frame. The result is an image in which the moving foreground is sharp and the background is blurred accordingly, creating an artistic illustration of the foreground motion. Such shots, however, are hard to capture even for professionals, especially when the foreground motion is complex (e.g., non-linear motion trajectories). In this work we propose a system to generate realistic, 3D-aware tracking shots from consumer videos. We show how computer vision techniques such as segmentation and structure-from-motion can be used to lower the barrier and help novice users create high-quality tracking shots that are physically plausible. We also introduce a pseudo-3D approach for relative depth estimation that avoids expensive 3D reconstruction, improving robustness and widening the application range. We validate our system through extensive quantitative and qualitative evaluations.

