“Exploring photobios” by Kemelmacher-Shlizerman, Shechtman, Garg and Seitz
Title:
- Exploring photobios
Presenter(s)/Author(s):
- Kemelmacher-Shlizerman
- Shechtman
- Garg
- Seitz
Abstract:
We present an approach for generating face animations from large image collections of the same person. Such collections, which we call photobios, sample the appearance of a person over changes in pose, facial expression, hairstyle, age, and other variations. By optimizing the order in which images are displayed and cross-dissolving between them, we control the motion through face space and create compelling animations (e.g., render a smooth transition from frowning to smiling). Used in this context, the cross dissolve produces a very strong motion effect; a key contribution of the paper is to explain this effect and analyze its operating range. The approach operates by creating a graph with faces as nodes, and similarities as edges, and solving for walks and shortest paths on this graph. The processing pipeline involves face detection, locating fiducials (eyes/nose/mouth), solving for pose, warping to frontal views, and image comparison based on Local Binary Patterns. We demonstrate results on a variety of datasets including time-lapse photography, personal photo collections, and images of celebrities downloaded from the Internet. Our approach is the basis for the Face Movies feature in Google’s Picasa.
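The graph formulation in the abstract (faces as nodes, similarities as edges, shortest paths between faces, cross-dissolves between consecutive images) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `build_knn_graph`, `shortest_path`, and `cross_dissolve`, the k-nearest-neighbor edge rule, and the toy distance matrix are all assumptions made for illustration. In the paper the distances would come from comparing Local Binary Pattern descriptors of aligned faces; here they are supplied directly.

```python
import heapq


def build_knn_graph(dist, k):
    """Connect each face to its k most similar faces (smallest distances).

    `dist` is a symmetric matrix of pairwise face distances (hypothetical input).
    """
    n = len(dist)
    graph = {i: {} for i in range(n)}
    for i in range(n):
        others = sorted((j for j in range(n) if j != i), key=lambda j: dist[i][j])
        for j in others[:k]:
            graph[i][j] = dist[i][j]
            graph[j][i] = dist[i][j]  # keep the graph undirected
    return graph


def shortest_path(graph, source, target):
    """Dijkstra's algorithm; returns node indices from source to target, or []."""
    best = {source: 0.0}
    prev = {}
    heap = [(0.0, source)]
    while heap:
        cost, node = heapq.heappop(heap)
        if node == target:
            break
        if cost > best.get(node, float("inf")):
            continue  # stale queue entry
        for nxt, weight in graph[node].items():
            new_cost = cost + weight
            if new_cost < best.get(nxt, float("inf")):
                best[nxt] = new_cost
                prev[nxt] = node
                heapq.heappush(heap, (new_cost, nxt))
    if target not in best:
        return []
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    return path[::-1]


def cross_dissolve(img_a, img_b, steps):
    """Linear blends between two equal-length pixel lists, endpoints included."""
    frames = []
    for s in range(steps + 1):
        t = s / steps
        frames.append([(1 - t) * a + t * b for a, b in zip(img_a, img_b)])
    return frames


# Toy example: four faces ordered from frowning (0) to smiling (3).
# Squared differences make several small steps cheaper than one large jump.
dist = [[(i - j) ** 2 for j in range(4)] for i in range(4)]
graph = build_knn_graph(dist, k=2)
path = shortest_path(graph, 0, 3)
frames = cross_dissolve([0.0, 10.0], [10.0, 0.0], steps=2)
```

In this toy setup the path from face 0 to face 3 passes through the intermediate faces rather than jumping directly, which is the behavior that yields a smooth transition; an animation would then cross-dissolve between each consecutive pair of images on the path.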
References:
1. Adelson, E. H., and Movshon, J. A. 1982. Phenomenal coherence of moving visual patterns. Nature 300, 5892, 523–525.
2. Ahonen, T., Hadid, A., and Pietikäinen, M. 2006. Face description with local binary patterns: Application to face recognition. In IEEE Trans. on Pattern Analysis and Machine Intelligence, vol. 28, 2037–2041.
3. Arikan, O., and Forsyth, D. A. 2002. Interactive motion generation from examples. ACM Trans. Graph. 21, 3, 483–490.
4. Bederson, B. B. 2001. Photomesa: a zoomable image browser using quantum treemaps and bubblemaps. In UIST, 71–80.
5. Beier, T., and Neely, S. 1992. Feature-based image metamorphosis. 35–42.
6. Berg, T. L., Berg, A. C., Edwards, J., Maire, M., White, R., Teh, Y.-W., Learned-Miller, E., and Forsyth, D. A. 2004. Names and faces in the news. In CVPR, 848–854.
7. Bitouk, D., Kumar, N., Dhillon, S., Belhumeur, P., and Nayar, S. K. 2008. Face swapping: automatically replacing faces in photographs. In SIGGRAPH, 1–8.
8. Blanz, V., and Vetter, T. 1999. A morphable model for the synthesis of 3d faces. In SIGGRAPH, 187–194.
9. Bourdev, L., and Brandt, J. 2005. Robust object detection via soft cascade. CVPR.
10. Bregler, C., Covell, M., and Slaney, M. 1997. Video rewrite: Driving visual speech with audio. In SIGGRAPH, 75–84.
11. Chen, S. E., and Williams, L. 1993. View interpolation for image synthesis. In SIGGRAPH, 279–288.
12. Dalal, N., and Triggs, B. 2005. Histograms of oriented gradients for human detection. In CVPR, 886–893.
13. Everingham, M., Sivic, J., and Zisserman, A. 2006. “Hello! My name is… Buffy” – automatic naming of characters in TV video. In Proceedings of the British Machine Vision Conference.
14. Freeman, W. T., Adelson, E. H., and Heeger, D. J. 1991. Motion without movement. Computer Graphics 25, 27–30.
15. Goldman, D. B., Gonterman, C., Curless, B., Salesin, D., and Seitz, S. M. 2008. Video object annotation, navigation, and composition. In UIST, 3–12.
16. Graham, A., Garcia-Molina, H., Paepcke, A., and Winograd, T. 2002. Time as essence for photo browsing through personal digital libraries. In JCDL, 326–335.
17. Huang, G. B., Ramesh, M., Berg, T., and Learned-Miller, E. 2007. Labeled faces in the wild: A database for studying face recognition in unconstrained environments. Tech. Rep. 07-49, University of Massachusetts, Amherst.
18. Huynh, D. F., Drucker, S. M., Baudisch, P., and Wong, C. 2005. Time quilt: scaling up zoomable photo browsers for large, unstructured photo collections. In CHI, 1937–1940.
19. Joshi, N., Szeliski, R., and Kriegman, D. J. 2008. Psf estimation using sharp edge prediction. In CVPR.
20. Katz, S., Tal, A., and Basri, R. 2007. Direct visibility of point sets. SIGGRAPH 26, 3.
21. Kemelmacher-Shlizerman, I., Sankar, A., Shechtman, E., and Seitz, S. M. 2010. Being John Malkovich. ECCV.
22. Kenkel, F. 1913. Untersuchungen über den zusammenhang zwischen erscheinungsgröße und erscheinungsbewegung bei einigen sogenannten optischen täuschungen. Z. Psychol. 67, 358–449.
23. Kovar, L., Gleicher, M., and Pighin, F. 2002. Motion graphs. In SIGGRAPH, 473–482.
24. Kumar, N., Belhumeur, P., and Nayar, S. 2008. Facetracer: A search engine for large collections of images with faces. In ECCV, 340–353.
25. Lasseter, J. 1987. Principles of traditional animation applied to 3D computer animation. In Proc. SIGGRAPH 87, 35–44.
26. Levoy, M., and Hanrahan, P. 1996. Light field rendering. In SIGGRAPH, 31–42.
27. Lu, Z.-L., and Sperling, G. 2002. Three systems theory of human visual motion perception. JOSA A 19, 2, 413–413.
28. Lucas, B., and Kanade, T. 1981. An iterative image registration technique with an application to stereo vision. Proceedings of Imaging Understanding Workshop, 121–130.
29. Marr, D., and Hildreth, E. 1980. Theory of edge detection. Proc. R. Soc. Lond. B 207, 187–217.
30. Nalwa, V. S., and Binford, T. O. 1986. On detecting edges. IEEE Trans. Pattern Anal. Mach. Intell. 8, 699–714.
31. Ojala, T., Pietikäinen, M., and Mäenpää, T. 2002. Multiresolution gray-scale and rotation invariant texture classification with local binary patterns. In IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 24, 971–987.
32. Pentland, A., Picard, R. W., and Sclaroff, S. 1996. Photobook: Content-based manipulation of image databases. International Journal of Computer Vision 18, 3, 233–254.
33. Picasa, 2010. http://googlephotos.blogspot.com/2010/08/picasa-38-face-movies-picnik.html.
34. Pighin, F., Hecker, J., Lischinski, D., Szeliski, R., and Salesin, D. H. 1998. Synthesizing realistic facial expressions from photographs. In SIGGRAPH, 75–84.
35. Rekimoto, J. 1999. Time-machine computing: a time-centric approach for the information environment. In UIST, 45–54.
36. Seitz, S. M., and Dyer, C. R. 1996. View morphing. In SIGGRAPH, 21–30.
37. Shashua, A. 1992. Geometry and Photometry in 3D Visual Recognition. PhD thesis, Massachusetts Institute of Technology, Cambridge, MA.
38. Snavely, N., Garg, R., Seitz, S. M., and Szeliski, R. 2008. Finding paths through the world’s photos. ACM Trans. Graph. 27, 3, 1–11.
39. Szeliski, R., and Shum, H.-Y. 1997. Creating full view panoramic image mosaics and environment maps. In Proc. SIGGRAPH 97, 251–258.
40. Wertheimer, M. 1912. Experimentelle studien über das sehen von bewegung. Z. Psychol. 61, 161–265.
41. Zhang, L., Snavely, N., Curless, B., and Seitz, S. M. 2004. Spacetime faces: high resolution capture for modeling and animation. In SIGGRAPH, 548–558.