“Exploring photobios” by Kemelmacher-Shlizerman, Shechtman, Garg and Seitz

  • ©Ira Kemelmacher-Shlizerman, Eli Shechtman, Rahul Garg, and Steven Seitz




    Exploring photobios



    We present an approach for generating face animations from large image collections of the same person. Such collections, which we call photobios, sample the appearance of a person over changes in pose, facial expression, hairstyle, age, and other variations. By optimizing the order in which images are displayed and cross-dissolving between them, we control the motion through face space and create compelling animations (e.g., render a smooth transition from frowning to smiling). Used in this context, the cross-dissolve produces a very strong motion effect; a key contribution of the paper is to explain this effect and analyze its operating range. The approach operates by creating a graph with faces as nodes and similarities as edges, and solving for walks and shortest paths on this graph. The processing pipeline involves face detection, locating fiducials (eyes/nose/mouth), solving for pose, warping to frontal views, and image comparison based on Local Binary Patterns. We demonstrate results on a variety of datasets including time-lapse photography, personal photo collections, and images of celebrities downloaded from the Internet. Our approach is the basis for the Face Movies feature in Google’s Picasa.
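
The graph formulation in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the k-nearest-neighbor graph construction, the use of Dijkstra's algorithm, the toy dissimilarity matrix, and the function names `build_knn_graph`, `shortest_path`, and `cross_dissolve` are all assumptions made for illustration. In the paper's pipeline the dissimilarities would come from comparing aligned faces (for example with Local Binary Patterns); here they are simply given as numbers.

```python
import heapq


def build_knn_graph(dist, k):
    """Connect each face to its k most similar faces (hypothetical helper).

    dist is a square, symmetric list of lists of pairwise dissimilarities.
    Returns an undirected graph as {node: {neighbor: edge_weight}}.
    """
    n = len(dist)
    graph = {i: {} for i in range(n)}
    for i in range(n):
        others = [j for j in range(n) if j != i]
        nearest = sorted(others, key=lambda j: dist[i][j])[:k]
        for j in nearest:
            graph[i][j] = dist[i][j]
            graph[j][i] = dist[i][j]  # keep the graph undirected
    return graph


def shortest_path(graph, source, target):
    """Dijkstra's algorithm; returns the node sequence or None if unreachable."""
    best = {source: 0.0}
    prev = {}
    heap = [(0.0, source)]
    while heap:
        cost, node = heapq.heappop(heap)
        if node == target:
            break
        if cost > best.get(node, float("inf")):
            continue  # stale heap entry
        for nbr, weight in graph[node].items():
            new_cost = cost + weight
            if new_cost < best.get(nbr, float("inf")):
                best[nbr] = new_cost
                prev[nbr] = node
                heapq.heappush(heap, (new_cost, nbr))
    if target not in best:
        return None
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    return path[::-1]


def cross_dissolve(img_a, img_b, t):
    """Linear blend of two equally sized images (flat lists of pixel values)."""
    return [(1.0 - t) * a + t * b for a, b in zip(img_a, img_b)]


# Toy data: four faces whose dissimilarity grows with the squared index gap,
# so passing through intermediate faces is cheaper than jumping directly.
dist = [[(i - j) ** 2 for j in range(4)] for i in range(4)]
graph = build_knn_graph(dist, k=2)
order = shortest_path(graph, 0, 3)               # display order of the faces
halfway = cross_dissolve([0.0, 1.0], [1.0, 0.0], 0.5)  # one in-between frame
```

An animation would then be rendered by cross-dissolving between each consecutive pair of faces in `order`, sweeping `t` from 0 to 1. Penalizing large jumps (squared dissimilarity in this toy example) is one way to encourage paths made of many small, visually similar steps.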


