Spacetime faces: high resolution capture for modeling and animation

Li Zhang; Noah Snavely; Brian Curless; Steven M. Seitz

Conference:

SIGGRAPH 2004

Type(s):

Technical Papers

Title:

Spacetime faces: high resolution capture for modeling and animation

Presenter(s)/Author(s):

Li Zhang

Noah Snavely

Brian Curless

Steven M. Seitz

Abstract:

We present an end-to-end system that goes from video sequences to high resolution, editable, dynamically controllable face models. The capture system employs synchronized video cameras and structured light projectors to record videos of a moving face from multiple viewpoints. A novel spacetime stereo algorithm is introduced to compute depth maps accurately and overcome over-fitting deficiencies in prior work. A new template fitting and tracking procedure fills in missing data and yields point correspondence across the entire sequence without using markers. We demonstrate a data-driven, interactive method for inverse kinematics that draws on the large set of fitted templates and allows for posing new expressions by dragging surface points directly. Finally, we describe new tools that model the dynamics in the input sequence to enable new animations, created via key-framing or texture-synthesis techniques.
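To illustrate the general idea behind spacetime stereo matching, the following minimal sketch scores each candidate disparity by a sum of squared differences over a window that spans several frames as well as several pixels. It is not the algorithm introduced in this paper, which the abstract describes only at a high level. It is a simplified window-based matcher that assumes rectified grayscale videos and constant disparity inside the window. The function name and the parameters `max_disp`, `half_xy`, and `half_t` are hypothetical choices made for this example.

```python
import numpy as np

def spacetime_ssd_disparity(left, right, y, x, t, max_disp, half_xy=2, half_t=1):
    """Return the disparity at (t, y, x) that minimizes SSD over a space-time window.

    left, right: rectified grayscale videos shaped (T, H, W).
    Assumes disparity is constant within the window (a simplification).
    """
    T, H, W = left.shape
    t0, t1 = t - half_t, t + half_t + 1
    y0, y1 = y - half_xy, y + half_xy + 1
    x0, x1 = x - half_xy, x + half_xy + 1
    if t0 < 0 or t1 > T or y0 < 0 or y1 > H or x0 < 0 or x1 > W:
        raise ValueError("window must lie inside the video volume")

    ref = left[t0:t1, y0:y1, x0:x1].astype(np.float64)
    best_d, best_cost = 0, np.inf
    for d in range(max_disp + 1):
        if x0 - d < 0:
            break  # candidate window would leave the right image
        cand = right[t0:t1, y0:y1, x0 - d:x1 - d].astype(np.float64)
        cost = np.sum((ref - cand) ** 2)
        if cost < best_cost:
            best_d, best_cost = d, cost
    return best_d

# Usage: a synthetic pair where the left video is the right video shifted by 3 pixels.
rng = np.random.default_rng(0)
right = rng.random((5, 16, 32))
left = np.roll(right, 3, axis=2)
print(spacetime_ssd_disparity(left, right, y=8, x=10, t=2, max_disp=6))  # prints 3
```

Setting `half_t=0` reduces the window to a single frame, which corresponds to ordinary spatial window matching. A larger `half_t` adds temporal samples to the comparison, which is the property that spacetime methods exploit when the projected structured light pattern changes over time.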
