Bilinear Spatiotemporal Basis Models

A variety of dynamic objects, such as faces, bodies, and cloth, are represented in computer graphics as a collection of moving spatial landmarks. Spatiotemporal data is inherent in a number of graphics applications, including animation, simulation, and object and camera tracking. The principal modes of variation in the spatial geometry of objects are typically modeled using dimensionality reduction techniques, while trajectory representations such as splines and autoregressive models are widely used to exploit the temporal regularity of deformation. In this article, we present the bilinear spatiotemporal basis as a model that simultaneously exploits spatial and temporal regularity while maintaining the ability to generalize well to new sequences. This bilinear factorization into spatial and temporal bases allows the use of analytical, predefined functions to represent temporal variation (e.g., B-Splines or the Discrete Cosine Transform), resulting in efficient model representation and estimation. The model can be interpreted as representing the data as a linear combination of spatiotemporal sequences consisting of shape modes oscillating over time at key frequencies. We apply the bilinear model to natural spatiotemporal phenomena, including face, body, and cloth motion data, and compare it with existing models in terms of compaction, generalization ability, predictive precision, and efficiency. We demonstrate the application of the model to a number of graphics tasks, including labeling, gap-filling, denoising, and motion touch-up.
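The idea of shape modes oscillating over time at key frequencies can be illustrated with a small NumPy sketch. It assumes the model takes the form S ≈ Θ C Bᵀ, where S is a frames-by-coordinates data matrix, Θ is a predefined DCT trajectory basis, B is a shape basis estimated from the data, and C holds the bilinear coefficients. The function names, the synthetic data, and the use of an uncentered SVD for the shape basis are illustrative assumptions, not details stated in the abstract.

```python
import numpy as np

def dct_basis(num_frames, num_freqs):
    """Orthonormal DCT-II basis; each column is a cosine trajectory over time."""
    t = np.arange(num_frames)[:, None]   # frame index, shape (F, 1)
    k = np.arange(num_freqs)[None, :]    # frequency index, shape (1, K_t)
    theta = np.sqrt(2.0 / num_frames) * np.cos(
        np.pi * (2 * t + 1) * k / (2 * num_frames))
    theta[:, 0] /= np.sqrt(2.0)          # rescale the constant column to unit norm
    return theta                         # shape (F, K_t)

def shape_basis(S, num_modes):
    """Assumed shape basis: top right-singular vectors of S (no mean removal)."""
    _, _, vt = np.linalg.svd(S, full_matrices=False)
    return vt[:num_modes].T              # shape (3P, K_s)

def fit_bilinear(S, theta, B):
    """Coefficients for orthonormal bases: C = theta^T S B."""
    return theta.T @ S @ B

def reconstruct(C, theta, B):
    """Rebuild the sequence as theta C B^T."""
    return theta @ C @ B.T

# Synthetic sequence (hypothetical data): 60 frames, 10 landmarks in 3D,
# generated from 5 temporal frequencies and 3 shape modes.
rng = np.random.default_rng(0)
F, P = 60, 10
B_true, _ = np.linalg.qr(rng.standard_normal((3 * P, 3)))
C_true = rng.standard_normal((5, 3))
S = dct_basis(F, 5) @ C_true @ B_true.T  # shape (F, 3P)

theta = dct_basis(F, 8)                  # predefined temporal basis
B = shape_basis(S, 3)                    # data-driven spatial basis
C = fit_bilinear(S, theta, B)            # 8 x 3 coefficients
S_hat = reconstruct(C, theta, B)
max_error = float(np.max(np.abs(S - S_hat)))
```

In this sketch the 60 x 30 sequence is summarized by an 8 x 3 coefficient matrix plus the shape basis, since the temporal basis is analytical and needs no storage or training. On real data the reconstruction would be approximate rather than exact, and the number of frequencies and shape modes would need to be chosen for the data at hand.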

SIGGRAPH 2012: Technical Papers

“Bilinear Spatiotemporal Basis Models” by Akhter, Simon, Khan, Matthews and Sheikh
