Constraint-based Synthesis of Visual Speech

James D. Edge; Steve C. Maddock

Conference:

SIGGRAPH 2004

Type(s):

Talks (Sketches)

Title:

Constraint-based Synthesis of Visual Speech

Session/Category Title: Frowns, Smiles, Pouts

Presenter(s)/Author(s):

James D. Edge

Steve C. Maddock

Abstract:

This sketch concerns the animation of facial movement during speech production. In this work we consider speech gestures as trajectories through a space containing all visible vocal tract postures. Within this visible speech space, visual-phonemes (or visemes) are defined as collections of vocal tract postures which produce similar speech sounds (i.e. an individual phoneme in audible speech). This definition is distinct from many techniques in which the terms viseme and morph-target could be used interchangeably (e.g. [Cohen and Massaro 1993]). A speech trajectory will always interpolate the visemes corresponding to its phonetic structure (i.e. there is a direct mapping from audio → visual speech). However, as visemes are not individual targets, we must determine how the trajectory passes through each of the visemes according to both physical constraints and context; this is the notion of coarticulation [Lofqvist 1990].
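
The idea of a viseme as a set of acceptable postures, rather than a single target, can be illustrated with a toy example. The sketch below is not the authors' method: it assumes a single hypothetical posture parameter (say, lip opening in the range 0 to 1), models each viseme as an interval of allowed values, and uses projected gradient descent (a technique chosen here purely for illustration) to pick one value per viseme so that the trajectory is as smooth as possible while staying inside every interval. The names `solve_trajectory`, `smoothness_cost`, and the interval values are all assumptions.

```python
def clamp(value, low, high):
    """Project a value back into the closed interval [low, high]."""
    return max(low, min(high, value))


def smoothness_cost(xs):
    """Sum of squared differences between consecutive keyframes."""
    return sum((b - a) ** 2 for a, b in zip(xs, xs[1:]))


def solve_trajectory(regions, steps=500, rate=0.1):
    """Choose one posture value per viseme region.

    regions: list of (low, high) intervals, one per viseme in the utterance
             (hypothetical stand-ins for sets of vocal tract postures).
    Minimises smoothness_cost subject to low <= x_i <= high using
    projected gradient descent, starting from the interval midpoints.
    """
    xs = [(low + high) / 2.0 for low, high in regions]
    for _ in range(steps):
        grads = []
        for i, x in enumerate(xs):
            g = 0.0
            if i > 0:
                g += 2.0 * (x - xs[i - 1])  # pull towards previous keyframe
            if i < len(xs) - 1:
                g += 2.0 * (x - xs[i + 1])  # pull towards next keyframe
            grads.append(g)
        # Gradient step, then projection back into each viseme's interval.
        xs = [clamp(x - rate * g, low, high)
              for x, g, (low, high) in zip(xs, grads, regions)]
    return xs


# Four visemes, each an interval of acceptable lip-opening values (made up).
regions = [(0.0, 0.2), (0.6, 1.0), (0.1, 0.5), (0.7, 0.9)]
trajectory = solve_trajectory(regions)
print(trajectory)
```

In this toy setting, the value chosen inside each interval depends on the neighbouring visemes, which is a crude analogue of context-dependent coarticulation; a realistic system would work in a high-dimensional posture space with physically motivated constraints.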

References:

Cohen, M., and Massaro, D. 1993. Modeling coarticulation in synthetic visual speech. Computer Animation ’93, 131–156.
Fung, Y. C. 1993. Biomechanics – Mechanical Properties of Living Tissues, second ed. Springer-Verlag.
Lee, Y., Terzopoulos, D., and Waters, K. 1995. Realistic modeling for facial animation. Computer Graphics 29, Annual Conference Series, 55–62.
