“Constraint-based Synthesis of Visual Speech”


    This sketch concerns the animation of facial movement during speech production. We treat speech gestures as trajectories through a space containing all visible vocal tract postures. Within this visible speech space, visual phonemes (or visemes) are defined as collections of vocal tract postures that produce similar speech sounds (i.e., an individual phoneme in audible speech). This definition is distinct from that of many techniques in which the terms viseme and morph target could be used interchangeably (e.g. [Cohen and Massaro 1993]). A speech trajectory always interpolates the visemes corresponding to its phonetic structure (i.e., there is a direct mapping from audio → visual speech). However, because visemes are not individual targets, we must determine how the trajectory passes through each viseme according to both physical constraints and context; this is the notion of coarticulation [Lofqvist 1990].
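
    The idea of a trajectory constrained to pass through viseme regions, rather than through fixed targets, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the sketch itself: the posture space is reduced to two hypothetical parameters, each viseme is modelled as an axis-aligned box, the physical constraint is approximated by penalising the squared distance between consecutive postures, and the solver is a simple clamped coordinate descent. The names fit_trajectory, OPEN, MID and CLOSED are likewise hypothetical.

```python
from typing import List, Tuple

# A viseme is modelled (as an assumption) as an axis-aligned box in posture
# space: (lower corner, upper corner). The two axes are hypothetical posture
# parameters, e.g. mouth opening and lip rounding.
Box = Tuple[List[float], List[float]]


def clamp(x: float, lo: float, hi: float) -> float:
    """Restrict x to the interval [lo, hi]."""
    return max(lo, min(hi, x))


def fit_trajectory(visemes: List[Box], sweeps: int = 200) -> List[List[float]]:
    """Pick one posture inside each viseme box so that consecutive postures
    are as close together as possible (sum of squared distances).

    Each update moves a posture to the mean of its neighbours and clamps it
    back into its own box, which is the exact minimiser for that posture
    while the others are held fixed.
    """
    # Start every posture at the centre of its box.
    pts = [[(l + h) / 2 for l, h in zip(lo, hi)] for lo, hi in visemes]
    n = len(pts)
    for _ in range(sweeps):
        for i, (lo, hi) in enumerate(visemes):
            nbrs = [pts[j] for j in (i - 1, i + 1) if 0 <= j < n]
            if not nbrs:
                continue  # a single viseme has no context to adapt to
            for d in range(len(lo)):
                target = sum(p[d] for p in nbrs) / len(nbrs)
                pts[i][d] = clamp(target, lo[d], hi[d])
    return pts


# Hypothetical viseme regions: two tight ones and one permissive one.
OPEN: Box = ([0.8, 0.0], [1.0, 0.2])
CLOSED: Box = ([0.0, 0.0], [0.1, 0.2])
MID: Box = ([0.2, 0.0], [0.8, 1.0])

# The same viseme (MID) placed in two different phonetic contexts.
open_ctx = fit_trajectory([OPEN, MID, OPEN])
closed_ctx = fit_trajectory([CLOSED, MID, CLOSED])

print("MID between OPEN visemes:  ", open_ctx[1])
print("MID between CLOSED visemes:", closed_ctx[1])
```

    In this toy setting the posture chosen for MID settles at the open edge of its region when its neighbours are OPEN and at the closed edge when they are CLOSED, which is the context dependence that coarticulation describes. A fuller treatment would need a continuous trajectory and a more realistic model of the physical constraints.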


