“Video Rewrite: driving visual speech with audio” by Bregler, Covell and Slaney

  • ©Christoph (Chris) Bregler, Michele Covell, and Malcolm Slaney




    Video Rewrite: driving visual speech with audio



    Video Rewrite uses existing footage to automatically create new video of a person mouthing words that she did not speak in the original footage. This technique is useful in movie dubbing, for example, where the movie sequence can be modified to sync the actors’ lip motions to the new soundtrack. Video Rewrite automatically labels the phonemes in the training data and in the new audio track, then reorders the mouth images in the training footage to match the phoneme sequence of the new audio track. When particular phonemes are unavailable in the training footage, it selects the closest approximations. The resulting sequence of mouth images is stitched into the background footage, and the stitching process automatically corrects for differences in head position and orientation between the mouth images and the background footage. Video Rewrite uses computer-vision techniques to track points on the speaker’s mouth in the training footage, and morphing techniques to combine these mouth gestures into the final video sequence. The new video combines the dynamics of the original actor’s articulations with the mannerisms and setting dictated by the background footage. Video Rewrite is the first facial-animation system to automate all the labeling and assembly tasks required to resync existing footage to a new soundtrack.
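
The reordering step described in the abstract, including the fallback to a close approximation when a phoneme is missing from the training footage, can be illustrated with a toy sketch. This is not the authors' implementation: the viseme groups, the clip library, and the names `VISEME_GROUPS`, `viseme_of`, and `select_clips` are all hypothetical, and "closest" is reduced here to "another phoneme with a visually similar mouth shape", which is an assumption made for illustration.

```python
# Toy illustration only. The viseme groups below are hypothetical and
# are NOT taken from the Video Rewrite paper.
VISEME_GROUPS = [
    {"p", "b", "m"},            # lips closed
    {"f", "v"},                 # lower lip against upper teeth
    {"t", "d", "s", "z", "n"},  # tongue near the front of the mouth
    {"aa", "ae", "ah"},         # open vowels
    {"uw", "ow", "w"},          # rounded lips
]


def viseme_of(phoneme):
    """Return the index of the group containing `phoneme`, or None."""
    for index, group in enumerate(VISEME_GROUPS):
        if phoneme in group:
            return index
    return None


def select_clips(target_phonemes, library):
    """Pick one training clip per phoneme of the new audio track.

    `library` maps a phoneme label to a list of clip identifiers found
    in the training footage. Returns a list of (clip_id, exact_match)
    pairs; clip_id is None when no usable clip exists.
    """
    selected = []
    for phoneme in target_phonemes:
        # Preferred case: the training footage contains this phoneme.
        if library.get(phoneme):
            selected.append((library[phoneme][0], True))
            continue
        # Fallback (assumed rule): borrow a clip from another phoneme
        # in the same viseme group, checked in alphabetical order.
        fallback = None
        group = viseme_of(phoneme)
        if group is not None:
            for candidate in sorted(VISEME_GROUPS[group]):
                if candidate != phoneme and library.get(candidate):
                    fallback = library[candidate][0]
                    break
        selected.append((fallback, False))
    return selected


# Hypothetical training library: no clip is labeled "b".
library = {
    "m": ["clip_m_01"],
    "aa": ["clip_aa_01", "clip_aa_02"],
    "t": ["clip_t_01"],
}
result = select_clips(["b", "aa", "t"], library)
print(result)
```

Here the missing "b" is covered by a clip labeled "m", because both sit in the same hypothetical lips-closed group. A real system would also need to weigh neighboring phonemes and timing, and would then stitch the selected mouth images into the background footage, which this sketch does not attempt.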

ACM Digital Library Publication:

Overview Page: