“Multi-Modal Translation System by Using Automatic Facial Image Tracking and Model-Based Lip Synchronization” by Ogata, Misawa, Murai, Nakamura and Morishima

  • ©Shin Ogata, Takafumi Misawa, Kazumasa Murai, Satoshi Nakamura, and Shigeo Morishima

Conference:


Type:

    Sketch

Interest Area:


    Application

Title:

    Multi-Modal Translation System by Using Automatic Facial Image Tracking and Model-Based Lip Synchronization

Session/Category Title:

    Human Capture


Presenter(s)/Author(s):

    Shin Ogata, Takafumi Misawa, Kazumasa Murai, Satoshi Nakamura, and Shigeo Morishima

Abstract:


    This sketch introduces a multi-modal English-to-Japanese and Japanese-to-English translation system that translates not only the speech itself but also the speaker's facial speech motion, by synchronizing the lip movements of a face model to the translated speech.
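
    The abstract implies a pipeline: recognize the source speech, translate it, synthesize target-language speech, and resynchronize the face model's lip motion to the synthesized audio. The Python sketch below illustrates one plausible decomposition under that reading; every function name, the PhonemeSegment structure, and the viseme table are hypothetical stand-ins, not the authors' implementation (the cited ATR-MATRIX [1] and CHATR [2] suggest what the real recognition/translation and synthesis components were).

    # Minimal sketch of the speech-to-speech translation pipeline described in
    # the abstract. All names and values here are illustrative assumptions.
    from dataclasses import dataclass

    @dataclass
    class PhonemeSegment:
        phoneme: str   # phoneme label in the target language
        start_ms: int  # onset within the synthesized audio
        end_ms: int    # offset within the synthesized audio

    # Hypothetical phoneme-to-viseme table: maps phonemes to mouth-shape
    # parameters (lip opening, lip width) of the 3D face model.
    VISEME_TABLE = {
        "a":   {"open": 0.9, "width": 0.5},
        "i":   {"open": 0.3, "width": 0.9},
        "u":   {"open": 0.4, "width": 0.2},
        "sil": {"open": 0.0, "width": 0.5},
    }

    def recognize(audio: bytes) -> str:
        # Stub for the speech recognition front end.
        return "konnichiwa"

    def translate(text: str, src: str, dst: str) -> str:
        # Stub for the language translation stage.
        return "hello"

    def synthesize(text: str) -> tuple[bytes, list[PhonemeSegment]]:
        # Stub for the speech synthesizer; besides the waveform it returns
        # phoneme timing, which is what drives the lip synchronization.
        timing = [
            PhonemeSegment("sil", 0, 50),
            PhonemeSegment("a", 50, 200),
            PhonemeSegment("u", 200, 350),
        ]
        return b"", timing

    def lip_keyframes(timing: list[PhonemeSegment]):
        # Turn phoneme timing into (time_ms, mouth-shape) keyframes for the
        # face model, so the lips move in step with the translated speech.
        return [(seg.start_ms, VISEME_TABLE.get(seg.phoneme, VISEME_TABLE["sil"]))
                for seg in timing]

    if __name__ == "__main__":
        text = recognize(b"")                      # 1. speech recognition
        translated = translate(text, "ja", "en")   # 2. translation
        _audio, timing = synthesize(translated)    # 3. speech synthesis
        print(lip_keyframes(timing))               # 4. model-based lip sync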

References:


    1. Takezawa, T., Morimoto, T., Sagisaka, Y., Campbell, N., Iida, H., Sugaya, F., Yokoo, A., & Yamamoto, S. (1998). A Japanese-to-English speech translation system: ATR-MATRIX. Proceedings of the International Conference on Spoken Language Processing (ICSLP), 957-960.
    2. Campbell, N. & Black, A.W. (1995). CHATR: a multi-lingual speech re-sequencing synthesis system. IEICE Technical Report, SP96-7, 45.
    3. Ogata, S., Nakamura, S., & Morishima, S. (2001). Multi-modal translation system: model-based lip synchronization with automatically translated synthetic voice. IPSJ Interaction 2001, 203.

