“Multi-Modal Translation System by Using Automatic Facial Image Tracking and Model-Based Lip Synchronization” by Ogata, Misawa, Murai, Nakamura and Morishima

  • ©Shin Ogata, Takafumi Misawa, Kazumasa Murai, Satoshi Nakamura, and Shigeo Morishima

Conference:


Type:

    Sketch

Interest Area:


    Application

Title:

    Multi-Modal Translation System by Using Automatic Facial Image Tracking and Model-Based Lip Synchronization

Session/Category Title:

    Human Capture


Presenter(s)/Author(s):

    Shin Ogata, Takafumi Misawa, Kazumasa Murai, Satoshi Nakamura, and Shigeo Morishima

Abstract:


    This sketch introduces a multi-modal English-to-Japanese and Japanese-to-English translation system that translates not only the speech itself but also the speaker's facial speech motion, by synchronizing the lip movements of a face model to the translated speech.
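
    The abstract implies a pipeline: recognize the source speech, translate it, synthesize target-language speech, and resynchronize the face model's lip motion to the synthesized audio. The Python sketch below illustrates one plausible decomposition under that reading; every function name, the PhonemeSegment structure, and the viseme table are hypothetical stand-ins, not the authors' implementation (the cited ATR-MATRIX [1] and CHATR [2] suggest what the real recognition/translation and synthesis components were).

    # Minimal sketch of the speech-to-speech translation pipeline described in
    # the abstract. All names and values here are illustrative assumptions.
    from dataclasses import dataclass

    @dataclass
    class PhonemeSegment:
        phoneme: str   # phoneme label in the target language
        start_ms: int  # onset within the synthesized audio
        end_ms: int    # offset within the synthesized audio

    # Hypothetical phoneme-to-viseme table: maps phonemes to mouth-shape
    # parameters (lip opening, lip width) of the 3D face model.
    VISEME_TABLE = {
        "a":   {"open": 0.9, "width": 0.5},
        "i":   {"open": 0.3, "width": 0.9},
        "u":   {"open": 0.4, "width": 0.2},
        "sil": {"open": 0.0, "width": 0.5},
    }

    def recognize(audio: bytes) -> str:
        # Stub for the speech recognition front end.
        return "konnichiwa"

    def translate(text: str, src: str, dst: str) -> str:
        # Stub for the language translation stage.
        return "hello"

    def synthesize(text: str) -> tuple[bytes, list[PhonemeSegment]]:
        # Stub for the speech synthesizer; besides the waveform it returns
        # phoneme timing, which is what drives the lip synchronization.
        timing = [
            PhonemeSegment("sil", 0, 50),
            PhonemeSegment("a", 50, 200),
            PhonemeSegment("u", 200, 350),
        ]
        return b"", timing

    def lip_keyframes(timing: list[PhonemeSegment]):
        # Turn phoneme timing into (time_ms, mouth-shape) keyframes for the
        # face model, so the lips move in step with the translated speech.
        return [(seg.start_ms, VISEME_TABLE.get(seg.phoneme, VISEME_TABLE["sil"]))
                for seg in timing]

    if __name__ == "__main__":
        text = recognize(b"")                      # 1. speech recognition
        translated = translate(text, "ja", "en")   # 2. translation
        _audio, timing = synthesize(translated)    # 3. speech synthesis
        print(lip_keyframes(timing))               # 4. model-based lip sync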

References:


    1. Takezawa, T., Morimoto, T., Sagisaka, Y., Campbell, N., Iida, H., Sugaya, F., Yokoo, A., & Yamamoto, S. (1998). A Japanese-to-English speech translation system: ATR-MATRIX. Proceedings of the International Conference on Spoken Language Processing (ICSLP), 957-960.
    2. Campbell, N. & Black, A.W. (1995). CHATR: a multi-lingual speech re-sequencing synthesis system. IEICE Technical Report, SP96-7, 45.
    3. Ogata, S., Nakamura, S., & Morishima, S. (2001). Multi-modal translation system: model-based lip synchronization with automatically translated synthetic voice. IPSJ Interaction 2001, 203.

