“Deeply Emotional Talking Head: A Generative Adversarial Network Approach to Expressive Speech Synthesis with Emotion Control” by Reis, Costa and Martino

  • ©Filipe Antonio de Barros Reis, Paula Dornhofer Paro Costa, and José Mario de Martino


Entry Number: 33

Title:

    Deeply Emotional Talking Head: A Generative Adversarial Network Approach to Expressive Speech Synthesis with Emotion Control

Presenter(s)/Author(s):

    Filipe Antonio de Barros Reis, Paula Dornhofer Paro Costa, and José Mario de Martino

Abstract:


    Recent developments in natural language processing have enabled the widespread use of voice-based virtual assistants for a variety of tasks, ranging from personal assistants that help with simple tasks to customer-facing assistants that understand and resolve a user's issues with a given service. Although these assistants can complete their assigned tasks, their interaction with humans still lacks many of the resources that humans use to communicate effectively. Speech is naturally multimodal and includes a non-verbal component: the structures used to produce sounds also produce visible effects during speech, which listeners can interpret as additional information. In particular, the visual information carried by the face during speech is crucial for communicating emotion and improving the intelligibility of a message [Mattheyses and Verhelst 2015].

