Speech Synthesis with Mixed Emotions
Article
Article Title | Speech Synthesis with Mixed Emotions |
---|---|
ERA Journal ID | 200608 |
Article Category | Article |
Authors | Zhou, Kun; Sisman, Berrak; Rana, R.; Schuller, Bjorn W.; Li, Haizhou |
Journal Title | IEEE Transactions on Affective Computing |
Journal Citation | 14 (4), pp. 3120-3134 |
Number of Pages | 15 |
Year | 2023 |
Publisher | IEEE (Institute of Electrical and Electronics Engineers) |
Place of Publication | United States |
ISSN | 1949-3045 |
Digital Object Identifier (DOI) | https://doi.org/10.1109/TAFFC.2022.3233324 |
Web Address (URL) | https://ieeexplore.ieee.org/document/10003644 |
Abstract | Emotional speech synthesis aims to synthesize human voices with various emotional effects. Current studies mostly focus on imitating an averaged style belonging to a specific emotion type. In this paper, we seek to generate speech with a mixture of emotions at run-time. We propose a novel formulation that measures the relative difference between the speech samples of different emotions. We then incorporate our formulation into a sequence-to-sequence emotional text-to-speech framework. During training, the framework not only explicitly characterizes emotion styles but also explores the ordinal nature of emotions by quantifying the differences with other emotions. At run-time, we control the model to produce the desired emotion mixture by manually defining an emotion attribute vector. Objective and subjective evaluations validate the effectiveness of the proposed framework. To the best of our knowledge, this research is the first study on modelling, synthesizing, and evaluating mixed emotions in speech. |
Keywords | Speech synthesis; Wheels; Hidden Markov models; Training; Psychology; Emotion recognition; Electronic mail |
ANZSRC Field of Research 2020 | 460299. Artificial intelligence not elsewhere classified |
Byline Affiliations | National University of Singapore; University of Texas at Dallas, United States; University of Southern Queensland; Imperial College London, United Kingdom |
https://research.usq.edu.au/item/qv4wy/speech-synthesis-with-mixed-emotions
Download files
Published Version
Speech_Synthesis_With_Mixed_Emotions.pdf
License: CC BY 4.0
File access level: Anyone