Augmenting Generative Adversarial Networks for Speech Emotion Recognition
Paper
Paper/Presentation Title | Augmenting Generative Adversarial Networks for Speech Emotion Recognition |
---|---|
Presentation Type | Paper |
Authors | Latif, Siddique (Author), Asim, Muhammad (Author), Rana, Rajib (Author), Khalifa, Sara (Author), Jurdak, Raja (Author) and Schuller, Bjorn W. (Author) |
Journal or Proceedings Title | Proceedings of the 21st Annual Conference of the International Speech Communication Association (INTERSPEECH 2020) |
Journal Citation | 1, pp. 521-525 |
Article Number | 3194 |
Number of Pages | 5 |
Year | 2020 |
Place of Publication | France |
ISBN | 9781713820697 |
Digital Object Identifier (DOI) | https://doi.org/10.21437/Interspeech.2020-3194 |
Web Address (URL) of Paper | https://www.isca-speech.org/archive/interspeech_2020/latif20_interspeech.html |
Conference/Event | 21st Annual Conference of the International Speech Communication Association: Cognitive Intelligence for Speech Processing (INTERSPEECH 2020) |
Event Details | 21st Annual Conference of the International Speech Communication Association: Cognitive Intelligence for Speech Processing (INTERSPEECH 2020); Rank: A; Event Date: 25 to 29 Oct 2020; Event Location: Shanghai, China |
Abstract | Generative adversarial networks (GANs) have shown potential in learning emotional attributes and generating new data samples. However, their performance is usually hindered by the unavailability of larger speech emotion recognition (SER) datasets. In this work, we propose a framework that utilises the mixup data augmentation scheme to augment the GAN in feature learning and generation. To show the effectiveness of the proposed framework, we present results for SER on (i) synthetic feature vectors, (ii) augmentation of the training data with synthetic features, and (iii) encoded features in compressed representation. Our results show that the proposed framework can effectively learn compressed emotional representations and can generate synthetic samples that help improve performance in within-corpus and cross-corpus evaluation. |
Keywords | speech emotion recognition, mixup, data augmentation, generative adversarial networks, feature learning, synthetic feature generation |
Related Output | |
Is part of | Deep Representation Learning for Speech Emotion Recognition |
Contains Sensitive Content | Does not contain sensitive content |
ANZSRC Field of Research 2020 | 460212. Speech recognition; 460208. Natural language processing |
Public Notes | Files associated with this item cannot be displayed due to copyright restrictions. This article is part of a UniSQ Thesis by publication. See Related Output. |
Byline Affiliations | Institute for Resilient Regions; Information Technology University, Pakistan; University of New South Wales; Queensland University of Technology; Imperial College London, United Kingdom |
Institution of Origin | University of Southern Queensland |
https://research.usq.edu.au/item/q63y7/augmenting-generative-adversarial-networks-for-speech-emotion-recognition
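The abstract refers to the mixup data augmentation scheme. As a rough illustration only, the sketch below shows the generic mixup operation: two feature vectors and their one-hot labels are blended with a weight drawn from a Beta(alpha, alpha) distribution. This is not the authors' implementation, and the record does not say how mixup is wired into the GAN. The function names `mixup_pair` and `mixup_batch`, the default `alpha=0.2`, and the toy inputs are all assumptions made for the example.

```python
import random
from typing import List, Optional, Sequence, Tuple

Vector = List[float]


def mixup_pair(
    x_i: Sequence[float],
    y_i: Sequence[float],
    x_j: Sequence[float],
    y_j: Sequence[float],
    lam: float,
) -> Tuple[Vector, Vector]:
    """Blend two (feature, one-hot label) pairs with weight lam in [0, 1]."""
    if len(x_i) != len(x_j) or len(y_i) != len(y_j):
        raise ValueError("feature and label vectors must have matching lengths")
    x = [lam * a + (1.0 - lam) * b for a, b in zip(x_i, x_j)]
    y = [lam * a + (1.0 - lam) * b for a, b in zip(y_i, y_j)]
    return x, y


def mixup_batch(
    features: Sequence[Sequence[float]],
    labels: Sequence[Sequence[float]],
    alpha: float = 0.2,  # assumed value, not taken from the paper
    seed: Optional[int] = None,
) -> Tuple[List[Vector], List[Vector], float]:
    """Mix every sample with a randomly chosen partner from the same batch."""
    if len(features) != len(labels):
        raise ValueError("features and labels must have the same length")
    rng = random.Random(seed)
    # One mixing weight per batch, drawn from Beta(alpha, alpha).
    lam = rng.betavariate(alpha, alpha)
    # A random permutation assigns each sample its mixing partner.
    partners = list(range(len(features)))
    rng.shuffle(partners)
    mixed = [
        mixup_pair(features[i], labels[i], features[j], labels[j], lam)
        for i, j in enumerate(partners)
    ]
    return [m[0] for m in mixed], [m[1] for m in mixed], lam


if __name__ == "__main__":
    # Toy two-dimensional features with two emotion classes (made-up numbers).
    feats = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
    labs = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]
    mixed_x, mixed_y, lam = mixup_batch(feats, labs, alpha=0.2, seed=0)
    print("lambda:", lam)
    print("mixed features:", mixed_x)
    print("mixed labels:", mixed_y)
```

Because the labels are blended with the same weight as the features, each mixed label remains a valid probability vector that sums to one.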