Survey of Deep Representation Learning for Speech Emotion Recognition
Article Title | Survey of Deep Representation Learning for Speech Emotion Recognition |
---|---|
ERA Journal ID | 200608 |
Article Category | Article |
Authors | Latif, Siddique (Author), Rana, Rajib (Author), Khalifa, Sara (Author), Jurdak, Raja (Author), Qadir, Junaid (Author) and Schuller, Bjorn (Author) |
Journal Title | IEEE Transactions on Affective Computing |
Journal Citation | 14 (2), pp. 1634-1654 |
Number of Pages | 21 |
Year | 2021 |
Publisher | IEEE (Institute of Electrical and Electronics Engineers) |
Place of Publication | United States |
ISSN | 1949-3045 |
Digital Object Identifier (DOI) | https://doi.org/10.1109/TAFFC.2021.3114365 |
Web Address (URL) | https://ieeexplore.ieee.org/document/9543566 |
Abstract | Traditionally, speech emotion recognition (SER) research has relied on manually handcrafted acoustic features using feature engineering. However, the design of handcrafted features for complex SER tasks requires significant manual effort, which impedes generalisability and slows the pace of innovation. This has motivated the adoption of representation learning techniques that can automatically learn an intermediate representation of the input signal without any manual feature engineering. Representation learning has led to improved SER performance and enabled rapid innovation. Its effectiveness has further increased with advances in deep learning (DL), which has facilitated deep representation learning where hierarchical representations are automatically learned in a data-driven manner. This paper presents the first comprehensive survey on the important topic of deep representation learning for SER. We highlight various techniques, related challenges and identify important future areas of research. Our survey bridges the gap in the literature since existing surveys either focus on SER with hand-engineered features or representation learning in the general setting without focusing on SER. |
Keywords | speech emotion recognition, multi task learning, representation learning, domain adaptation, unsupervised learning |
Related Output | |
Is part of | Deep Representation Learning for Speech Emotion Recognition |
ANZSRC Field of Research 2020 | 461101. Adversarial machine learning; 460212. Speech recognition; 461102. Context learning; 461105. Reinforcement learning; 461106. Semi- and unsupervised learning; 461104. Neural networks; 461103. Deep learning |
Public Notes | Files associated with this item cannot be displayed due to copyright restrictions. This article is part of a UniSQ Thesis by publication. See Related Output. |
Byline Affiliations | School of Sciences; Commonwealth Scientific and Industrial Research Organisation (CSIRO), Australia; Queensland University of Technology; Information Technology University, Pakistan; Imperial College London, United Kingdom |
Institution of Origin | University of Southern Queensland |
https://research.usq.edu.au/item/q6q9x/survey-of-deep-representation-learning-for-speech-emotion-recognition