Transfer learning for improving speech emotion classification accuracy
Paper/Presentation Title | Transfer learning for improving speech emotion classification accuracy |
---|---|
Presentation Type | Paper |
Authors | Latif, Siddique (Author), Rana, Rajib (Author), Younis, Shahzad (Author), Qadir, Junaid (Author) and Epps, Julien (Author) |
Journal or Proceedings Title | Proceedings of the 19th Annual Conference of the International Speech Communication Association (INTERSPEECH 2018) |
Number of Pages | 5 |
Year | 2018 |
Place of Publication | France |
Digital Object Identifier (DOI) | https://doi.org/10.21437/Interspeech.2018-1625 |
Web Address (URL) of Paper | https://www.isca-speech.org/archive/interspeech_2018/latif18b_interspeech.html |
Conference/Event | 19th Annual Conference of the International Speech Communication Association: Speech Research for Emerging Markets in Multilingual Societies (INTERSPEECH 2018) |
Event Details | Rank A |
Event Details | 19th Annual Conference of the International Speech Communication Association: Speech Research for Emerging Markets in Multilingual Societies (INTERSPEECH 2018). Event Date: 02 to 06 Sep 2018. Event Location: Hyderabad, India |
Abstract | The majority of existing speech emotion recognition research focuses on automatic emotion detection using training and testing data from the same corpus collected under the same conditions. The performance of such systems has been shown to drop significantly in cross-corpus and cross-language scenarios. To address the problem, this paper exploits a transfer learning technique, novel in cross-language and cross-corpus scenarios, to improve the performance of speech emotion recognition systems. Evaluations on five different corpora in three different languages show that Deep Belief Networks (DBNs) offer better accuracy than previous approaches on cross-corpus emotion recognition, relative to a Sparse Autoencoder and Support Vector Machine (SVM) baseline system. Results also suggest that training on a large number of languages and including a small fraction of the target data in training can significantly boost accuracy over the baseline, even for a corpus with limited training examples. |
Keywords | transfer learning, cross-corpus, deep belief networks, sparse autoencoder, support vector machine |
Contains Sensitive Content | Does not contain sensitive content |
ANZSRC Field of Research 2020 | 460212. Speech recognition |
Public Notes | Files associated with this item cannot be displayed due to copyright restrictions. |
Byline Affiliations | Information Technology University, Pakistan; Institute for Resilient Regions; National University of Sciences and Technology, Pakistan; University of New South Wales |
Institution of Origin | University of Southern Queensland |
https://research.usq.edu.au/item/q50z3/transfer-learning-for-improving-speech-emotion-classification-accuracy