A Novel Policy for Pre-trained Deep Reinforcement Learning for Speech Emotion Recognition
Paper/Presentation Title | A Novel Policy for Pre-trained Deep Reinforcement Learning for Speech Emotion Recognition |
---|---|
Presentation Type | Paper |
Authors | Rajapakshe, Thejan (Author), Rana, Rajib (Author), Khalifa, Sara (Author), Liu, Jiajun (Author) and Schuller, Bjorn W (Author) |
Editors | Abramson, David and Dinh, Minh Ngoc |
Journal or Proceedings Title | Proceedings of the 2022 Australasian Computer Science Week (ACSW 2022) |
Number of Pages | 10 |
Year | 2022 |
Place of Publication | United States |
ISBN | 9781450396066 |
Digital Object Identifier (DOI) | https://doi.org/10.1145/3511616.3513104 |
Web Address (URL) of Paper | https://dl.acm.org/doi/10.1145/3511616.3513104 |
Web Address (URL) of Conference Proceedings | https://dl.acm.org/doi/proceedings/10.1145/3511616 |
Conference/Event | 2022 Australasian Computer Science Week (ACSW 2022) |
Event Details | 2022 Australasian Computer Science Week (ACSW 2022). Event Date: 14 to 17 Feb 2022. Event Location: Brisbane, Australia |
Abstract | Deep Reinforcement Learning (deep RL) has achieved tremendous success in gaming, but it has rarely been explored for Speech Emotion Recognition (SER). In the RL literature, the policy used by the RL agent plays a major role in action selection; however, there is no RL policy tailored for SER. Also, an extended learning period is a general challenge for deep RL, which can impact the speed of learning for SER. In this paper, we introduce a novel policy, the 'Zeta policy', tailored for SER, and introduce pre-training in deep RL to achieve a faster learning rate. Pre-training with a cross dataset was also studied to assess the feasibility of pre-training the RL agent with a similar dataset in a scenario where real environmental data is not available. We use the 'IEMOCAP' and 'SAVEE' datasets for the evaluation, with the problem of recognising four emotions, namely happy, sad, angry, and neutral. The experimental results show that the proposed policy performs better than existing policies. Results also support that pre-training can reduce training time and is robust to a cross-corpus scenario. |
Keywords | Speech Emotion Recognition, Reinforcement Learning, Deep Learning, Deep Reinforcement Learning, Machine Learning |
Contains Sensitive Content | Does not contain sensitive content |
ANZSRC Field of Research 2020 | 460212. Speech recognition; 461105. Reinforcement learning; 461103. Deep learning |
Public Notes | Files associated with this item cannot be displayed due to copyright restrictions. |
Byline Affiliations | School of Mathematics, Physics and Computing; Commonwealth Scientific and Industrial Research Organisation (CSIRO), Australia; Imperial College London, United Kingdom |
Institution of Origin | University of Southern Queensland |
https://research.usq.edu.au/item/q7336/a-novel-policy-for-pre-trained-deep-reinforcement-learning-for-speech-emotion-recognition