A Comparison of LSTM and GRU for Bengali Speech-to-Text Transformation
Paper
Sathi, Nusrat, Shahid, Zakia, Chowdhury, Fahim, Ahmed, Sajjad, Parvez, Mohammad Zavid, Barua, Prabal Datta and Chakraborty, Subrata. 2023. "A Comparison of LSTM and GRU for Bengali Speech-to-Text Transformation." 2023 International Conference on Advances in Computing Research (ACR’23), Orlando, United States, 08-10 May 2023. Switzerland. https://doi.org/10.1007/978-3-031-33743-7_18
Paper/Presentation Title | A Comparison of LSTM and GRU for Bengali Speech-to-Text Transformation |
---|---|
Presentation Type | Paper |
Authors | Sathi, Nusrat, Shahid, Zakia, Chowdhury, Fahim, Ahmed, Sajjad, Parvez, Mohammad Zavid, Barua, Prabal Datta and Chakraborty, Subrata |
Journal or Proceedings Title | Proceedings of the 2023 International Conference on Advances in Computing Research (ACR’23) |
Journal Citation | 700, pp. 214-224 |
Number of Pages | 11 |
Year | 2023 |
Place of Publication | Switzerland |
ISBN | 9783031337420 |
Digital Object Identifier (DOI) | https://doi.org/10.1007/978-3-031-33743-7_18 |
Web Address (URL) of Paper | https://link.springer.com/chapter/10.1007/978-3-031-33743-7_18 |
Web Address (URL) of Conference Proceedings | https://link.springer.com/book/10.1007/978-3-031-33743-7 |
Conference/Event | 2023 International Conference on Advances in Computing Research (ACR’23) |
Event Details | Delivery: In person; Event Date: 08 to 10 May 2023; Event Location: Orlando, United States |
Abstract | This paper presents an approach to speech-to-text conversion in the Bengali language. Most methodologies in this area focus on languages other than Bengali. We began by preparing a novel dataset of 56 unique words recorded from 160 individual subjects. We then illustrate an approach to increasing speech-to-text accuracy for the Bengali language, starting with the Gated Recurrent Unit (GRU) and Long Short-Term Memory (LSTM) algorithms. On further observation, we found that the GRU failed to produce stable output, so we moved entirely to the LSTM algorithm, achieving 90% accuracy on an unexplored dataset. Voices from several demographic populations, with added noise, were used to validate the model. In the testing phase, we evaluated a variety of classes based on their length, complexity, noise, and gender variation. We expect this research to help in developing a real-time Bengali speech-to-text recognition model. |
Keywords | Long-Short Term Memory; Natural Language Processing; Gated Recurrent Unit; Voice Recognition; Speech to Text |
ANZSRC Field of Research 2020 | 400306. Computational physiology |
Public Notes | Files associated with this item cannot be displayed due to copyright restrictions. |
Series | Lecture Notes in Networks and Systems |
Byline Affiliations | BRAC University, Bangladesh; Torrens University; Australian Catholic University; Charles Sturt University; Asia Pacific International College (APIC), Australia; School of Business; University of New England; Cogninet Australia, Australia |
Permalink - https://research.usq.edu.au/item/z2766/a-comparison-of-lstm-and-gru-for-bengali-speech-to-text-transformation
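The abstract above describes training GRU and LSTM classifiers on a 56-word Bengali speech dataset. The following is a minimal sketch, not the authors' code, of such a comparison in Keras: it assumes MFCC-style frame features, and the feature dimensions, layer sizes, and training settings are illustrative assumptions rather than values from the paper.

```python
# Minimal sketch (assumed setup, not the paper's implementation) of comparing
# a GRU and an LSTM classifier for isolated Bengali word recognition.
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers, models

NUM_CLASSES = 56   # unique words reported in the abstract
TIME_STEPS = 100   # assumed number of feature frames per utterance
N_MFCC = 13        # assumed number of MFCC coefficients per frame

def build_model(rnn_cell: str) -> tf.keras.Model:
    """Build a small recurrent word classifier using either an LSTM or a GRU layer."""
    rnn_layer = layers.LSTM(128) if rnn_cell == "lstm" else layers.GRU(128)
    model = models.Sequential([
        layers.Input(shape=(TIME_STEPS, N_MFCC)),
        layers.Masking(mask_value=0.0),  # skip zero-padded frames, if any
        rnn_layer,
        layers.Dropout(0.3),
        layers.Dense(NUM_CLASSES, activation="softmax"),
    ])
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model

if __name__ == "__main__":
    # Random placeholder data standing in for the Bengali word recordings.
    x = np.random.randn(320, TIME_STEPS, N_MFCC).astype("float32")
    y = np.random.randint(0, NUM_CLASSES, size=320)

    for cell in ("gru", "lstm"):
        model = build_model(cell)
        history = model.fit(x, y, epochs=2, batch_size=32, verbose=0)
        print(cell.upper(), "final training accuracy:",
              history.history["accuracy"][-1])
```

With real features in place of the random placeholders, the same loop would train both recurrent variants under identical conditions, which is the kind of side-by-side comparison the abstract reports before the authors settled on the LSTM model.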