An accurate automated speaker counting architecture based on James Webb Pattern
Article
Article Title | An accurate automated speaker counting architecture based on James Webb Pattern |
---|---|
ERA Journal ID | 32032 |
Article Category | Article |
Authors | Barua, Prabal Datta; Yildiz, Arif Metehan; Canpolat, Nida; Keles, Tugce; Dogan, Sengul; Baygin, Mehmet; Tuncer, Ilknur; Tuncer, Turker; Tan, Ru-San; Fujita, Hamido; Acharya, U. Rajendra |
Journal Title | Engineering Applications of Artificial Intelligence |
Journal Citation | 119 |
Article Number | 105821 |
Number of Pages | 12 |
Year | 2023 |
Publisher | Elsevier |
Place of Publication | United Kingdom |
ISSN | 0952-1976; 1873-6769 |
Digital Object Identifier (DOI) | https://doi.org/10.1016/j.engappai.2023.105821 |
Web Address (URL) | https://www.sciencedirect.com/science/article/pii/S0952197623000052 |
Abstract | Speaker counting is an important research area in sound forensics. Few speaker counting papers exist in the literature because datasets are challenging to collect. This work aims to collect a new overlapping speech signal dataset for speaker counting and to propose a novel feature engineering model. The textural feature extraction is based on the iconic James Webb Space Telescope; hence, the pattern is named the James Webb Pattern (JWPat). A new speaker counting speech dataset comprising 3,121 speeches divided into 32 classes (the class number corresponds to the number of speakers) was collected. A new framework that mimics a deep learning model is proposed to classify the collected speech classes. The proposed feature engineering model is self-organized and uses various mother wavelet functions to generate features at both low and high levels. Using the proposed framework, eight classification results were calculated, with accuracy ranging from 75.94% to 86.74%; the best accuracy, 86.74%, was obtained with the symlet4 mother wavelet function. This spread of more than 10 percentage points demonstrates the effect of the mother wavelet function on classification performance. Moreover, the feature extraction capability of the pattern derived from the mirror of the James Webb telescope has been demonstrated. The 86.74% accuracy obtained on a large dataset indicates the success of the proposed model. |
Keywords | Iterative neighborhood component analysis; James Webb pattern; Sound forensics; Speaker counting; Unbalanced tree discrete wavelet transform |
Contains Sensitive Content | Does not contain sensitive content |
ANZSRC Field of Research 2020 | 400306. Computational physiology |
Public Notes | Files associated with this item cannot be displayed due to copyright restrictions. |
Funder | Firat Üniversitesi |
Byline Affiliations | Ngee Ann Polytechnic, Singapore; Singapore University of Social Sciences (SUSS), Singapore; Asia University, Taiwan; School of Business; University of Technology Sydney; Firat University, Turkey; Ardahan University, Turkiye; Government office in Elazig, Turkiye; National Heart Centre, Singapore; Duke-NUS Medical School, Singapore; HUTECH University of Technology, Vietnam; University of Granada, Spain; Iwate Prefectural University, Japan |
https://research.usq.edu.au/item/yyw60/an-accurate-automated-speaker-counting-architecture-based-on-james-webb-pattern