Association between work-related features and coronary artery disease: a heterogeneous hybrid feature selection integrated with balancing approach
Article
Article Title | Association between work-related features and coronary artery disease: a heterogeneous hybrid feature selection integrated with balancing approach |
---|---|
ERA Journal ID | 18106 |
Article Category | Article |
Authors | Nasarian, Elham (Author), Abdar, Moloud (Author), Fahami, Mohammad Amin (Author), Alizadehsani, Roohallah (Author), Hussain, Sadiq (Author), Basiri, Mohammad Ehsan (Author), Zomorodi-Moghadam, Mariam (Author), Zhou, Xujuan (Author), Plawiak, Pawel (Author), Acharya, U. Rajendra (Author), Tan, Ru-San (Author) and Sarrafzadegan, Nizal (Author) |
Journal Title | Pattern Recognition Letters |
Journal Citation | 133, pp. 33-40 |
Number of Pages | 8 |
Year | 2020 |
Publisher | Elsevier |
Place of Publication | Netherlands |
ISSN | 0167-8655; 1872-7344 |
Digital Object Identifier (DOI) | https://doi.org/10.1016/j.patrec.2020.02.010 |
Web Address (URL) | https://www.sciencedirect.com/science/article/pii/S0167865520300507 |
Abstract | Coronary artery disease (CAD) is a leading cause of death worldwide and is associated with high healthcare expenditure. Researchers are motivated to apply machine learning (ML) for quick and accurate detection of CAD. The performance of automated systems depends on the quality of the features used. Clinical CAD datasets contain different features with varying degrees of association with CAD. To extract such features, we developed a novel hybrid feature selection algorithm called heterogeneous hybrid feature selection (2HFS). In this work, we used the Nasarian CAD dataset, in which workplace and environmental features are considered in addition to other clinical features. The synthetic minority over-sampling technique (SMOTE) and adaptive synthetic sampling (ADASYN) are used to handle the imbalance in the dataset. Decision tree (DT), Gaussian Naive Bayes (GNB), Random Forest (RF), and XGBoost classifiers are used, with the 2HFS-selected features as their input. Our results show that the proposed feature selection method yielded a classification accuracy of 81.23% with SMOTE and the XGBoost classifier. We also tested our approach on other well-known CAD datasets: the Hungarian dataset, the Long-beach-va dataset, and the Z-Alizadeh Sani dataset, obtaining 83.94%, 81.58%, and 92.58%, respectively. Hence, our experimental results confirm the effectiveness of our proposed feature selection algorithm, which yielded outstanding results compared with existing state-of-the-art techniques, for the development of automated CAD systems. |
Keywords | machine learning; data mining; heart disease; coronary artery disease; feature selection |
ANZSRC Field of Research 2020 | 469999. Other information and computing sciences not elsewhere classified |
Public Notes | Files associated with this item cannot be displayed due to copyright restrictions. |
Institution of Origin | University of Southern Queensland |
Byline Affiliations | Islamic Azad University, Iran; Deakin University; Isfahan University of Technology, Iran; Dibrugarh University, India; Shahrekord University, Iran; Ferdowsi University of Mashhad, Iran; School of Management and Enterprise; Cracow University of Technology, Poland; Ngee Ann Polytechnic, Singapore; Asia University, Taiwan; Singapore University of Social Sciences (SUSS), Singapore; National Heart Centre, Singapore; University of British Columbia, Canada |
https://research.usq.edu.au/item/q5qq3/association-between-work-related-features-and-coronary-artery-disease-a-heterogeneous-hybrid-feature-selection-integrated-with-balancing-approach