Hamlet-Pattern-Based Automated COVID-19 and Influenza Detection Model Using Protein Sequences
Article
Article Title | Hamlet-Pattern-Based Automated COVID-19 and Influenza Detection Model Using Protein Sequences |
---|---|
ERA Journal ID | 212275 |
Article Category | Article |
Authors | Erten, Mehmet, Acharya, Madhav R., Kamath, Aditya P., Sampathila, Niranjana, Bairy, G. Muralidhar, Aydemir, Emrah, Barua, Prabal Datta, Baygin, Mehmet, Tuncer, Ilknur, Dogan, Sengul and Tuncer, Turker |
Journal Title | Diagnostics |
Journal Citation | 12 (12) |
Article Number | 3181 |
Number of Pages | 15 |
Year | 2022 |
Publisher | MDPI AG |
Place of Publication | Switzerland |
ISSN | 2075-4418 |
Digital Object Identifier (DOI) | https://doi.org/10.3390/diagnostics12123181 |
Web Address (URL) | https://www.mdpi.com/2075-4418/12/12/3181 |
Abstract | SARS-CoV-2 and Influenza-A can present similar symptoms. Computer-aided diagnosis can help facilitate screening for the two conditions, and may be especially relevant and useful in the current COVID-19 pandemic because seasonal Influenza-A infection can still occur. We have developed a novel text-based classification model for discriminating between the two conditions using protein sequences of varying lengths. We downloaded viral protein sequences of SARS-CoV-2 and Influenza-A with varying lengths (all 100 or greater) from the NCBI database and randomly selected 16,901 SARS-CoV-2 and 19,523 Influenza-A sequences to form a two-class study dataset. We used a new feature extraction function based on a unique pattern, HamletPat, generated from the text of Shakespeare’s Hamlet, and a signum function to extract local binary pattern-like bits from overlapping fixed-length (27) blocks of the protein sequences. The bits were converted to decimal map signals from which histograms were extracted and concatenated to form a final feature vector of length 1280. The iterative Chi-square function selected the 340 most discriminative features to feed to an SVM with a Gaussian kernel for classification. The model attained 99.92% and 99.87% classification accuracy rates using hold-out (75:25 split ratio) and five-fold cross-validations, respectively. The excellent performance of the lightweight, handcrafted HamletPat-based classification model suggests that it can be a valuable tool for screening protein sequences to discriminate between SARS-CoV-2 and Influenza-A infections. |
Keywords | bioinformatics; Hamlet Pattern; protein sequence classification; SARS-CoV-2 |
Contains Sensitive Content | Does not contain sensitive content |
ANZSRC Field of Research 2020 | 3206. Medical biotechnology |
Byline Affiliations | Malatya Training and Research Hospital, Turkiye; Manipal Academy of Higher Education, India; Brown University, United States; Sakarya University, Turkey; School of Business; University of Technology Sydney; Ardahan University, Turkiye; Government office in Elazig, Turkiye; Firat University, Turkey |
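
The feature extraction step summarized in the abstract (overlapping blocks of length 27, a signum comparison that yields local-binary-pattern-like bits, decimal map values, and concatenated histograms) can be illustrated with a minimal sketch. This is not the authors' implementation. The pattern below is a hypothetical stand-in for HamletPat built from a short placeholder quotation, the use of character codes as the numeric signal is an assumption, and the choice of 40 bits per block (5 maps of 8 bits, giving 5 x 256 = 1280 bins) is only one arrangement consistent with the stated feature length.

```python
# Illustrative sketch only: the real HamletPat pattern is not reproduced here.
BLOCK = 27    # block length stated in the abstract
N_BITS = 40   # assumption: 40 bits per block -> 5 bytes -> 5 * 256 = 1280 bins

# Placeholder text used to derive a hypothetical comparison pattern.
PLACEHOLDER_TEXT = (
    "to be or not to be that is the question "
    "whether tis nobler in the mind to suffer"
)


def make_pattern(text, n_pairs=N_BITS, block=BLOCK):
    """Build hypothetical index pairs from consecutive letters of a text."""
    codes = [(ord(c) - ord("a")) % block for c in text.lower() if "a" <= c <= "z"]
    pairs = []
    for a, b in zip(codes, codes[1:]):
        if a != b:  # skip pairs that would compare a position with itself
            pairs.append((a, b))
        if len(pairs) == n_pairs:
            return pairs
    raise ValueError("text too short to build the pattern")


def signum_bit(x, y):
    """Return 1 when x - y is non-negative, otherwise 0."""
    return 1 if x - y >= 0 else 0


def extract_features(seq, pattern):
    """Slide over the sequence and return concatenated 256-bin histograms."""
    values = [ord(ch) for ch in seq]  # assumption: character codes as the signal
    n_maps = len(pattern) // 8
    hist = [0] * (256 * n_maps)
    for start in range(len(values) - BLOCK + 1):  # overlapping blocks, stride 1
        block = values[start:start + BLOCK]
        bits = [signum_bit(block[i], block[j]) for i, j in pattern]
        for m in range(n_maps):
            byte = 0
            for bit in bits[8 * m:8 * m + 8]:  # pack 8 bits into one map value
                byte = (byte << 1) | bit
            hist[256 * m + byte] += 1
    return hist


pattern = make_pattern(PLACEHOLDER_TEXT)
features = extract_features("MKTIIALSYIFCLVFADYKDDDDK" * 5, pattern)
print(len(features))
```

In a full pipeline, a vector like this would then pass through feature selection and a classifier. The abstract names an iterative Chi-square selector and an SVM with a Gaussian kernel; neither is shown here.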
https://research.usq.edu.au/item/yywq4/hamlet-pattern-based-automated-covid-19-and-influenza-detection-model-using-protein-sequences