Nonparametric Bootstrap Likelihood Estimation to Investigate the Chance Set-up on Clustering Results
Article
Article Title | Nonparametric Bootstrap Likelihood Estimation to Investigate the Chance Set-up on Clustering Results |
---|---|
Article Category | Article |
Authors | Elnour, Ammar, Yang, Wencheng and Li, Yan |
Journal Title | IEEE Open Journal of the Computer Society |
Number of Pages | 12 |
Year | 2025 |
Publisher | IEEE (Institute of Electrical and Electronics Engineers) |
Place of Publication | United States |
ISSN | 2644-1268; 2644-125X |
Digital Object Identifier (DOI) | https://doi.org/10.1109/OJCS.2025.3545261 |
Web Address (URL) | https://ieeexplore.ieee.org/abstract/document/10902121 |
Abstract | Clustering algorithms are widely used in the knowledge discovery domain, but the validity of their results must be questioned and tested. The datasets commonly used for clustering tasks are often large and scale-free, making conventional statistical techniques inadequate for analyzing the uncertainty of the results. The same issue applies to most outcomes obtained from other knowledge discovery techniques, such as machine learning and statistical learning. Traditional statistical methods assume that the data follow standard distributions, whereas resampling and bootstrapping methods offer more accurate and reliable alternatives. This article introduces a method that employs bootstrap likelihood estimation to infer the uncertainty of generated clustering structures. We first calculate the clustering error on the original dataset and then use the proposed method to estimate its nonparametric bootstrapped likelihood. Comparing these two values establishes a nonparametric significance testing framework that directly determines the validity of the result. To evaluate the effectiveness of our method, we conducted experiments using synthetic and real datasets. The results demonstrate that our method can successfully validate clustering results. |
Keywords | bootstrap likelihood; statistical machine learning; Clustering; validity testing; randomness; significance testing |
Article Publishing Charge (APC) Funding | Researcher |
Contains Sensitive Content | Does not contain sensitive content |
ANZSRC Field of Research 2020 | 460502. Data mining and knowledge discovery |
Byline Affiliations | School of Mathematics, Physics and Computing |
https://research.usq.edu.au/item/zwx44/nonparametric-bootstrap-likelihood-estimation-to-investigate-the-chance-set-up-on-clustering-results
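The abstract describes computing a clustering error on the original data and then using bootstrap resampling to judge how uncertain that error is. The record does not give the authors' algorithm, so the following is only a minimal sketch of the general idea, not the published method: it assumes 1-D data, a basic Lloyd-style k-means with k = 2, mean squared error as the clustering error, and an approximate percentile interval in place of the paper's bootstrap likelihood. All function names (`kmeans_1d_error`, `bootstrap_errors`, `percentile_interval`) and the synthetic data are hypothetical.

```python
import random
from statistics import mean


def kmeans_1d_error(xs, k=2, iters=20):
    """Run a basic Lloyd-style k-means on 1-D data; return the mean squared error."""
    srt = sorted(xs)
    # Deterministic start: pick centers spread evenly across the sorted data.
    centers = [srt[(2 * j + 1) * len(srt) // (2 * k)] for j in range(k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for x in xs:
            nearest = min(range(k), key=lambda c: (x - centers[c]) ** 2)
            groups[nearest].append(x)
        # Keep the old center if a cluster ends up empty.
        centers = [mean(g) if g else centers[j] for j, g in enumerate(groups)]
    return mean(min((x - c) ** 2 for c in centers) for x in xs)


def bootstrap_errors(xs, n_boot=200, seed=0):
    """Recompute the clustering error on n_boot resamples drawn with replacement."""
    rng = random.Random(seed)
    n = len(xs)
    return [
        kmeans_1d_error([xs[rng.randrange(n)] for _ in range(n)])
        for _ in range(n_boot)
    ]


def percentile_interval(values, alpha=0.05):
    """Approximate (1 - alpha) percentile interval from sorted bootstrap values."""
    s = sorted(values)
    lo = s[int((alpha / 2) * (len(s) - 1))]
    hi = s[int((1 - alpha / 2) * (len(s) - 1))]
    return lo, hi


# Hypothetical synthetic data: two well-separated groups (illustration only).
data_rng = random.Random(1)
data = [data_rng.gauss(0, 1) for _ in range(30)] + [data_rng.gauss(10, 1) for _ in range(30)]

observed = kmeans_1d_error(data)
errs = bootstrap_errors(data)
lo, hi = percentile_interval(errs)
print(f"observed error: {observed:.3f}, bootstrap interval: ({lo:.3f}, {hi:.3f})")
```

The spread of `errs` indicates how much the clustering error varies under resampling of the same data. A significance test of the kind the abstract mentions would additionally need a reference distribution for the error under a "no cluster structure" hypothesis, which this sketch does not construct.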
Download files
Published Version
Nonparametric_Bootstrap_Likelihood_Estimation_to_Investigate_the_Chance_Set-Up_on_Clustering_Results.pdf
License: CC BY 4.0
File access level: Anyone