Analysis and Multi-objective Protection of Public Medical Datasets from Privacy and Utility Perspectives
Article
Jahan, Samsad, Ge, Yong‑Feng, Kabir, Enamul and Wang, Kate. 2025. "Analysis and Multi-objective Protection of Public Medical Datasets from Privacy and Utility Perspectives." Data Science and Engineering. https://doi.org/10.1007/s41019-025-00283-0
Article Title | Analysis and Multi-objective Protection of Public Medical Datasets from Privacy and Utility Perspectives |
---|---|
ERA Journal ID | 212257 |
Article Category | Article |
Authors | Jahan, Samsad, Ge, Yong‑Feng, Kabir, Enamul and Wang, Kate |
Journal Title | Data Science and Engineering |
Number of Pages | 14 |
Year | 2025 |
Publisher | SpringerOpen |
Place of Publication | Germany |
ISSN | 2364-1185, 2364-1541 |
Digital Object Identifier (DOI) | https://doi.org/10.1007/s41019-025-00283-0 |
Web Address (URL) | https://link.springer.com/article/10.1007/s41019-025-00283-0 |
Abstract | In this era of big data, seamless distribution of healthcare information is crucial for improving patient care and advancing medical research, necessitating meticulous attention to preserving health data privacy. However, overly stringent protection measures can impede the efficient utilization of invaluable resources for medical research and personalized healthcare, posing a central challenge in balancing privacy protection with effective data utilization. This study explores various methods used to protect the privacy of patients’ health records and evaluates their advantages and limitations. Additionally, it conducts an in-depth analysis of a public medical dataset concerning privacy protection, assessing the effectiveness of k-anonymity and l-diversity privacy criteria and examining the influence of quasi-identifier (QID) attributes on privacy preservation. The study showcases techniques to achieve privacy standards, including generalization and suppression. Furthermore, it introduces a novel approach that utilizes the genetic algorithm (GA) and a non-dominated sorting technique to maximize both privacy and utility in health data through multi-objective optimization. After examining the results, this paper offers a guide for data owners on selecting attributes for medical data publication and choosing suitable privacy preservation strategies. Through the exploration of the GA and the non-dominated sorting approach, this paper suggests that the proposed GA can offer promising non-dominated solutions to the issue of health data privacy in the era of data-driven healthcare. A combination of these algorithms can enhance privacy protection and provide healthcare professionals and researchers with essential knowledge, ultimately benefiting patient care and ensuring a more secure database system. |
Keywords | Data privacy; k-anonymity; l-diversity; Genetic algorithm; Multi-objective optimization |
Contains Sensitive Content | Does not contain sensitive content |
ANZSRC Field of Research 2020 | 460599. Data management and data science not elsewhere classified |
Byline Affiliations | Victoria University; School of Mathematics, Physics and Computing; Royal Melbourne Institute of Technology (RMIT) |
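
The privacy criteria and the non-dominated selection named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the toy records, the column names (`age`, `zip`, `diagnosis`), the candidate (privacy, utility) scores, and the function names are all hypothetical, and the last function only extracts the first non-dominated front rather than performing a full non-dominated sort or running a genetic algorithm.

```python
from collections import Counter

def k_anonymity(records, qid_columns):
    """Size of the smallest group of records sharing the same QID values."""
    groups = Counter(tuple(r[c] for c in qid_columns) for r in records)
    return min(groups.values())

def l_diversity(records, qid_columns, sensitive):
    """Smallest number of distinct sensitive values within any QID group."""
    groups = {}
    for r in records:
        key = tuple(r[c] for c in qid_columns)
        groups.setdefault(key, set()).add(r[sensitive])
    return min(len(values) for values in groups.values())

def non_dominated(solutions):
    """Keep (privacy, utility) pairs that no other pair dominates.

    Both objectives are maximized: a dominates b when a is at least as
    good on both objectives and differs from b.
    """
    def dominates(a, b):
        return a[0] >= b[0] and a[1] >= b[1] and a != b
    return [s for s in solutions if not any(dominates(o, s) for o in solutions)]

# Hypothetical, already generalized records (age ranges, masked zip codes).
records = [
    {"age": "30-39", "zip": "43**", "diagnosis": "flu"},
    {"age": "30-39", "zip": "43**", "diagnosis": "asthma"},
    {"age": "40-49", "zip": "43**", "diagnosis": "flu"},
    {"age": "40-49", "zip": "43**", "diagnosis": "flu"},
]
k = k_anonymity(records, ["age", "zip"])
l = l_diversity(records, ["age", "zip"], "diagnosis")

# Hypothetical (privacy, utility) scores for five candidate anonymizations.
candidates = [(2, 0.9), (3, 0.7), (2, 0.6), (5, 0.4), (3, 0.5)]
front = non_dominated(candidates)
print(k, l, front)
```

In this toy data every QID group holds two records, so the table is 2-anonymous, but the second group contains a single diagnosis, so it is only 1-diverse. The surviving candidates are the trade-offs a data owner would choose among: none can be improved on privacy without losing utility, or the reverse.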
Permalink - https://research.usq.edu.au/item/zx13q/analysis-and-multi-objective-protection-of-public-medical-datasets-from-privacy-and-utility-perspectives