An evaluation of the performance of a NoSQL document database in a simulation of a large scale Electronic Health Record (EHR) system
PhD Thesis
Title | An evaluation of the performance of a NoSQL document database in a simulation of a large scale Electronic Health Record (EHR) system |
---|---|
Type | PhD Thesis |
Authors | |
Author | Ercan, Mehmet Zahid |
Supervisor | Lane, Michael; Gururajan, Raj |
Institution of Origin | University of Southern Queensland |
Qualification Name | Doctor of Philosophy |
Number of Pages | 228 |
Year | 2017 |
Digital Object Identifier (DOI) | https://doi.org/10.26192/5c09b7aff0cc4 |
Abstract | Electronic Healthcare Record (EHR) systems can provide significant benefits by improving the effectiveness of healthcare systems. Research and industry projects focusing on storing healthcare information in NoSQL databases have been triggered by practical experience demonstrating that a relational database approach to managing healthcare records has become a bottleneck. Previous studies show that NoSQL databases based on the consistency, availability and partition tolerance (CAP) theorem have significant advantages over relational databases, such as easy and automatic scaling, better performance and high availability. However, there is limited empirical research that has evaluated the suitability of NoSQL databases for managing EHRs. This research addressed this identified research problem and gap in the literature by investigating the following general research question: How can a simulation of a large EHR system be developed so that the performance of NoSQL document databases relative to relational databases can be evaluated? Using a Design Science approach informed by a pragmatic worldview, a number of IT artefacts were developed to enable an evaluation of the performance of a NoSQL document-oriented database relative to a relational database in a simulation of a large scale EHR system. These were healthcare data models (NoSQL document database, relational database) for the Australian healthcare context, a random healthcare data generator and a prototype EHR system. The performance of a NoSQL document database (Couchbase) was evaluated relative to a relational database (MySQL) in terms of database operations (insert, update, delete of EHRs), scalability, EHR sharing and data analysis (complex querying) capabilities in a simulation of a large scale EHR system, constructed in the cloud environment of Amazon Web Services (AWS).
Test scenarios consisted of configurations of 1, 2, 4, 8 and 16 nodes with 1 million, 10 million, 100 million and 500 million records to simulate database operations in a large scale and distributed EHR system environment. The Couchbase NoSQL document database was found to perform significantly better than the MySQL relational database in most of the test cases in terms of database operations (insert, update, delete of EHRs), scalability and EHR sharing. However, the MySQL relational database was found to perform significantly better than the Couchbase NoSQL document database for the complex query test that demonstrates basic analysis capabilities. Furthermore, the Couchbase NoSQL document database used significantly more disk space than the MySQL relational database to store the same number of EHRs. This research made a number of important contributions to knowledge, theory and practice. The main theoretical contribution to design theory was the design and evaluation of a prototype EHR system for simulating database management operations in a large scale EHR system environment. The prototype EHR system was underpinned by the development of two data models, with data structures designed for a NoSQL document database and a relational database, and a random healthcare data generator, all of which were based on Australian healthcare data characteristics and statistics. The design of a data model for EHRs for a NoSQL document database using an aggregated document modelling approach provided an important contribution to data modelling theory for NoSQL document databases using de-normalisation and document aggregation. The design of a random healthcare data generator was another important contribution to design theory and was based on a data distribution algorithm (multinomial distribution and probability theory) informed by National Health Data |
Keywords | NoSQL databases, Electronic Health Record (EHR), healthcare systems, relational databases, distributed systems, ACID, CAP theorem, BASE |
ANZSRC Field of Research 2020 | 460699. Distributed computing and systems software not elsewhere classified; 460599. Data management and data science not elsewhere classified; 420399. Health services and systems not elsewhere classified |
Byline Affiliations | School of Management and Enterprise |
https://research.usq.edu.au/item/q4w42/an-evaluation-of-the-performance-of-a-nosql-document-database-in-a-simulation-of-a-large-scale-electronic-health-record-ehr-system