Using machine learning based emulators for the sensitivity analysis of process-driven biophysical models
PhD Thesis
Title | Using machine learning based emulators for the sensitivity analysis of process-driven biophysical models |
---|---|
Type | PhD Thesis |
Authors | |
Author | Johnston, David B. |
Supervisor | |
1. First | Prof Keith Pembleton |
2. Second | Prof Ravinesh Deo |
3. Third | Neil I. Huth |
Institution of Origin | University of Southern Queensland |
Qualification Name | Doctor of Philosophy |
Number of Pages | 172 |
Year | 2022 |
Publisher | University of Southern Queensland |
Place of Publication | Australia |
Digital Object Identifier (DOI) | https://doi.org/10.26192/q7q78 |
Abstract | Sensitivity Analysis (SA) is a versatile and well-established tool used in the development and application of computer models. Although considered an integral part of the modelling process in multiple disciplines, its use in the development of process-driven biophysical models is relatively rare. One contributing reason for this lack of use is the computational burden associated with performing SA on complex models. Literature reports examples of the use of emulators, or metamodels, as an approach for reducing the computational burden of complex models, but there are no reports of using machine learning based emulators for undertaking SA of the underlying process-driven biophysical models. This doctoral thesis explores the potential of machine learning emulators (MLEs) in reducing the computational burden of performing SA on process-driven biophysical models. Firstly, a new method is developed that confirms that the variable importance indices of MLEs are comparable to the sensitivity indices produced by the commonly used Morris and Sobol methods. This provides the confidence upon which to proceed with investigating further the role that MLEs might play in reducing the computational burden of SA. Secondly, three different machine learning (ML) algorithms are used to generate MLEs of the APSIM-NextGen chickpea model to evaluate if some MLEs are better suited to the task of emulating process-driven biophysical crop models. The MLEs were assessed on accuracy of predicted values and the computational effort required to develop the MLEs themselves. The emulators based on random forest models were shown to produce the most accurate predictions, but also required the most computational effort to develop and train. Thirdly, two MLEs are used to undertake SA of all 22 input parameters of the MLEs, as well as a selected subset of six input parameters linked to the phenology of the crop. These analyses required more than 40 million simulations to be run. The MLEs were assessed based on their speed of execution, and on the Morris and Sobol indices produced. The impressive computational speed of the MLEs was quantified in comparison to the speed of the process-driven biophysical model. Some discrepancies were also noted between the results generated by the two types of MLE, so no firm conclusions could be made about the sensitivities of the underlying process-driven model. This work is at the juncture of the fields of process-driven biophysical model development, agronomy, plant physiology, machine learning emulators, and global sensitivity analysis. The outcomes of this work have implications for model development and model application in all these disciplines. Firstly, the Morris method remains a more computationally efficient choice, when compared with the development and use of MLEs, for the screening of importance of parameters of process-driven models. Secondly, the results show that, while both Morris and Sobol analyses produce very similar results across different MLEs, the discrepancies indicate that great caution is needed if interpreting these results as a way of understanding the underlying process-driven model and its input-output sensitivities. The results suggest that by using the computational efficiency of an MLE, SA of large-scale simulation experiments becomes more feasible, and this can contribute to efficiency gains for scientific research. The SA of enhanced forms of simulation experiments produced by hybrid models, which use the outputs of process-driven models and combine these with other sources of data to create new forms of ML based agroecological models, is suggested by this research as a direction that could be pursued to advance agroecological modelling. This work has demonstrated how applied research in these areas, when combined, can better serve the needs of researchers and modelling practitioners alike. |
Keywords | Machine learning, emulator, sensitivity analysis, process-driven models, APSIM |
ANZSRC Field of Research 2020 | 300207. Agricultural systems analysis and modelling; 300205. Agricultural production systems simulation; 460207. Modelling and simulation; 461104. Neural networks |
Public Notes | File reproduced in accordance with the copyright policy of the publisher/author. |
Byline Affiliations | School of Agriculture and Environmental Science |
https://research.usq.edu.au/item/q7q78/using-machine-learning-based-emulators-for-the-sensitivity-analysis-of-process-driven-biophysical-models