Land-based sources of dissolved inorganic nitrogen (DIN) impact the health and resilience of the Australian Great Barrier Reef, so quantifying those sources is important for prioritising investments aimed at improving reef health [Reference Baker3, Reference Furnas6, 7]. To increase certainty in the land use decisions informed by the estimates, ongoing improvement to the methods used to quantify DIN from ungauged catchments is needed [Reference Baird, Mongin, Skerratt, Margvelashvili, Tickell, Steven, Robillot, Ellis, Waters, Kaaniewska and Brodie2, Reference Creighton, Waterhouse, Day and Brodie4]. Catchment scale water quality models are the primary tools used to quantify the influence of landscapes on receiving waters and are effective for communicating the influences of landscape management [Reference Baker3, Reference Fu, Merritt, Croke, Weber and Jakeman5]. Development and calibration of those models rely on extensive observed water quality data; however, collecting the data is expensive and not possible in all areas [Reference McCloskey, Waters, Syme, MacDonald, Fulton and Piantadosi12]. While machine learning can offer new approaches, particularly for nonlinear relationships between water quality and its drivers, its application to DIN in ungauged areas has not previously been demonstrated [Reference Ighalo, Adeniyi and Marques9, Reference Tung and Yaseen18].
Research undertaken by O’Sullivan [Reference O’Sullivan14] developed new knowledge to overcome the data voids that afflict water quality modelling for simulating DIN from ungauged catchments. The research coupled catchment classification, a method demonstrated to overcome data voids for the linear relationship between flows and landscape features [Reference Hrachowitz, Savenije, Blöschl, McDonnell, Sivapalan, Pomeroy and Ehret8], with pattern matching to corroborate catchments that share nonlinear relationships between DIN patterns and spatial data [Reference Liu, Ryu, Webb, Lintern, Guo, Waters and Western11, Reference O’Sullivan, Ghahramani, Deo, Pembleton, Khan and Tuteja16]. The research, for the first time, used spatial datasets for original vegetation [Reference Neldner, Niehus, Wilson, McDonald, Ford and Accad13] as a proxy for the drivers of DIN. In particular, the research identified datasets, artificial neural networks and explainable artificial intelligence evaluation methods to expose nonlinear and inconsistent patterns in datasets. The research demonstrated that mapped original vegetation data represent the natural variability in biological response to the drivers of heterogeneity in DIN patterns across the landscape [Reference O’Sullivan, Deo and Ghahramani15, Reference O’Sullivan, Ghahramani, Deo and Pembleton17]. Explainable artificial intelligence approaches identified the original vegetation variables most influential in the classification results. This provided a method to categorise water quality patterns as they corroborate with the spatial data, that is, for vineforest, woodland or forest-dominated catchments [Reference O’Sullivan, Ghahramani, Deo and Pembleton17]. Application of this process knowledge of seasonal and flow drivers facilitated classification of ungauged catchments of the Great Barrier Reef using the spatial data as a proxy in the absence of observed DIN data.
Development of the classification methods and training data composition tailored to the nonlinear relationship of DIN to its drivers ultimately facilitated satisfactory simulation of DIN for a pseudo-ungauged catchment [Reference O’Sullivan, Deo and Ghahramani15]. The case study trial involved development of an ANN-WQ simulator trained using spatial data for the gauged catchments to predict DIN, and then tested in the unsupervised environment to predict DIN for a classified pseudo-ungauged catchment, using corresponding spatial data only. The research demonstrates that water quality simulation model performance improves where the model is designed to recognise the temporal scale relevant for the classified catchment ($p<0.05$). This finding is consistent with other research that found neural network performance improves where training data are refined [Reference Alshemali and Kalita1, Reference Kavzoglu10]. These findings demonstrate the importance of customised training methods to overcome nonlinearity and heterogeneity in dataset patterns to improve simulation capacity for DIN. Finally, the research identified catchments that lack spatial data similarity to gauged catchments and are likely unsuitable to classify with currently gauged catchments, so need prioritisation for future gauging and water quality monitoring programs. In summary, the research provides justification for classification of catchments based on likely drivers of DIN. The research also provides a method to identify where investments for additional observation data are necessary to further improve certainty in catchment scale water quality model simulations of DIN for all ungauged catchments that flow to the Great Barrier Reef.
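As an illustration only, the idea of matching an ungauged catchment to gauged catchments through spatial data, and flagging catchments with no sufficiently similar gauged counterpart, might be sketched as follows. This is a minimal sketch under stated assumptions, not the method of the research, which used artificial neural networks and explainable artificial intelligence: a simple nearest-neighbour comparison is swapped in, and the catchment names, vegetation fractions, `match_catchment` function and `max_distance` threshold are all hypothetical.

```python
from math import sqrt

# Hypothetical original-vegetation cover fractions (vineforest, woodland,
# forest) for gauged catchments. Illustrative values only, not research data.
GAUGED = {
    "gauged_A": (0.70, 0.10, 0.20),
    "gauged_B": (0.10, 0.75, 0.15),
    "gauged_C": (0.15, 0.15, 0.70),
}


def distance(a, b):
    """Euclidean distance between two vegetation-composition vectors."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def match_catchment(veg, gauged=GAUGED, max_distance=0.3):
    """Match an ungauged catchment to its most similar gauged catchment.

    Returns (name, distance). The name is None when even the nearest gauged
    catchment is further than max_distance (an assumed threshold), which
    marks the catchment as a candidate for future gauging.
    """
    name, ref = min(gauged.items(), key=lambda item: distance(veg, item[1]))
    d = distance(veg, ref)
    return (name if d <= max_distance else None, d)


if __name__ == "__main__":
    # A vineforest-dominated catchment close to gauged_A.
    print(match_catchment((0.65, 0.15, 0.20)))
    # A mixed catchment with no close gauged counterpart.
    print(match_catchment((0.34, 0.33, 0.33)))
```

In this sketch a catchment that returns `None` has no gauged analogue within the assumed threshold, which corresponds loosely to the catchments the research identified as needing prioritisation for future gauging and water quality monitoring.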
Acknowledgements
The author thanks her supervisors, Professor Ravinesh Deo, Professor Keith Pembleton and Dr Afshin Ghahramani, and acknowledges the collaboration between the University of Southern Queensland, Queensland Department of Environment and Science, Queensland Water Modelling Network and Bureau of Meteorology, and the Australian Government Research Training Program Scholarship, for facilitating this research.