Presentation type:
ESSI – Earth & Space Science Informatics

EGU24-2658 | ECS | Orals | MAL17-ESSI | ESSI Division Outstanding Early Career Scientist Award Lecture

X-informatics at the center of scientific discovery: Detecting biosignatures, predicting mineral occurrences, and characterizing planetary kinds.  

Anirudh Prabhu, Shaunna Morrison, Robert Hazen, Michael L. Wong, Grethe Hystad, Henderson J. Cleaves II, Ahmed Eleish, George Cody, Vasundhara Gatne, Jose P Chavez, Xiaogang Ma, and Peter Fox and the Mineral Informatics Team

Data Science and Informatics methods have been at the center of many recent scientific discoveries and have opened up new frontiers in many areas of scientific inquiry. In this talk, I will take you through some of the most recent and exciting discoveries we've made and how informatics methods played a central role in them.

First, we will look at our work on data-driven biosignature detection, specifically how we combine pyrolysis-gas chromatography-mass spectrometry and machine learning to build an agnostic molecular biosignature detection model. 

Next, we will talk about how we used association analysis to predict the locations of as-yet-unknown mineral deposits on Earth and potentially Mars. These advances hold the potential to unlock new avenues of economic growth and sustainable development.
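As a hedged illustration of the association-analysis idea (the localities, minerals, and numbers below are invented for the example, not taken from the study), mineral co-occurrence rules can be mined from a locality–mineral presence table and scored by support and confidence; localities where a high-confidence rule's antecedent is present but its consequent has not yet been recorded become candidate sites:

```python
# Toy association analysis over a locality–mineral presence table.
# Minerals and localities are hypothetical.

localities = {
    "site1": {"quartz", "pyrite", "calcite"},
    "site2": {"quartz", "pyrite", "galena"},
    "site3": {"quartz", "calcite"},
    "site4": {"pyrite", "galena"},
}

def support(itemset, db):
    """Fraction of localities containing every mineral in `itemset`."""
    return sum(itemset <= minerals for minerals in db.values()) / len(db)

def confidence(antecedent, consequent, db):
    """Estimate of P(consequent present | antecedent present)."""
    return support(antecedent | consequent, db) / support(antecedent, db)

# Rule {quartz, pyrite} -> {galena}: of the two sites bearing both quartz
# and pyrite, one also hosts galena, so the rule's confidence is 0.5.
print(confidence({"quartz", "pyrite"}, {"galena"}, localities))
```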

Finally, we will set our sights on exoplanets—celestial bodies orbiting distant stars. The discovery of thousands of exoplanets in recent years has fueled the quest to understand their formation, composition, and potential habitability. We develop informatics approaches to better understand, classify and predict the occurrence of exoplanets by embracing the complexity and multidimensionality of exoplanets and their host stars.

How to cite: Prabhu, A., Morrison, S., Hazen, R., Wong, M. L., Hystad, G., Cleaves II, H. J., Eleish, A., Cody, G., Gatne, V., Chavez, J. P., Ma, X., and Fox, P. and the Mineral Informatics Team: X-informatics at the center of scientific discovery: Detecting biosignatures, predicting mineral occurrences, and characterizing planetary kinds, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-2658, 2024.

EGU24-4562 | Orals | MAL17-ESSI | Ian McHarg Medal Lecture

The central role of geoscience data standards in generating new knowledge 

Francois Robida

The earth sciences are first and foremost observational sciences, based on data collected on the planet over generations. It is from these data that interpretations, concepts and models are produced. Describing data and preserving it so that it can be reused and results reproduced has always been a concern for the scientists who produce it. This has been achieved by adopting common rules and standards, for example for indicating the geographical coordinates of an observation or the units of measurement used.

Today's scientific challenges, first and foremost the climate challenge, require the mobilisation of different scientific disciplines, often with different languages and practices.

The establishment of data infrastructures on an international scale means that researchers can use computer protocols to access considerable sources of data from their own and other disciplines. Digital tools such as AI make it possible to make machines 'reason' about data to produce new knowledge.

All these factors make it critical for both humans and machines to be able to 'understand' the data used. This understanding necessarily requires the adoption of common reference systems on an international scale and across all disciplines. These standards are based on a common 'vision' produced by the scientific community (and updated as knowledge evolves), resulting in vocabularies and ontologies shared by the community.

This presentation will look at the ecosystem for producing and maintaining standards for the geosciences and some of the issues involved in the relationship between scientists and the production and use of standards.

How to cite: Robida, F.: The central role of geoscience data standards in generating new knowledge, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-4562, 2024.

ESSI1 – Next-Generation Analytics for Scientific Discovery: Data Science, Machine Learning, AI

EGU24-1651 | ECS | Orals | ESSI1.1

AtmoRep: large scale representation learning for atmospheric dynamics 

Ilaria Luise, Christian Lessig, Martin Schultz, and Michael Langguth

The atmosphere affects humans in a multitude of ways, from loss of lives due to adverse weather to long-term social and economic impacts. Very recently, AI-based models have shown tremendous potential in reducing the computational costs of numerical weather prediction. However, they lack the versatility of conventional models. The team has therefore recently introduced AtmoRep, a first probabilistic foundation model of atmospheric dynamics for multi-purpose applications [Lessig 2023]. Through large-scale representation learning, AtmoRep encapsulates a general description of atmospheric dynamics based on the ERA5 reanalysis. Following the principles of in-context learning from natural language processing, adapted here to Earth system science, domain applications such as forecasting and downscaling can be performed without any task-specific training. The model has therefore been applied as the backbone for several tasks, from weather forecasting to downscaling, spatio-temporal interpolation and data-driven precipitation forecasting. After fine-tuning, AtmoRep achieves skill competitive with Pangu-Weather [Bi 2023] for short-term forecasting and substantially exceeds the AI-based competitor [Stengel 2021] for downscaling.


The model has been conceived as a flexible stack of Transformers, one per field, coupled through cross-attention to ensure a plug-and-play architecture and allow the dynamical integration of new fields without retraining from scratch. The main innovation is a newly developed statistical loss, which generalises the concept of cross-entropy from classification problems. The model is therefore fully probabilistic, and each application comes with a well-calibrated set of ensemble members whose spread correlates with the variability of the system, as demonstrated, e.g., in forecasting by inspecting the CRPS score or the error-to-spread ratios (see [Lessig 2023]).
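For readers unfamiliar with the calibration diagnostics mentioned above, the ensemble CRPS can be estimated empirically as E|X − y| − ½·E|X − X′| over the ensemble members. The sketch below uses synthetic numbers and is not AtmoRep code:

```python
# Empirical CRPS for one grid point of a probabilistic forecast.
# Ensemble values and the observation are synthetic.

def crps_ensemble(members, obs):
    """Empirical CRPS: E|X - y| - 0.5 * E|X - X'| over ensemble members."""
    n = len(members)
    term1 = sum(abs(x - obs) for x in members) / n
    term2 = sum(abs(a - b) for a in members for b in members) / (2 * n * n)
    return term1 - term2

members = [1.9, 2.1, 2.3, 2.0]   # synthetic ensemble for one grid point
obs = 2.2                        # verifying observation
print(round(crps_ensemble(members, obs), 4))
```

A lower CRPS indicates a sharper, better-calibrated ensemble; comparing the ensemble spread to the ensemble-mean error gives the error-to-spread ratio mentioned in the abstract.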


In addition, the flexible nature of the model allows fine-tuning on different data types. To demonstrate this, AtmoRep's precipitation forecasting skill has been fine-tuned on real radar data, using the Radklim dataset as a proxy for accurate total precipitation rates. With Radklim as ground truth, diagnostic scores such as the RMSE and the FBI (frequency bias indicator) clearly indicate that, after fine-tuning, the AtmoRep model outperforms ERA5 both in spatial coverage and in intensity.


In terms of future plans, we are currently working to extend the model to longer lead times, up to medium-range forecasting. Furthermore, we are integrating the downscaling and forecasting steps using the CERRA 5 km resolution reanalysis over Europe, so as to achieve multi-resolution coarse-to-fine predictions beyond quarter-degree resolution in the next few months.

AtmoRep represents a step forward in the direction of building solid and skilful multi-purpose approaches and the present work is, in our opinion, only a first step towards the possibilities that are enabled by the methodology.


[Lessig 2023] Lessig et al. AtmoRep: A stochastic model of atmosphere dynamics using large scale representation learning. arXiv:2308.13280, 2023.

[Bi 2023] K. Bi et al., “Accurate medium-range global weather forecasting with 3d neural networks,” Nature, 2023.

[Stengel 2021] K. Stengel et al., “Adversarial super-resolution of climatological wind and solar data,” Proceedings of the National Academy of Sciences, vol. 117, 2020.

How to cite: Luise, I., Lessig, C., Schultz, M., and Langguth, M.: AtmoRep: large scale representation learning for atmospheric dynamics, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-1651, 2024.

EGU24-1760 | ECS | Orals | ESSI1.1

EarthPT: a foundation model for Earth Observation 

Michael Smith, Luke Fleming, and James Geach

We introduce EarthPT -- an Earth Observation (EO) pretrained transformer. EarthPT is a 700 million parameter decoding transformer foundation model trained in an autoregressive self-supervised manner and developed specifically with EO use-cases in mind. We demonstrate that EarthPT is an effective forecaster that can accurately predict pixel-level surface reflectances across the 400-2300 nm range well into the future. For example, forecasts of the evolution of the Normalised Difference Vegetation Index (NDVI) have a typical error of approximately 0.05 (over the natural range of -1 to 1) at the pixel level over a five month test set horizon, out-performing simple phase-folded models based on historical averaging. We also demonstrate that embeddings learnt by EarthPT hold semantically meaningful information and could be exploited for downstream tasks such as highly granular, dynamic land use classification. Excitingly, we note that the abundance of EO data provides us with -- in theory -- quadrillions of training tokens. Therefore, if we assume that EarthPT follows neural scaling laws akin to those derived for Large Language Models (LLMs), there is currently no data-imposed limit to scaling EarthPT and other similar 'Large Observation Models'.
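As a minimal illustration of the NDVI evaluation (the reflectances and forecast values below are synthetic, not EarthPT outputs), NDVI is computed from near-infrared and red surface reflectance and forecast skill summarised as a pixel-level mean absolute error:

```python
# NDVI from red/NIR reflectance, and the mean absolute error of a
# forecast NDVI series against observations. Values are synthetic.

def ndvi(nir, red):
    """Normalised Difference Vegetation Index, in [-1, 1]."""
    return (nir - red) / (nir + red)

# "Observed" NDVI from synthetic reflectance pairs, and a synthetic forecast.
observed = [ndvi(0.40, 0.10), ndvi(0.42, 0.11), ndvi(0.35, 0.12)]
forecast = [0.62, 0.58, 0.45]

mae = sum(abs(f - o) for f, o in zip(forecast, observed)) / len(observed)
print(round(mae, 3))
```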

EarthPT is available under the MIT licence.

How to cite: Smith, M., Fleming, L., and Geach, J.: EarthPT: a foundation model for Earth Observation, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-1760, 2024.

EGU24-3202 | Orals | ESSI1.1 | Highlight

Foundation Models for Science: Potential, Challenges, and the Path Forward 

Manil Maskey, Rahul Ramachandran, Tsengdar Lee, Kevin Murphy, Sujit Roy, Muthukumaran Ramasubramanian, Iksha Gurung, and Raghu Ganti

Foundation models (FMs) signify a major shift in AI: large-scale machine learning models pre-trained on wide-ranging datasets. These models act as flexible starting points, ready to be fine-tuned for various specialized tasks. Distinct from traditional models designed for narrow objectives, foundation models apply their broad pre-training to learn patterns across data, enhancing their adaptability and efficiency in diverse domains. This approach minimizes the necessity for extensive, task-specific labeled datasets and prolonged training periods. A single foundation model can be tailored for many scientific applications, often outperforming traditional models on some tasks even when labeled data is scarce.


Addressing a broad array of complex scientific challenges using AI FMs requires interdisciplinary teams from various groups and organizations. No single research group or institution can independently muster the necessary resources or expertise to construct useful AI FMs. Thus, collaborative efforts are essential, combining diverse skills, resources, and viewpoints to create more comprehensive solutions. The right blend of domain-specific expertise and a broad understanding of various AI subfields is crucial to ensure the versatility and adaptability of foundation models. Moreover, for these models to be accepted and widely utilized within science, the community must develop a wide array of use cases, labeled datasets, and benchmarks to evaluate them effectively across different scenarios.


Building Foundation Models for science demands fostering collaboration among a diverse spectrum of research groups to ensure this broad range of perspectives. This strategy should include stakeholders like individual researchers, academic and government institutions, and tech companies. Embedding this collaboration within the principles of open science is therefore vital. Open science calls for transparent research, open sharing of findings, promoting reproducibility by making methodologies and data accessible, and providing tools researchers can freely use, modify, and distribute. Encouraging community collaboration in pre-training development leads to more robust and functional FMs. Guaranteeing open access to datasets, models, and fine-tuning code enables researchers to validate findings and build upon previous work, thus reducing redundancy in data collection and cultivating a culture of shared knowledge and progress.

How to cite: Maskey, M., Ramachandran, R., Lee, T., Murphy, K., Roy, S., Ramasubramanian, M., Gurung, I., and Ganti, R.: Foundation Models for Science: Potential, Challenges, and the Path Forward, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-3202, 2024.

EGU24-5769 | Orals | ESSI1.1

Exploring Transfer Learning Using Segment Anything Model in Optical Remote Sensing 

Mohanad Albughdadi, Vasileios Baousis, Tolga Kaprol, Armagan Karatosun, and Claudio Pisa

In the realm of remote sensing, where labeled datasets are scarce, leveraging pre-trained models via transfer learning offers a compelling solution. This study investigates the efficacy of the Segment Anything Model (SAM), a foundational computer vision model, in the domain of optical remote sensing tasks, specifically focusing on image classification and semantic segmentation.

The scarcity of labeled data in remote sensing poses a significant challenge for machine learning development. Transfer learning, a technique utilizing pre-trained models like SAM, circumvents this challenge by leveraging existing data from related domains. SAM, developed and trained by Meta AI, serves as a foundational model for prompt-based image segmentation. It was trained on over 1 billion masks from 11 million images, giving it robust zero-shot and few-shot capabilities. SAM's architecture comprises an image encoder, a prompt encoder, and a mask decoder, all geared towards swift and accurate segmentation for various prompts, ensuring real-time interactivity and handling of ambiguity.

Two distinct use cases leveraging SAM-based models in the domain of optical remote sensing are presented, representing two critical tasks: image classification and semantic segmentation. Through comprehensive analysis and comparative assessments, various model architectures are examined, including linear and convolutional classifiers, SAM-based adaptations, and UNet for semantic segmentation. Experiments contrast model performance across different dataset splits and varying training data sizes. The SAM-based models place a linear, a convolutional, or a ViT-decoder classifier on top of the SAM encoder.
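The lightweight-head-on-frozen-encoder setup can be sketched as follows. The "embeddings" below are random stand-ins for SAM image-encoder features, and the head is a plain logistic-regression probe rather than the exact classifiers used in the study:

```python
# Hedged sketch of transfer learning with a frozen encoder: only a small
# linear head is trained on top of (here synthetic) encoder embeddings.

import numpy as np

rng = np.random.default_rng(0)

# Synthetic "frozen embeddings": two classes, 64-dimensional features.
X0 = rng.normal(loc=-1.0, scale=1.0, size=(50, 64))
X1 = rng.normal(loc=+1.0, scale=1.0, size=(50, 64))
X = np.vstack([X0, X1])
y = np.array([0] * 50 + [1] * 50)

# Linear head trained with a few steps of logistic-regression gradient descent;
# the encoder producing X would stay frozen throughout.
w, b = np.zeros(64), 0.0
for _ in range(200):
    p = 1.0 / (1.0 + np.exp(-(X @ w + b)))   # sigmoid
    w -= 0.1 * (X.T @ (p - y) / len(y))
    b -= 0.1 * float(np.mean(p - y))

acc = float(np.mean(((X @ w + b) > 0) == (y == 1)))
print(acc)
```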

Use Case 1: Image Classification with EuroSAT Dataset

The EuroSAT dataset, comprising 27,000 labeled image patches from Sentinel-2 satellite images across ten distinct land cover classes, serves as the testing ground for image classification tasks. SAM-ViT models consistently demonstrate high accuracy, ranging between 89% and 93% on various sizes of training datasets. These models outperform baseline approaches, exhibiting resilience even with limited training data. This use case highlights SAM-ViT's effectiveness in accurately categorizing land cover classes despite data limitations.

Use Case 2: Semantic Segmentation with Road Dataset

In the semantic segmentation domain, the study focuses on the Road dataset, evaluating SAM-based models, particularly SAM-CONV, against the benchmark UNet model. SAM-CONV showcases remarkable superiority, achieving F1-scores and Dice coefficients exceeding 0.84 and 0.82, respectively. Its exceptional performance in pixel-level labeling emphasizes its robustness in delineating roads from surrounding environments, surpassing established benchmarks and demonstrating its applicability in fine-grained analysis.
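The pixel-level metrics reported above can be computed on toy masks as below; note that for a single binary mask the Dice coefficient coincides with the F1-score of the positive class, so reported values typically differ only when metrics are averaged over images or classes:

```python
# F1 and Dice on tiny synthetic binary masks (1 = road, 0 = background).

import numpy as np

pred = np.array([[1, 1, 0],
                 [0, 1, 0],
                 [0, 0, 0]])
truth = np.array([[1, 0, 0],
                  [0, 1, 1],
                  [0, 0, 0]])

tp = int(np.sum((pred == 1) & (truth == 1)))  # true positives
fp = int(np.sum((pred == 1) & (truth == 0)))  # false positives
fn = int(np.sum((pred == 0) & (truth == 1)))  # false negatives

precision = tp / (tp + fp)
recall = tp / (tp + fn)
f1 = 2 * precision * recall / (precision + recall)
dice = 2 * tp / (2 * tp + fp + fn)
print(round(f1, 3), round(dice, 3))
```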

In conclusion, SAM-driven transfer learning methods hold promise for robust remote sensing analysis. SAM-ViT excels in image classification, while SAM-CONV demonstrates superiority in semantic segmentation, paving the way for their practical use in real-world remote sensing applications despite limited labeled data availability.

How to cite: Albughdadi, M., Baousis, V., Kaprol, T., Karatosun, A., and Pisa, C.: Exploring Transfer Learning Using Segment Anything Model in Optical Remote Sensing, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-5769, 2024.

EGU24-6107 | ESSI1.1

A multi-modal high spatial resolution aerial imagery scene classification model with visual enhancement 

L. He, Y. Lin, and Y. Song

Remote sensing image scene classification annotates semantic categories for image areas covering multiple land-cover types, reflecting the spatial aggregation of social resources among ground objects; it is one of the more challenging remote sensing interpretation tasks in terms of image understanding. Extracting scene-level semantic information with deep neural networks is currently an active research direction: compared with other algorithms, deep neural networks better capture the semantic information in images and achieve higher classification accuracy in applications such as urban planning. In recent years, image–text multi-modal models have achieved satisfactory performance on downstream tasks. Introducing "multi-modality" into remote sensing research should not be limited to the use of multi-source data; more important are the encoding of diverse data and the deep features extracted from very large datasets. In this paper, building on an image–text matching model, we therefore establish a multi-modal scene classification model (Fig. 1) for high spatial resolution aerial images, in which image features dominate and text facilitates their representation. The algorithm first applies self-supervised learning to the visual model, aligning the feature domain learned from natural images with that of our dataset, which improves feature extraction on aerial survey images. The features produced by the pre-trained image and text encoders are then further aligned, with some image-encoder parameters updated iteratively during training. A classifier at the end of the model performs the scene classification task.

Through experiments we found that, compared with single visual models, our algorithm significantly improves scene classification on aerial survey images. The model obtained precision and recall above 90% on the test split of the 27-category high spatial resolution aerial survey image dataset we built (Fig. 2).

Fig 1. Diagram of the proposed model structure. Blue boxes are associated with the image, green boxes with the text, and red boxes with both image and text.

Fig 2. Samples in our high spatial resolution aerial survey images dataset.

How to cite: He, L., Lin, Y., and Song, Y.: A multi-modal high spatial resolution aerial imagery scene classification model with visual enhancement, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-6107, 2024.

EGU24-10914 | Posters virtual | ESSI1.1

Efficient adaptation of Foundation Models for Visual Grounding Remote Sensing task 

Ali J. Ghandour, Hasan Moughnieh, Mohammad Hasan Zahweh, Hasan Nasrallah, Mustafa Shukor, Cristiano Nattero, and Paolo Campanella

Foundation models have demonstrated impressive proficiency across multiple domains, including language, vision, and multi-modal applications, establishing new standards for efficiency and adaptability. The core strength of localization-based foundational models is their ability to precisely recognize and locate objects across diverse object categories in wide-area scenes. This precision is particularly vital in the Remote Sensing (RS) field. The multimodal aspect of these models becomes pivotal in RS, as they can process and interpret complex data, allowing for more comprehensive aerial and satellite image analysis.

Multimodality has emerged as a crucial and dynamic area in recent AI developments, finding diverse applications such as image captioning and visual question answering. More related to traditional visual tasks, Visual Grounding (VG) stands out, involving the localization of objects based on textual descriptions. Unlike conventional approaches that train models on predefined and fixed lists of objects, VG allows a model to locate any entity in an image based on diverse textual descriptions, enabling open-vocabulary predictions. Despite notable efforts in developing powerful VG models to solve general benchmarks, there is a need for more exploration into transferring these models to the remote sensing context.

This paper addresses this gap by delving into the task of visual grounding for remote sensing. Our initial exploration reveals that utilizing general pretrained foundational models for RS yields suboptimal performance. After recognizing these limitations, our work systematically investigates various parameter-efficient tuning techniques to fine-tune these models for RS visual grounding applications. The insights and methodologies presented in this paper provide valuable guidance for researchers seeking to adapt pretrained models to the RS domain efficiently. This adaptation marks a substantial advancement in the field, offering a significant stride toward enhancing the applicability of visual grounding in remote sensing scenarios.

How to cite: Ghandour, A. J., Moughnieh, H., Zahweh, M. H., Nasrallah, H., Shukor, M., Nattero, C., and Campanella, P.: Efficient adaptation of Foundation Models for Visual Grounding Remote Sensing task, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-10914, 2024.

EGU24-11145 | ECS | Posters on site | ESSI1.1

Preliminary analysis of the potentialities of the Segment Anything Model (SAM) in the segmentation of Sentinel-2 imagery for water reservoir monitoring 

Filippo Bocchino, Germana Sergi, Roberta Ravanelli, and Mattia Crespi

Water reservoirs play a crucial role in the supply of freshwater, agricultural irrigation, hydroelectric power generation, and various industrial applications. However, their existence is increasingly threatened by water stress, due to growing water demand, water pollution, and impacts of climate change, including intensified and prolonged droughts. To address this challenge, a sustainable management of water resources is essential, relying on continuous and accurate monitoring of water reservoirs. Modern Earth Observation technologies offer an effective, frequent, and cost-efficient means for monitoring water basins. 

This study focuses on evaluating the potential of the Segment Anything Model (SAM) network (Kirillov et al., 2023), released by Meta AI in April 2023, for segmenting water reservoirs through the processing of satellite images. SAM aims to serve as a foundational segmentation model capable of generalising its segmentation abilities in a zero-shot manner across diverse tasks. Unlike traditional supervised learning, zero-shot learning enables a model to recognize objects or features it has never seen during training. Notably, SAM's application to satellite imagery, a type of image for which it was not specifically trained, poses a unique challenge. 

In this work, SAM was applied to Sentinel-2 multispectral imagery using a "prompt click" approach, where a water-class pixel was pre-selected for each input image. Google Earth Engine facilitated temporal aggregation of Sentinel-2 images over the period of interest (from 01/01/2019 to 31/12/2019), creating four RGB median images, one for each three-month period. SAM was applied independently to each of these four sub-periods. 

Validation was carried out in the Genoa port area to minimise the influence of temporal water level variations, which in turn produce water area changes. Indeed, the use of a port area made it possible to consider a single reference mask for the different sub-periods analysed, greatly simplifying the validation procedure. 

The validation phase revealed SAM’s superior performance in coastlines with regular shapes and undisturbed water (Fig. 1 and Tab. 1), while port areas, characterised by irregular shapes, higher activity and turbidity, yielded less satisfactory results (Fig. 2 and Tab. 2). 

In conclusion, this study highlighted SAM's limitations, primarily related to the specific nature of satellite images, vastly different from the training data. These limitations stem from SAM's training on three-band (RGB), 8-bit images: the former prevents use of all 13 bands of the Sentinel-2 multispectral images, while the latter requires reducing the radiometric resolution of the Sentinel-2 data from 16 to 8 bit; both result in information loss. Despite these limitations, SAM demonstrated effective segmentation capabilities, especially in simpler and less disturbed coastal areas, comparable to water segmentation algorithms based on spectral indices. Future improvements could come from fine-tuning on satellite images and from applying SAM to higher-resolution ones.
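The radiometric reduction described above can be sketched as follows. The percentile stretch is one common way to quantise 16-bit Sentinel-2 bands to the 8-bit, 3-band input SAM expects, not necessarily the exact procedure used in this study:

```python
# Rescale a synthetic 16-bit Sentinel-2 RGB composite to 8 bit for SAM.
# A 2-98 percentile stretch is an illustrative choice; information is
# necessarily lost in the quantisation.

import numpy as np

def to_uint8(band16, lo_pct=2, hi_pct=98):
    """Stretch a 16-bit band between percentiles and quantise to 8 bit."""
    lo, hi = np.percentile(band16, [lo_pct, hi_pct])
    scaled = np.clip((band16 - lo) / max(hi - lo, 1), 0, 1)
    return (scaled * 255).astype(np.uint8)

rng = np.random.default_rng(1)
rgb16 = rng.integers(0, 10000, size=(64, 64, 3), dtype=np.uint16)  # synthetic tile
rgb8 = np.dstack([to_uint8(rgb16[..., i]) for i in range(3)])
print(rgb8.dtype, rgb8.shape)
```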

Kirillov, A., Mintun, E., Ravi, N., Mao, H., Rolland, C., Gustafson, L., Xiao, T., Whitehead, S., Berg, A. C., Lo, W.-Y., et al., 2023. Segment Anything. arXiv preprint arXiv:2304.02643.

How to cite: Bocchino, F., Sergi, G., Ravanelli, R., and Crespi, M.: Preliminary analysis of the potentialities of the Segment Anything Model (SAM) in the segmentation of Sentinel-2 imagery for water reservoir monitoring, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-11145, 2024.

EGU24-11514 | ECS | Orals | ESSI1.1

Towards Foundation Models for Earth Observation; Benchmarking Datasets and Performance on Diverse Downstream Tasks 

Anna Jungbluth, Matt Allen, Francisco Dorr, Joseph Gallego-Mejia, Laura Martínez-Ferrer, Freddie Kalaitzis, and Raúl Ramos-Pollán

Satellite-based Earth Observation (EO) is crucial for monitoring land changes and natural hazards on a global scale. In addition to optical imagery, synthetic aperture radar (SAR) technology has proven indispensable, since radar pulses can penetrate clouds and detect millimeter changes on the ground surface. While SAR polarimetry data is easily available (e.g. via Google Earth Engine), interferometric products are harder to obtain due to complex pre-processing requirements. 

In general, using the information contained in EO data (both optical and SAR) for specific downstream tasks often requires specialized analysis pipelines that are not easily accessible to the scientific community. In the context of applying machine learning to EO, self-supervised learning (SSL) - machine learning models that learn features in data without being provided with explicit labels - offer great potential to fully leverage the wealth and complexity of the available data.

In this work, we apply self-supervised learning techniques to create pre-trained models that can leverage the features learned from unlabelled EO data for a variety of downstream tasks. More specifically, we pre-train our models on optical imagery (Sentinel-2) or SAR data (Sentinel-1), and fine-tune our models to predict local events (e.g. fires, floods) and annual land characteristics (e.g. vegetation percentage, land cover, biomass). We compare a number of state-of-the-art SSL techniques (MAE [1], DINO [2], VICReg [3], CLIP [4]) that have shown great performance on standard image or text based tasks. By adapting these models to our use case, we demonstrate the potential of SSL for EO, and show that self-supervised pre-training strongly reduces the requirement for labels.

In addition to the pre-trained models, we provide global benchmarking datasets of EO input data and associated downstream tasks ready for use in machine learning pipelines. Our data contains 25+ TB of co-registered and aligned tiles, covering South America, the US, Europe, and Asia. By comparing how well our pre-trained models perform on unseen data (both regionally and temporally), we investigate the generalizability of SSL techniques for EO research. With this, our work provides a first step towards creating EO foundation models that can predict anything, anywhere on Earth.
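Of the SSL objectives compared above, VICReg is compact enough to sketch directly. The numpy version below computes its three terms (invariance, variance, covariance) on synthetic embedding batches; the loss weights and the encoder producing the two augmented views are omitted:

```python
# Schematic numpy version of the three VICReg terms on synthetic
# embeddings; not the authors' implementation.

import numpy as np

rng = np.random.default_rng(0)
z_a = rng.normal(size=(32, 8))                # embeddings of view A
z_b = z_a + 0.1 * rng.normal(size=(32, 8))    # embeddings of view B

def vicreg_terms(za, zb, gamma=1.0, eps=1e-4):
    invariance = float(np.mean((za - zb) ** 2))              # pull views together
    std = np.sqrt(za.var(axis=0) + eps)
    variance = float(np.mean(np.maximum(0.0, gamma - std)))  # keep dimensions alive
    zc = za - za.mean(axis=0)
    cov = (zc.T @ zc) / (len(za) - 1)
    off_diag = cov - np.diag(np.diag(cov))
    covariance = float((off_diag ** 2).sum() / za.shape[1])  # decorrelate dimensions
    return invariance, variance, covariance

inv, var, cov = vicreg_terms(z_a, z_b)
print(round(inv, 3), var >= 0.0, cov >= 0.0)
```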


1. He, K. et al. Masked Autoencoders Are Scalable Vision Learners. (2021).

2. Caron, M. et al. Emerging Properties in Self-Supervised Vision Transformers. (2021).

3. Bardes, A., Ponce, J. & LeCun, Y. VICReg: Variance-Invariance-Covariance Regularization for Self-Supervised Learning. (2021).

4. Radford, A. et al. Learning Transferable Visual Models From Natural Language Supervision. (2021).

How to cite: Jungbluth, A., Allen, M., Dorr, F., Gallego-Mejia, J., Martínez-Ferrer, L., Kalaitzis, F., and Ramos-Pollán, R.: Towards Foundation Models for Earth Observation; Benchmarking Datasets and Performance on Diverse Downstream Tasks, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-11514, 2024.

EGU24-17934 | Orals | ESSI1.1

The PhilEO Geospatial Foundation Model Suite 

Bertrand Le Saux, Casper Fibaek, Luke Camilleri, Andreas Luyts, Nikolaos Dionelis, Giacomo Donato Cascarano, Leonardo Bagaglini, and Giorgio Pasquali

Foundation Models (FMs) are the latest major advance in AI, building upon deep learning. They have the ability to analyse large volumes of unlabeled Earth Observation (EO) data by learning at scale, identifying complex patterns and trends that may be difficult or even impossible to detect through traditional methods. These models can then be used as a base to create powerful applications that automatically identify, classify, and analyse features in EO data, unlocking the full potential of AI in EO like never before and providing a paradigm shift in the field.

The field of geospatial FMs is blooming with milestones such as Seasonal Contrast (SeCo) [1] or Prithvi [2]. We present the PhilEO Suite: a dataset (the PhilEO Globe), a series of models (the PhilEO Pillars), and an evaluation testbed (the PhilEO Bench).

In particular, the PhilEO Bench [3] is a novel framework to evaluate the performance of the numerous EO FM propositions on a unified set of downstream tasks. Indeed, there is now a need to assess them with respect to their expected qualities: generalisation, universality, label efficiency, and the ease of deriving specialised models. The PhilEO Bench comprises a fair testbed, independent of external factors, and a novel 400 GB global, stratified Sentinel-2 dataset containing labels for the three downstream tasks of building density estimation, road segmentation, and land cover classification.



[1] Oscar Manas, et al., “Seasonal Contrast: Unsupervised pre-training from uncurated remote sensing data,” in Proc. ICCV, 2021.

[2] Johannes Jakubik, Sujit Roy, et al., “Foundation Models for Generalist Geospatial Artificial Intelligence,” arxiv:2310.18660, 2023.

[3] Casper Fibaek, Luke Camilleri, Andreas Luyts, Nikolaos Dionelis, and Bertrand Le Saux, “PhilEO Bench: Evaluating Geo-Spatial Foundation Models,” arXiv:2401.04464, 2024.

How to cite: Le Saux, B., Fibaek, C., Camilleri, L., Luyts, A., Dionelis, N., Cascarano, G. D., Bagaglini, L., and Pasquali, G.: The PhilEO Geospatial Foundation Model Suite, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-17934, 2024.

EGU24-18331 | Posters on site | ESSI1.1

Downscaling with the foundation model AtmoRep 

Michael Langguth, Christian Lessig, Martin Schultz, and Ilaria Luise

In recent years, deep neural networks (DNNs) for enhancing the resolution of meteorological data, known as statistical downscaling, have surpassed previously developed classical statistical methods with respect to several validation metrics. The prevailing approach for DNN downscaling is to train deep learning models in an end-to-end manner. However, foundation models trained on very large datasets in a self-supervised way have proven to provide new state-of-the-art (SOTA) results for various applications in natural language processing and computer vision. 

To investigate the benefit of foundation models in Earth Science applications, we deploy AtmoRep (Lessig et al., 2023), a large-scale representation model for atmospheric dynamics, for statistical downscaling of the 2 m temperature over Central Europe. AtmoRep has been trained on almost 40 years of ERA5 data (1979–2017) and has shown promising skill in several intrinsic and downstream applications. By extending AtmoRep’s encoder-decoder with a tail network for downscaling, we super-resolve the coarse-grained 2 m temperature field from ERA5 data (Δx = 25 km) to the high spatial resolution (Δx = 6 km) of the COSMO REA6 dataset. Different coupling approaches between the core and tail networks (e.g. with and without fine-tuning the core model) are tested and analyzed in terms of accuracy and computational efficiency. Preliminary results show that downscaling with a task-specific extension of the foundation model AtmoRep can improve the downscaled product in terms of standard evaluation metrics such as the RMSE compared to a task-specific deep learning model. However, deficiencies in the spatial variability of the downscaled product are also revealed, highlighting the need for future work to focus especially on target data that exhibit a high degree of spatial variability and intrinsic uncertainty, such as precipitation.
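The evaluation above compares foundation-model downscaling against standard metrics such as the RMSE. As a hypothetical illustration (not AtmoRep's tail network, whose architecture is not given here), the sketch below super-resolves a coarse grid with plain bilinear interpolation, the kind of naive baseline a learned downscaler must beat, and scores a field with RMSE; all values are made up.

```python
import math

def bilinear_upsample(field, factor):
    """Upsample a 2D grid by `factor` using bilinear interpolation.
    A naive downscaling baseline, not AtmoRep itself."""
    h, w = len(field), len(field[0])
    H, W = h * factor, w * factor
    out = [[0.0] * W for _ in range(H)]
    for i in range(H):
        for j in range(W):
            # map the target pixel centre back to source coordinates, clamped
            y = min(max(i / factor - 0.5 + 0.5 / factor, 0), h - 1)
            x = min(max(j / factor - 0.5 + 0.5 / factor, 0), w - 1)
            y0, x0 = int(y), int(x)
            y1, x1 = min(y0 + 1, h - 1), min(x0 + 1, w - 1)
            dy, dx = y - y0, x - x0
            out[i][j] = (field[y0][x0] * (1 - dy) * (1 - dx)
                         + field[y0][x1] * (1 - dy) * dx
                         + field[y1][x0] * dy * (1 - dx)
                         + field[y1][x1] * dy * dx)
    return out

def rmse(a, b):
    """Root-mean-square error between two equally shaped 2D grids."""
    n = sum(len(r) for r in a)
    return math.sqrt(sum((x - y) ** 2
                         for ra, rb in zip(a, b)
                         for x, y in zip(ra, rb)) / n)

coarse = [[0.0, 1.0], [1.0, 2.0]]       # toy 2x2 "25 km" field
fine = bilinear_upsample(coarse, 2)     # 4x4 "higher-resolution" field
```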

How to cite: Langguth, M., Lessig, C., Schultz, M., and Luise, I.: Downscaling with the foundation model AtmoRep, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-18331, 2024.

EGU24-18852 | ECS | Orals | ESSI1.1

DeepFeatures: Remote sensing beyond spectral indices 

Martin Reinhardt, Karin Mora, Gunnar Brandt, Tejas Morbagal Harish, David Montero, Chaonan Ji, Teja Kattenborn, Francesco Martinuzzi, Clemens Mosig, and Miguel D. Mahecha

Terrestrial surface processes exhibit distinctive spectral signatures captured by optical satellites. Despite the development of over two hundred spectral indices (SIs), current studies often narrow their focus to individual SIs, overlooking the broader context of land surface processes. This project seeks to understand the holistic features of Sentinel-2 based SIs and their relationships with human impact and overall land surface dynamics. To address this, we propose an AI-driven approach that synthesises SIs derived from Sentinel data through dimension reduction, yielding interpretable latent variables describing the system comprehensively. Our goals are to (i) reduce the number of SIs and (ii) compute a few latent variables representing spatio-temporal dynamics, which culminate in a Feature Data Cube. This fully descriptive cube reduces computational costs, facilitating diverse applications. We plan to demonstrate its efficacy in land cover classification, standing deadwood detection, and terrestrial gross primary production estimation. The presentation outlines the project's implementation strategy, confronts methodological challenges, and extends an invitation to the remote sensing and machine learning community to collaborate on pressing environmental challenges. The project DeepFeatures is funded by ESA’s AI4Science activity. Website: 

How to cite: Reinhardt, M., Mora, K., Brandt, G., Morbagal Harish, T., Montero, D., Ji, C., Kattenborn, T., Martinuzzi, F., Mosig, C., and Mahecha, M. D.: DeepFeatures: Remote sensing beyond spectral indices, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-18852, 2024.

EGU24-21146 | Posters on site | ESSI1.1

Segment Anything Model (SAM) for Automatic Crater Detection 

Iraklis Giannakis, Anshuman Bhardwaj, Lydia Sam, and Georgis Leontidis

Impact craters, resulting from the collision of meteorites, asteroids, or comets with planetary surfaces, manifest as circular-elliptical depressions with diverse sizes and shapes influenced by various factors. These morphological features play a crucial role in planetary exploration, offering insights into the geological composition and structure of celestial bodies. Beyond their scientific importance, craters may also hold valuable natural resources, such as frozen water in the Moon's permanently shadowed craters. Furthermore, understanding craters’ spatial distribution is pivotal for terrain-relative navigation and for selecting future landing sites.

Manual crater mapping through visual inspection is an impractical and laborious process, often unattainable for large-scale investigations. Moreover, manual crater mapping is susceptible to human errors and biases, leading to potential disagreements of up to 40%. In order to tackle these issues, semi-automatic crater detection algorithms (CDA) have been developed to mitigate human biases, and to enable large-scale and real-time crater detection and mapping.

The majority of CDAs are based on machine learning (ML) and data-driven methods. ML-based CDAs are trained in a supervised manner on specific, manually labelled datasets. As a result, existing ML-based CDAs are constrained to the data types of their training data. This makes current ML-based CDAs brittle and impractical, since applying an ML scheme to a different type of data requires acquiring and labelling a new training set and then using it to train a new ML scheme or fine-tune an existing one.

In this study, we describe a universal approach [1] for crater identification based on the Segment Anything Model (SAM), a foundational computer vision and image segmentation model developed by Meta [2]. SAM was trained on over 1 billion masks and is capable of segmenting various data types (e.g., photos, DEMs, spectra, gravity) from different celestial bodies (e.g., the Moon, Mars) and measurement setups. The segmentation output is further classified into crater and non-crater based on geometric indices assessing the circular and elliptical attributes of each mask. The proposed framework proves effective across datasets from various planetary bodies and measurement configurations. The outcomes of this study underline the potential of foundational segmentation models in planetary science. Foundational models tuned for planetary data can provide universal classifiers, contributing towards an automatic scheme for identifying, detecting, and mapping various morphological and geological targets on different celestial bodies.
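The abstract classifies SAM's output masks via geometric indices of their circular and elliptical attributes; the exact indices are not specified, so the sketch below uses one common choice, the isoperimetric circularity 4πA/P², applied to polygonal mask outlines (the shapes here are hypothetical, not mission data).

```python
import math

def polygon_area_perimeter(pts):
    """Shoelace area and perimeter of a closed polygon given as (x, y) vertices."""
    area, perim = 0.0, 0.0
    n = len(pts)
    for k in range(n):
        x0, y0 = pts[k]
        x1, y1 = pts[(k + 1) % n]
        area += x0 * y1 - x1 * y0
        perim += math.hypot(x1 - x0, y1 - y0)
    return abs(area) / 2.0, perim

def circularity(pts):
    """4*pi*A / P^2: equals 1.0 for a perfect circle, smaller for elongated shapes."""
    area, perim = polygon_area_perimeter(pts)
    return 4.0 * math.pi * area / perim ** 2

# a 64-gon approximating a circle (crater-like) vs a 10:1 rectangle (non-crater)
circle = [(math.cos(t * 2 * math.pi / 64), math.sin(t * 2 * math.pi / 64))
          for t in range(64)]
rect = [(0, 0), (10, 0), (10, 1), (0, 1)]
```

A mask whose circularity falls below some threshold would be rejected as non-crater; choosing that threshold is where the actual study's calibration comes in.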



[1] Giannakis, I., Bhardwaj, A., Sam, L., and Leontidis, G.: A Flexible Deep Learning Crater Detection Scheme Using Segment Anything Model (SAM), Icarus, 2024.

[2] Kirillov, A., et al.: Segment Anything, arXiv:2304.02643, 2023.

How to cite: Giannakis, I., Bhardwaj, A., Sam, L., and Leontidis, G.: Segment Anything Model (SAM) for Automatic Crater Detection, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-21146, 2024.

EGU24-22461 | Posters on site | ESSI1.1

Pretraining a foundation model using MODIS observations of the earth’s atmosphere 

Valentine Anantharaj, Takuya Kurihana, Gabriele Padovani, Ankur Kumar, Aristeidis Tsaris, Udaysankar Nair, Sandro Fiore, and Ian Foster


The earth and atmospheric sciences research community has an unprecedented opportunity to exploit the vast amount of data available from earth observation (EO) satellites and earth system models (ESM). Smaller and cheaper satellites with reduced operational costs have made a variety of EO data affordable, and technological advances have made the data accessible to a wide range of stakeholders, especially the scientific community (EY, 2023). The NASA ESDS program alone is expected to host 320 PB of data by 2030 (NASA ESDS, 2023). The ascent and application of artificial intelligence foundation models (FM) can be attributed to the availability of large volumes of curated data, accessibility to extensive compute resources and the maturity of deep learning architectures, especially the transformer (Bommasani et al., 2021). 

Developing a foundation model involves pretraining a suitable deep learning architecture with large amounts of data, often via self-supervised learning (SSL) methods. The pretrained models can then be adapted to downstream tasks via fine-tuning, requiring less data than task-specific models. Large language models (LLMs) are likely the most common type of foundation model encountered by the general public. Vision transformers (ViTs) adapt the transformer architecture to image and image-like data (Dosovitskiy et al., 2020), such as EO data and ESM simulation output. We are in the process of pretraining a ViT model for the earth’s atmosphere using a select few bands of 1-km Level-1B MODIS radiances and brightness temperatures, MOD021KM and MYD021KM from the NASA Terra and Aqua satellites respectively. We are using 200 million image chips of size 128x128 pixels. We are pretraining two ViT models of 100 million and 400 million parameters respectively. The pretrained models will be fine-tuned for cloud classification and evaluated against AICCA (Kurihana et al., 2022). We will discuss our experiences involving data and computing, and present preliminary results.
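To make the pretraining setup concrete: a ViT consumes an image chip as a sequence of flattened patches. The sketch below shows that tokenization for a single-band 128x128 chip; the 16x16 patch size is an assumption (the standard ViT choice), not something stated in the abstract.

```python
def patchify(chip, patch):
    """Split a 2D image chip into flattened, non-overlapping patches,
    i.e. the token sequence a ViT encoder consumes (single band, no embedding)."""
    h, w = len(chip), len(chip[0])
    assert h % patch == 0 and w % patch == 0
    tokens = []
    for i in range(0, h, patch):
        for j in range(0, w, patch):
            tokens.append([chip[i + di][j + dj]
                           for di in range(patch) for dj in range(patch)])
    return tokens

# synthetic 128x128 chip: pixel value encodes its position for easy checking
chip = [[i * 128 + j for j in range(128)] for i in range(128)]
tokens = patchify(chip, 16)
# a 128x128 chip with 16x16 patches yields (128/16)^2 = 64 tokens of 256 values
```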



Bommasani R, Hudson DA, Adeli E, Altman R, Arora S, et al: On the opportunities and risks of foundation models. CoRR abs/2108.07258., 2021. 

Dosovitskiy, A., Beyer, L., Kolesnikov, A., Weissenborn, D., Zhai, X., Unterthiner, T., Dehghani, M., Minderer, M., Heigold, G., Gelly, S. and Uszkoreit, J.: An image is worth 16x16 words: Transformers for image recognition at scale. arXiv preprint arXiv:2010.11929, 2020.

Ernst & Young (EY): How can the vantage of space give you strategic advantage on Earth?, 2023. Accessed 10 January 2024.

Kurihana, Takuya, Elisabeth J. Moyer, and Ian T. Foster: AICCA: AI-Driven Cloud Classification Atlas. Remote Sensing 14, no. 22: 5690., 2022.

NASA MODIS: MODIS - Level 1B Calibrated Radiances. DOI: 10.5067/MODIS/MOD021KM.061 and DOI: 10.5067/MODIS/MYD021KM.061

NASA ESDS: Earthdata Cloud Evolution Accessed 10 January 2024.

Vaswani A, Shazeer N, Parmar N, Uszkoreit J, Jones L, Gomez AN, Kaiser Ł, Polosukhin I: Attention is all you need. Adv Neural Inf Process Syst 30, 2017.

How to cite: Anantharaj, V., Kurihana, T., Padovani, G., Kumar, A., Tsaris, A., Nair, U., Fiore, S., and Foster, I.: Pretraining a foundation model using MODIS observations of the earth’s atmosphere, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-22461, 2024.

EGU24-1198 | ECS | PICO | ESSI1.3

Satellite-Driven Traffic Volume Estimation: Harnessing Hybrid Machine Learning for Sustainable Urban Planning and Pollution Control 

Bilal Aslam, Toby Hocking, Pawlok Dass, Anna Kato, and Kevin Gurney

As cities grow and more cars occupy the roads, greenhouse gas emissions and air pollution in urban areas are rising. To better understand these emissions and pollution, and to support effective urban environmental mitigation, accurate estimation of traffic volume is crucial. This study applies hybrid machine learning models to estimate and predict traffic volume from satellite data and other datasets in both the USA and Europe. The research investigates the predictive capabilities of machine learning models using freely accessible global datasets, including Sentinel-2, night-time light data, population, and road density. Neural network, nearest-neighbours, random forest, and XGBoost regression models were employed for traffic volume prediction, and their accuracy was enhanced using hyperparameter-tuned K-fold cross-validation. Model accuracy, evaluated through mean percentage error (MPE%) and R-squared, showed the XGBoost regression model performing best, with an R² of 0.81 and an MPE of 13%. The low error (and therefore high accuracy) as well as the model's versatility allow its application worldwide for traffic volume computation using readily available datasets. Machine learning models, particularly XGBoost regression, prove valuable for on-road traffic volume prediction, offering a dataset applicable to town planning, urban transportation, and combating urban air pollution.
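The K-fold cross-validation and MPE scoring mentioned above can be sketched as follows. This is a toy version: it scores a trivial mean-value baseline rather than the actual XGBoost model, and the traffic counts are invented.

```python
def k_fold_indices(n, k):
    """Yield (train_idx, val_idx) index splits for k-fold cross-validation."""
    fold = n // k
    idx = list(range(n))
    for f in range(k):
        val = idx[f * fold:(f + 1) * fold] if f < k - 1 else idx[f * fold:]
        train = [i for i in idx if i not in set(val)]
        yield train, val

def mean_pct_error(y_true, y_pred):
    """Mean percentage error (MPE%) between observed and predicted values."""
    return 100.0 * sum(abs(t - p) / t for t, p in zip(y_true, y_pred)) / len(y_true)

# hypothetical daily traffic volumes; the "model" is just the training-fold mean
y = [100.0, 120.0, 90.0, 110.0, 105.0, 95.0, 115.0, 125.0, 85.0, 100.0]
errors = []
for train, val in k_fold_indices(len(y), 5):
    pred = sum(y[i] for i in train) / len(train)  # fit: mean of training folds
    errors.append(mean_pct_error([y[i] for i in val], [pred] * len(val)))
avg_mpe = sum(errors) / len(errors)  # cross-validated MPE%
```

In the real study the mean-value baseline would be replaced by the tuned regressor, and the folds would be drawn from the satellite-derived feature set.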

How to cite: Aslam, B., Hocking, T., Dass, P., Kato, A., and Gurney, K.: Satellite-Driven Traffic Volume Estimation: Harnessing Hybrid Machine Learning for Sustainable Urban Planning and Pollution Control, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-1198, 2024.

Long-term satellite-based imagery provides fundamental data support for identifying and analyzing land surface dynamics. Although moderate-spatial-resolution data, like the Moderate Resolution Imaging Spectroradiometer (MODIS), were widely used for large-scale regional studies, their limited availability before 2000 restricts their usage in long-term investigations. To reconstruct retrospective MODIS-like data, this study proposes a novel deep learning-based model, named the Land-Cover-assisted SpatioTemporal Fusion model (LCSTF). LCSTF leverages medium-grained spatial class features from Landcover300m and temporal seasonal fluctuations from the Global Inventory Modelling and Mapping Studies (GIMMS) NDVI3g time series data to generate 500-meter MODIS-like data from 1992 to 2010 over the continental United States. The model also implements the Long Short-Term Memory (LSTM) sensor-bias correction method to mitigate systematic differences between sensors. Validation against actual MODIS images confirms the model’s ability to produce accurate MODIS-like data. Additionally, when assessed with Landsat data prior to 2000, the model demonstrates excellent performance in reconstructing retrospective data. The developed model and the reconstructed biweekly MODIS-like dataset offer significant potential for extending the temporal coverage of moderate-spatial-resolution data, enabling comprehensive long-term and large-scale studies of land surface dynamics.

How to cite: Zhang, Z., Xiong, Z., Pan, X., and Xin, Q.: Developing a land-cover-assisted spatiotemporal fusion model for producing pre-2000 MODIS-like data over the continental United States, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-2021, 2024.

EGU24-4445 | PICO | ESSI1.3

Fully Differentiable Physics-informed Lagrangian Convolutional Neural Network for Precipitation Nowcasting 

Peter Pavlík, Martin Výboh, Anna Bou Ezzeddine, and Viera Rozinajová

The task of precipitation nowcasting is often perceived as a computer vision problem. It is analogous to next-frame video prediction, i.e. processing consecutive radar precipitation map frames and predicting future ones. This makes convolutional neural networks (CNNs) a great fit for the task. In recent years, CNNs have become the de facto state-of-the-art model for precipitation nowcasts.

However, a pure machine learning model has difficulty accurately capturing the underlying patterns in the data. Since the data behave according to known physical laws, we can incorporate this knowledge to train more accurate and trustworthy models.

We present a double U-Net model, combining a continuity-constrained Lagrangian persistence U-Net with an advection-free U-Net dedicated to capturing the precipitation growth and decay. In contrast to previous works, the combined model is fully differentiable, allowing us to fine-tune these models together in a data-driven way. We examine the learned Lagrangian mappings, along with a thorough quantitative and qualitative evaluation. The results of the evaluation will be provided in the presentation.
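A minimal sketch of the Lagrangian persistence idea underlying the first U-Net: advect the current precipitation field along a motion vector, with no growth or decay. The constant integer displacement here is an assumption for illustration; the actual model learns continuity-constrained Lagrangian mappings rather than applying a fixed shift.

```python
def lagrangian_persistence(field, dx, dy, fill=0.0):
    """Advect a 2D precipitation field by an integer displacement (dx, dy):
    the simplest Lagrangian persistence nowcast (constant motion, no growth/decay)."""
    h, w = len(field), len(field[0])
    out = [[fill] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            si, sj = i - dy, j - dx   # each pixel takes the upstream value
            if 0 <= si < h and 0 <= sj < w:
                out[i][j] = field[si][sj]
    return out

# a single rain cell moving one pixel to the right
rain = [[0, 0, 0],
        [0, 5, 0],
        [0, 0, 0]]
nowcast = lagrangian_persistence(rain, dx=1, dy=0)
```

The second U-Net in the double U-Net then models exactly what this persistence step cannot: the growth and decay of the advected field.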

How to cite: Pavlík, P., Výboh, M., Bou Ezzeddine, A., and Rozinajová, V.: Fully Differentiable Physics-informed Lagrangian Convolutional Neural Network for Precipitation Nowcasting, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-4445, 2024.

EGU24-10278 | ECS | PICO | ESSI1.3

Assessing the area of applicability of spatial prediction models through a local data point density approach 

Fabian Schumacher, Christian Knoth, Marvin Ludwig, and Hanna Meyer

Machine learning is frequently used in the field of earth and environmental sciences to produce spatial or spatio-temporal predictions of environmental variables based on limited field samples - increasingly even on a global scale and far beyond the extent of available training data. Since new geographic space often goes along with new environmental properties, the spatial applicability and transferability of models is often questionable. Predictions should be constrained to environments that exhibit properties the model has been enabled to learn.

Meyer and Pebesma (2021) made a first proposal to estimate the area of applicability (AOA) of spatial prediction models. Their method derives a dissimilarity index (DI) from the distance, in predictor space, between a prediction data point and its nearest reference data point. Prediction locations with a DI larger than the DI values observed through cross-validation during model training are considered outside the AOA. As a consequence, the AOA is defined as the area where the model has been enabled to learn relationships between predictors and target variables and where, on average, the cross-validation performance applies. The method, however, considers only the distance, in predictor space, to the single nearest reference data point. Hence, one data point in an environment may define a model as "applicable" in that environment. Here we suggest extending this approach by considering the density of reference data points in predictor space, as we assume that this is highly decisive for prediction quality.

We suggest extending the methodology with a newly developed local data point density (LPD) approach based on the given concepts of the original method to allow for a better assessment of the applicability of a model. The LPD is a quantitative measure for a new data point that indicates how many similar (in terms of predictor values) reference data points have been included in the model training, assuming a positive relationship between LPD values and prediction performance. A reference data point is considered similar if it defines a new data point as being within the AOA, i.e. the model is considered applicable for the corresponding prediction location. We implemented the LPD approach in the R package CAST. Here we explain the method and show its applicability in simulation studies as well as real-world applications.
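A stripped-down sketch of the LPD idea described above, assuming plain Euclidean distance in predictor space and a fixed similarity threshold; the actual CAST implementation derives the threshold from cross-validation and weights/scales the predictor axes.

```python
import math

def predictor_distance(a, b):
    """Euclidean distance between two points in predictor space."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def local_point_density(new_point, train_points, threshold):
    """Count training points closer than `threshold` in predictor space.
    LPD = 0 suggests the point lies outside the AOA; a larger LPD suggests
    the prediction is supported by more similar training data."""
    return sum(1 for p in train_points
               if predictor_distance(new_point, p) <= threshold)

train = [(0.1, 0.2), (0.15, 0.25), (0.9, 0.9)]      # hypothetical predictor values
dense = local_point_density((0.12, 0.22), train, threshold=0.2)   # 2 similar points
sparse = local_point_density((0.5, 0.1), train, threshold=0.2)    # 0: outside the AOA
```

This makes the paper's point concrete: under the nearest-neighbour DI alone, a location supported by one training point and a location supported by many look identical, while LPD separates them.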


Meyer, H; Pebesma, E. 2021. ‘Predicting into unknown space? Estimating the area of applicability of spatial prediction models.’ Methods in Ecology and Evolution 12: 1620–1633. doi: 10.1111/2041-210X.13650.

How to cite: Schumacher, F., Knoth, C., Ludwig, M., and Meyer, H.: Assessing the area of applicability of spatial prediction models through a local data point density approach, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-10278, 2024.

With the rapid growth in global trade, the demand for efficient route planning and resource utilization in logistics and transportation mirrors the Travelling Salesman Problem (TSP). The TSP asks for the shortest possible route through N destinations that visits each destination once and returns to the starting point. The computational complexity of the TSP increases exponentially with the number of destinations, so finding an exact solution is impractical for larger instances. It has long been a challenging optimization problem, prompting the development of various methodologies, especially metaheuristics in recent research, to seek more efficient solutions. This research therefore proposes an optimization algorithm implementing a swarm-intelligence-based method for solving the TSP approximately. The proposed algorithm is evaluated by comparing its solution quality and computation time to well-known optimization methods, namely the Genetic Algorithm and Ant Colony Optimization. 47 cities and 50 landmarks in the U.S. are selected as destinations for two experimental datasets respectively, with geospatial data retrieved from the Google Maps Platform API. The experimental results suggest that the proposed algorithm computes a near-optimal solution with the shortest computation time among the three optimization methods. Solving the TSP efficiently contributes significantly to route planning for transportation and logistics. By shortening travelling time, optimizing resource utilization, and minimizing fuel and energy consumption, this research further aligns with the global goal of carbon reduction in transportation and logistics systems.
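The exponential blow-up, and why heuristics are attractive, can be seen in a few lines: brute force enumerates (N-1)! tours, while a greedy nearest-neighbour heuristic runs in O(N²). The nearest-neighbour rule is a stand-in here for the swarm-intelligence method, whose details the abstract does not give; the points are a toy instance, not the Google Maps data.

```python
import itertools
import math

def tour_length(points, order):
    """Total length of the closed tour visiting `points` in `order`."""
    return sum(math.dist(points[order[i]], points[order[(i + 1) % len(order)]])
               for i in range(len(order)))

def nearest_neighbour_tour(points):
    """Greedy TSP heuristic: always move to the closest unvisited destination."""
    unvisited = list(range(1, len(points)))
    tour = [0]
    while unvisited:
        last = tour[-1]
        nxt = min(unvisited, key=lambda k: math.dist(points[last], points[k]))
        tour.append(nxt)
        unvisited.remove(nxt)
    return tour

def brute_force_tour(points):
    """Exact solution by enumerating all (N-1)! tours; feasible only for small N."""
    best = min(itertools.permutations(range(1, len(points))),
               key=lambda p: tour_length(points, (0,) + p))
    return (0,) + best

# six points on a 2x1 rectangle; the optimal tour is the perimeter, length 6
pts = [(0, 0), (1, 0), (2, 0), (2, 1), (1, 1), (0, 1)]
heuristic = nearest_neighbour_tour(pts)
exact = brute_force_tour(pts)
```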

How to cite: Wong, K. T.: Solving the Travelling Salesman Problem for Efficient Route Planning through Swarm Intelligence-Based Optimization, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-13452, 2024.

Land surface temperature (LST) is a critical parameter for understanding the physical properties of the boundary between the earth's surface and the atmosphere, and it has a significant impact on various research areas, including agriculture, climate, hydrology, and the environment. However, the thermal infrared band of remote sensing is often blocked by clouds and aerosols, resulting in gaps in LST data products that hinder their practical application. Therefore, reconstructing cloud-covered thermal infrared LST is vital for measuring the physical properties of the land surface at regional and global scales. In this paper, a novel reconstruction method for Moderate Resolution Imaging Spectroradiometer (MODIS) LST data with 1-km spatial resolution is proposed: a spatiotemporal consistency constraint network (STCCN) model fusing reanalysis and thermal infrared data. First, a new spatio-temporal consistency loss function was developed to minimize the discrepancies between the reconstructed and actual LST, using a non-local reinforced convolutional neural network. Second, ERA5 surface net solar radiation (SSR) data was applied as an important network input; it characterizes the influence of the Sun on surface warming and corrects the LST reconstruction results. 
The experimental results show that (1) the STCCN model can precisely reconstruct cloud-covered LST, with a correlation coefficient (R) of 0.8973 and a mean absolute error (MAE) of 0.8070 K; (2) with the introduction of ERA5 SSR data, the MAE of the reconstructed LST decreases by 17.15% while R remains close, indicating that it is necessary and beneficial to consider the effects of radiation data on LST; (3) the analysis of spatial and temporal adaptability indicates that the proposed method is resilient and flexible across different spatial and temporal scales, suggesting its potential for effective and reliable application in different scenarios; (4) against SURFRAD station observations, the reconstructed R ranges from 0.8 to 0.9 and the MAE from 1 to 3 K, demonstrating the effectiveness and validity of the proposed model for reconstructing regional cloud-covered LST.
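The reported MAE and R can be reproduced on any pair of observed/reconstructed series with the usual definitions; the LST samples below are invented for illustration, not values from the study.

```python
import math

def mae(obs, pred):
    """Mean absolute error between observed and reconstructed values."""
    return sum(abs(o - p) for o, p in zip(obs, pred)) / len(obs)

def pearson_r(obs, pred):
    """Pearson correlation coefficient between two series."""
    n = len(obs)
    mo, mp = sum(obs) / n, sum(pred) / n
    cov = sum((o - mo) * (p - mp) for o, p in zip(obs, pred))
    so = math.sqrt(sum((o - mo) ** 2 for o in obs))
    sp = math.sqrt(sum((p - mp) ** 2 for p in pred))
    return cov / (so * sp)

obs = [290.1, 291.4, 289.8, 293.0, 292.2]    # hypothetical station LST in K
pred = [290.6, 291.0, 290.3, 292.4, 292.8]   # hypothetical reconstructed LST in K
```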

How to cite: Gong, Y., Li, H., and Li, J.: STCCN: A spatiotemporal consistency constraint network for all-weather MODIS LST reconstruction by fusing reanalysis and thermal infrared data, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-13833, 2024.

EGU24-16841 | PICO | ESSI1.3

Photovoltaic Farms Mapping using openEO Platform 

Mohammad Alasawedah, Michele Claus, Alexander Jacob, Patrick Griffiths, Jeroen Dries, and Stefaan Lippens

Photovoltaic farms (PV farms) mapping is essential for establishing valid policies regarding natural resources management and clean energy. As evidenced by the recent COP28 summit, where almost 120 global leaders pledged to triple the world’s renewable energy capacity before 2030, it is crucial to make these mapping efforts scalable and reproducible. Recently, there have been efforts towards the global mapping of PV farms [1], but these were limited to fixed time periods of the analyzed satellite imagery and not openly reproducible. Building on this effort, we propose the use of openEO [2] User Defined Processes (UDPs) implemented in the openEO platform for mapping solar farms using Sentinel-2 imagery, emphasizing the four foundational FAIR data principles: Findability, Accessibility, Interoperability, and Reusability. The UDPs encapsulate the entire solar farms mapping workflow, from data preprocessing and analysis to model training and prediction. The use of openEO UDPs enables easy reuse and parametrization for future PV farms mapping.  

Open-source data is used to construct the training dataset, leveraging OpenStreetMap (OSM) to gather PV farms polygons across different countries. Different filtering techniques are involved in the creation of the training set, in particular land cover and terrain. To ensure model robustness, we leveraged the temporal resolution of Sentinel-2 L2A data and utilized openEO to create a reusable workflow that simplifies data access in the cloud, allowing the efficient collection of training samples over Europe. This workflow includes preprocessing steps such as cloud masking, gap filling, and outlier filtering, as well as feature extraction. Considerable effort is put into generating the best training samples, ensuring an optimal starting point for the subsequent steps. After compiling the training dataset, we conducted a statistical discrimination analysis of different pixel-level models to determine the most effective one. Our goal is to compare time-series machine learning (ML) models like InceptionTime, which uses 3D data as input, with tree-based models like Random Forest (RF), which employs 2D data along with feature engineering. An openEO process graph is then constructed to organize and automate the execution of the inference phase, encapsulating all necessary processes from preprocessing to prediction. Finally, the process graph is transformed into a reusable UDP that others can reuse for replicable PV farms mapping, from a single farm to country scale. The use of the openEO UDP enables replication of the workflow to map new temporal assessments of PV farms distribution. The UDP process for PV farms mapping is integrated with the ESA Green Transition Information Factory (GTIF), providing the ability for streamlined and FAIR-compliant updates of related energy infrastructure mapping efforts. 

[1] Kruitwagen, L., et al. A global inventory of photovoltaic solar energy generating units. Nature 598, 604–610 (2021). 

[2] Schramm, M, et al. The openEO API–Harmonising the Use of Earth Observation Cloud Services Using Virtual Data Cube Functionalities. Remote Sens. 2021, 13, 1125. 

How to cite: Alasawedah, M., Claus, M., Jacob, A., Griffiths, P., Dries, J., and Lippens, S.: Photovoltaic Farms Mapping using openEO Platform, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-16841, 2024.

EGU24-17458 | PICO | ESSI1.3

Spatially explicit active learning for crop-type mapping from satellite image time series 

Mariana Belgiu, Beatrice Kaijage, and Wietske Bijker

The availability of sufficient annotated samples is one of the main challenges for the supervised methods used to classify crop types from remote sensing images. Generating a large number of annotated samples is a time-consuming and expensive task. Active Learning (AL) is one solution for optimizing sample annotation, resulting in an efficiently trained supervised method with less effort. Unfortunately, most of the developed AL methods do not account for the spatial information inherent in remote-sensing images. We propose a novel spatially explicit AL method that uses a semi-variogram to identify and discard spatially adjacent and, consequently, redundant samples. It was evaluated using Random Forest (RF) and Sentinel-2 Satellite Image Time Series (SITS) in two study areas from the Netherlands and Belgium. In the Netherlands, the spatially explicit AL selected a total of 97 samples as relevant for the classification task, leading to an overall accuracy of 80%, while the traditional AL method selected a total of 169 samples, achieving an accuracy of 82%. In Belgium, spatially explicit AL selected 223 samples and obtained an overall accuracy of 60%, compared to the traditional AL that selected 327 samples and yielded an accuracy of 63%. We conclude that the developed AL method helped RF achieve good performance, mostly for classes consisting of individual crops with a relatively distinctive growth pattern such as sugar beets or cereals. Aggregated classes such as 'fruits and nuts' remained a challenge, however. The proposed AL method shows that accounting for spatial information is an efficient way to map target crops, since it achieves high accuracy with a small number of samples and, consequently, lower computational cost and less time and money spent on annotation.
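The semi-variogram used above to flag spatially adjacent (and hence redundant) samples can be estimated empirically as half the mean squared difference between sample pairs at a given lag distance. The sketch below uses hypothetical coordinates and values; the study's own binning and fitting choices are not specified in the abstract.

```python
import math

def semivariogram(samples, lag, tol):
    """Empirical semi-variance gamma(h) at lag `lag`:
    half the mean squared difference over sample pairs whose spatial
    separation lies within lag +/- tol. Samples are (x, y, value) triples."""
    acc, n = 0.0, 0
    for i in range(len(samples)):
        for j in range(i + 1, len(samples)):
            (x1, y1, z1), (x2, y2, z2) = samples[i], samples[j]
            if abs(math.hypot(x2 - x1, y2 - y1) - lag) <= tol:
                acc += (z1 - z2) ** 2
                n += 1
    return acc / (2 * n) if n else float("nan")

# hypothetical transect: nearby samples are similar, distant ones differ
samples = [(0, 0, 1.0), (1, 0, 1.1), (2, 0, 1.4), (3, 0, 2.0)]
gamma_near = semivariogram(samples, lag=1, tol=0.1)
gamma_far = semivariogram(samples, lag=3, tol=0.1)
```

Low semi-variance at short lags is exactly the redundancy signal: candidate samples within the range of high spatial autocorrelation add little information and can be discarded.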

How to cite: Belgiu, M., Kaijage, B., and Bijker, W.: Spatially explicit active learning for crop-type mapping from satellite image time series, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-17458, 2024.

EGU24-19275 | PICO | ESSI1.3

Artificial Intelligence Reconstructs Historical Climate Extremes 

Étienne Plésiat, Robert Dunn, Markus Donat, Thomas Ludwig, and Christopher Kadow

The year 2023 represents a significant milestone in climate history: it was confirmed by the Copernicus Climate Change Service (C3S) as the warmest calendar year in global temperature records since 1850. With a deviation of 1.48°C from the 1850-1900 pre-industrial level, 2023 largely surpasses 2016, 2019 and 2020, previously identified as the warmest years on record. As expected, this sustained warmth leads to an increase in the frequency and intensity of Extreme Events (EE) with dramatic environmental and societal consequences.

To assess the evolution of these EE and establish adaptation and mitigation strategies, it is crucial to evaluate the trends of extreme indices (EI). However, the observational climate data commonly used to calculate these indices frequently contain missing values, resulting in partial and inaccurate EI. The deeper we delve into the past, the more pronounced this issue becomes, due to the scarcity of historical measurements.

To circumvent the lack of information, we are using a deep learning technique based on a U-Net made of partial convolutional layers [1]. The models are trained with Earth system model data from CMIP6 and can reconstruct large and irregular regions of missing data using minimal computational resources. This approach has shown its ability to outperform traditional statistical methods such as kriging by learning intricate patterns in climate data [2].

In this study, we have applied our technique to the reconstruction of gridded land surface EI from an intermediate product of the HadEX3 dataset [3]. This intermediate product is obtained by combining station measurements without interpolation, resulting in numerous missing values that vary in both space and time. These missing values significantly affect the calculation of the long-term linear trend (1901-2018), especially if we consider solely the grid boxes containing values for the whole time period. The trend calculated for the TX90p index, which measures the monthly (or annual) frequency of warm days (defined as the percentage of days where the daily maximum temperature is above the 90th percentile), is presented for the European continent in the left panel of the figure. It illustrates the resulting amount of missing values, indicated by the gray pixels. With our AI method, we have been able to reconstruct the TX90p values for all time steps and calculate the long-term trend shown in the right panel of the figure. The reconstructed dataset is being prepared for the community in the framework of the H2020 CLINT project [4] for further detection and attribution studies.
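The TX90p definition above translates directly into code. This is a simplified sketch: operational indices such as those in HadEX3 use calendar-day percentiles over a fixed base period, whereas the version below applies a single nearest-rank 90th percentile to the whole baseline; all temperatures are made up.

```python
def tx90p(daily_tmax, baseline_tmax):
    """Percentage of days whose daily maximum temperature exceeds the 90th
    percentile of a baseline sample (simplified TX90p: one pooled percentile
    rather than the operational calendar-day percentiles)."""
    s = sorted(baseline_tmax)
    p90 = s[max(0, int(0.9 * len(s)) - 1)]   # nearest-rank 90th percentile
    warm = sum(1 for t in daily_tmax if t > p90)
    return 100.0 * warm / len(daily_tmax)

# baseline 1..10 gives a 90th percentile of 9; two of four days exceed it
frequency = tx90p([8, 9, 10, 11], list(range(1, 11)))
```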

[1] Liu G. et al., Lecture Notes in Computer Science, 11215, 19-35 (2018)
[2] Kadow C. et al., Nat. Geosci., 13, 408-413 (2020)
[3] Dunn R. J. H. et al., J. Geophys. Res. Atmos., 125, 1 (2020)

How to cite: Plésiat, É., Dunn, R., Donat, M., Ludwig, T., and Kadow, C.: Artificial Intelligence Reconstructs Historical Climate Extremes, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-19275, 2024.

EGU24-19394 | PICO | ESSI1.3

Comparing the Role of Spatially and Temporally capable Deep Learning Architectures in Rainfall Estimation: A Case Study over North East India 

Aditya Handur-Kulkarni, Shanay Mehta, Ayush Ghatalia, and Ritu Anilkumar

The northeastern states of India face heavy-precipitation-related disasters such as floods and landslides every monsoon, and the region's economy is predominantly dependent on agriculture. Accurate prediction of rainfall therefore plays a vital role in planning and disaster management programs in the region. Existing methods for rainfall measurement include Automatic Weather Stations, which provide real-time rainfall measurements at specific locations; however, these are point-based estimates. For distributed measurements, satellite-based estimation can be used. While these methods provide vital information on the spatial distribution of precipitation, they share the caveat of providing only real-time estimates. Numerical weather forecast models provide forecasting capability by simulating the atmosphere's physical processes through the assimilation of observational data from various sources, including weather stations and satellites. However, these models are highly complex, require immense computational resources, and their accuracy is limited by the available computing architecture. Recently, a host of data-driven models, including random forest regression, support vector machine regression and deep learning architectures, have been used to provide distributed rainfall forecasts. However, the relative performance of such models over orographically complex terrain has not been ascertained through a systematic study. Through this study, we aim to systematically assess the role of convolutional and recurrent neural network architectures in estimating rainfall. We have used rainfall data from the ERA5-Land reanalysis dataset together with the following additional meteorological variables that can impact rainfall: dew point temperature, skin temperature, amount of solar radiation, wind components, surface pressure and total precipitation.
Data aggregated to a daily scale and spanning three decades were selected for this study. We have used two neural network architectures: a U-Net modified for regression, representing convolutional neural networks, and a Long Short-Term Memory (LSTM) architecture, representing recurrent neural networks. Various settings of each architecture, such as the number of layers, optimizers and initialization, are validated to assess their performance on rainfall estimation. The developed rainfall estimation models were validated and evaluated using rigorous statistical metrics, such as the root mean square error (RMSE) and the coefficient of determination (R-squared). The results of this research are expected to provide valuable insights for local governments, farmers, and other stakeholders in the northeastern states of India. Moreover, the study's methodology can be extended to other regions facing similar climate challenges, thus contributing to advancements in the field of rainfall estimation and climate modelling.

How to cite: Handur-Kulkarni, A., Mehta, S., Ghatalia, A., and Anilkumar, R.: Comparing the Role of Spatially and Temporally capable Deep Learning Architectures in Rainfall Estimation: A Case Study over North East India, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-19394, 2024.

EGU24-19531 | PICO | ESSI1.3

Gradient-Based Optimisers Versus Genetic Algorithms in Deep Learning Architectures: A Case Study on Rainfall Estimation Over Complex Terrain 

Yash Bhisikar, Nirmal Govindaraj, Venkatavihan Devaki, and Ritu Anilkumar

Yash Bhisikar1*, Nirmal Govindaraj1*, Venkatavihan Devaki2*, Ritu Anilkumar3

1Birla Institute of Technology And Science, Pilani, K K Birla Goa Campus 

2Birla Institute of Technology And Science, Pilani, Pilani Campus 

3North Eastern Space Applications Centre, Department of Space, Umiam


* Authors have contributed equally to this study.

Rainfall is a crucial factor that affects planning processes at various scales, ranging from agricultural activities at the village or residence level to governmental initiatives in the domains of water resource management, disaster preparedness, and infrastructural planning. A reliable estimate of rainfall and a systematic assessment of variations in rainfall patterns are therefore urgently needed. Recently, several studies have attempted to predict rainfall over various locations using deep learning architectures, including but not limited to artificial neural networks, convolutional neural networks, recurrent neural networks, or combinations of these. However, a major challenge is the chaotic nature of rainfall, especially the interplay of spatio-temporal components over orographically complex terrain. For complex computer vision challenges, studies have suggested that population-search optimisation techniques such as genetic algorithms may serve as an alternative to traditional gradient-based techniques such as Adam, Adadelta and SGD. Through this study, we aim to extend this hypothesis to the case of rainfall estimation. We use population-search techniques, namely genetic algorithms, to optimise a convolutional neural network architecture built using PyTorch. We have chosen North-East India as the study area as it receives significant monsoon rainfall and its undulating terrain adds complexity to the rainfall estimation. We have used 30 years of rainfall data from the ERA5-Land daily reanalysis dataset, with a spatial resolution of 11,132 m, for the months of June, July, August and September.
Additionally, datasets of the following meteorological variables that can impact rainfall were utilised as input features: dew point temperature, skin temperature, net incoming short-wave radiation received at the surface, wind components and surface pressure. All datasets are aggregated to daily time steps. Several configurations of the U-Net architecture, such as the number of hidden layers, initialisation techniques and optimisation algorithms, have been used to identify the best configuration for estimating rainfall over North-East India. Genetic algorithms were used in initialisation and optimisation to assess the ability of population-search heuristics, using the PyGAD library. The developed rainfall prediction models were validated at different time steps (0-day, 1-day, 2-day and 3-day latency) on a 7:1:2 train, validation and test split, using evaluation metrics such as the root mean square error (RMSE) and the coefficient of determination (R-squared). The evaluation was performed on a pixel-by-pixel as well as an image-by-image basis in order to take magnitude and spatial correlations into consideration. Our study emphasises the importance of considering alternative optimisers and hyperparameter tuning approaches for complex earth observation challenges such as rainfall prediction.
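The population-search idea can be made concrete with a minimal NumPy genetic algorithm. The abstract uses the PyGAD library on a U-Net; the toy below instead evolves the two weights of a linear model, purely to illustrate the selection and mutation mechanics (all values are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)

# toy regression target y = 2*x1 - x2, standing in for network weights
X = rng.normal(size=(64, 2))
y = X @ np.array([2.0, -1.0])

def fitness(w):
    return -np.mean((X @ w - y) ** 2)  # a GA maximises fitness

pop = rng.normal(size=(30, 2))               # initial population
for generation in range(200):
    scores = np.array([fitness(w) for w in pop])
    parents = pop[np.argsort(scores)[-10:]]  # selection: keep the 10 fittest
    children = parents[rng.integers(0, 10, size=30)]  # clone parents
    mutate = rng.random(children.shape) < 0.3         # per-gene mutation mask
    children = children + mutate * rng.normal(scale=0.1, size=children.shape)
    pop = children

best = pop[np.argmax([fitness(w) for w in pop])]
```

Gradient-based optimisers follow the loss surface locally, while this population search explores it stochastically; libraries such as PyGAD wrap the same loop around full network weight vectors.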

How to cite: Bhisikar, Y., Govindaraj, N., Devaki, V., and Anilkumar, R.: Gradient-Based Optimisers Versus Genetic Algorithms in Deep Learning Architectures: A Case Study on Rainfall Estimation Over Complex Terrain, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-19531, 2024.

EGU24-20025 | ECS | PICO | ESSI1.3

Vineyard detection from multitemporal Sentinel-2 images with a Transformer model 

Weiying Zhao, Alexey Unagaev, and Natalia Efremova

This study introduces an innovative method for vineyard detection by integrating advanced machine learning techniques with high-resolution satellite imagery, particularly focusing on the use of preprocessed multitemporal Sentinel-2 images combined with a Transformer-based model.

We collected a series of Sentinel-2 images over an entire seasonal cycle from eight distinct locations in Oregon, United States, all within similar climatic zones. The training and validation datasets comprise 403,612 and 100,903 samples, respectively. To reduce cloud effects, we used monthly median band values derived from initially cloud-filtered images. The multispectral (12 bands) and multiscale (10 m, 20 m, and 60 m) time series were effective in capturing both the phenological patterns of the land covers and the overall management activities.
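The monthly median compositing step can be sketched as follows for a single pixel (synthetic numbers; the band values and the cloud flag are illustrative, not the Oregon data):

```python
import numpy as np

def monthly_median_composite(reflectance, month, clear):
    """reflectance: (n_scenes, n_bands) for one pixel; month: (n_scenes,);
    clear: (n_scenes,) boolean flag from the cloud filter."""
    composite = {}
    for m in np.unique(month[clear]):
        sel = clear & (month == m)
        # per-band median over cloud-free scenes of that month
        composite[int(m)] = np.median(reflectance[sel], axis=0)
    return composite

refl = np.array([[0.10, 0.20], [0.30, 0.40], [0.50, 0.60], [0.90, 0.90]])
month = np.array([6, 6, 6, 7])
clear = np.array([True, True, True, False])  # the July scene is cloudy
composite = monthly_median_composite(refl, month, clear)
```

Because the median is taken per band over cloud-free scenes only, isolated residual clouds or shadows have little influence on the monthly value.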

The Transformer model, primarily recognized for its successes in natural language processing tasks, was adapted to our time series identification scenario, recasting vineyard detection as a binary classification task. Our findings demonstrate that the Transformer model significantly surpasses traditional 1D convolutional neural networks (CNNs) in detecting vineyards across 16 new areas within similar climatic zones, achieving an accuracy of 87.77% and an F1 score of 0.876. In the majority of these new test locations, the accuracy exceeded 92%, except for two areas that experienced significant cloud interference and presented numerous missing values in their time series data. The model proved capable of differentiating between land covers with similar characteristics at various stages of growth throughout the season. Compared with attention-based LSTM and BiLSTM models, it has fewer trainable parameters while achieving similar performance. The model was especially adept at handling temporal variations, elucidating the dynamic changes in vineyard phenology over time. This research underscores the potential of combining advanced machine learning techniques with high-resolution satellite imagery for crop type detection and suggests broader applications in land cover classification tasks. Future research will pay closer attention to the missing-value problem.

How to cite: Zhao, W., Unagaev, A., and Efremova, N.: Vineyard detection from multitemporal Sentinel-2 images with a Transformer model, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-20025, 2024.

EGU24-22153 | PICO | ESSI1.3

Spatial cross-validation of wheat yield estimations using remote sensing and machine learning 

Keltoum Khechba, Mariana Belgiu, Ahmed Laamrani, Qi Dong, Alfred Stein, and Abdelghani Chehbouni

The integration of Machine Learning (ML) with remote sensing data has been successfully used to create detailed agricultural yield maps at both local and global scales. Despite this advancement, a critical issue often overlooked is the presence of spatial autocorrelation in the geospatial data used for training and validating ML models: random cross-validation (CV) methods are usually employed, which fail to account for this aspect. This study aimed to assess wheat yield estimations using both random and spatial CV. In contrast to random CV, where the data are split randomly, spatial CV splits the data based on spatial locations, ensuring that spatially close data points are grouped together, either entirely in the training set or in the test set, but not both. Conducted in Northern Morocco during the 2020-2021 agricultural season, our research uses Sentinel-1 and Sentinel-2 satellite images as input variables, as well as 1329 field data locations, to estimate wheat yield. Three ML models were employed: Random Forest, XGBoost, and Multiple Linear Regression. Spatial CV was employed across varying spatial scales: the province represents a predefined administrative division, while grid2 and grid1 are equally sized spatial blocks of 20x20 km and 10x10 km, respectively. Our findings show that when estimating yield with random CV, all models achieve higher accuracies (R² = 0.58 and RMSE = 840 kg ha-1 for the XGBoost model) compared to the performance reported when using spatial CV. The 10x10 km spatial CV led to the highest R² value, equal to 0.23, and an RMSE value equal to 1140 kg ha-1 for the XGBoost model, followed by the 20x20 km grid-based strategy (R² = 0.11 and RMSE = 1227 kg ha-1 for the XGBoost model). Province-based spatial CV resulted in the lowest accuracy, with an R² value equal to 0.032 and an RMSE value of 1282 kg ha-1. These results confirm that spatial CV is essential in preventing overoptimistic model performance.
The study further highlights the importance of selecting an appropriate CV method to ensure realistic and reliable wheat yield predictions, since accuracies inflated by spatial autocorrelation do not reflect real-world conditions.
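Grid-based spatial CV of the kind compared above can be reproduced with scikit-learn's GroupKFold, assigning every sample to its 10x10 km block so that nearby points never end up on both sides of a split (synthetic coordinates and yields; all parameter values are illustrative):

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import GroupKFold, cross_val_score

rng = np.random.default_rng(42)

# synthetic field points: easting/northing in metres, spatially smooth yield
xy = rng.uniform(0, 100_000, size=(500, 2))
yield_kg = 2000 + 0.01 * xy[:, 0] + rng.normal(scale=300, size=500)

# each point's 10 km x 10 km block becomes its CV group
block = (xy[:, 0] // 10_000) * 100 + xy[:, 1] // 10_000

model = RandomForestRegressor(n_estimators=50, random_state=0)
scores = cross_val_score(model, xy, yield_kg, groups=block,
                         cv=GroupKFold(n_splits=5), scoring="r2")
```

Swapping GroupKFold for an ordinary KFold here reproduces the random-CV setting, and the gap between the two R² estimates is exactly the optimism the abstract quantifies.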

How to cite: Khechba, K., Belgiu, M., Laamrani, A., Dong, Q., Stein, A., and Chehbouni, A.: Spatial cross-validation of wheat yield estimations using remote sensing and machine learning, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-22153, 2024.

The aim of this study, conducted in Tavira, Portugal, is to show the ability to determine depths without relying on in-situ data. To achieve this goal, a model previously trained with depth data and multispectral images from 2018 was used. This model enables depth determination for any period for which multispectral images are available.

For this study, CubeSat images from the PlanetScope constellation, with a spatial resolution of 3.0 m and four spectral bands (blue, green, red, and near-infrared), were used. Corrections due to tidal height were obtained through modelled data provided by the Portuguese Hydrographic Institute for the tide gauge of Faro-Olhão. In-situ depths were obtained through the Digital Elevation Model of Reference (MDER) from the Coastal Monitoring Program of Continental Portugal of the Portuguese Environmental Agency.

The model used to determine depths was previously obtained using the Random Forest (RF) algorithm, trained with a set of reflectances from 15 images acquired between August and October 2018 by the PlanetScope constellation, and a set of depths from the MDER, referring to October 2018.

This RF model allowed depth determination for a set of 7 images from the same constellation, acquired between August and October 2019. The results were corrected for tidal height so that all values refer to the Hydrographic Zero datum. The Savitzky-Golay filter was applied to smooth the results, and the DBSCAN (Density-Based Spatial Clustering of Applications with Noise) algorithm was then applied to eliminate outliers. Finally, the median depth value was determined, resulting in a bathymetric surface morphologically similar to the MDER (2019).
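The smoothing and outlier-removal steps can be sketched with SciPy and scikit-learn for one pixel's depth series (synthetic values; eps, the window length and the spike positions are illustrative, and the exact ordering of steps in the authors' pipeline may differ):

```python
import numpy as np
from scipy.signal import savgol_filter
from sklearn.cluster import DBSCAN

rng = np.random.default_rng(1)

# per-pixel depth estimates (m) from repeated acquisitions, plus two spikes
depths = 5.0 + 0.3 * rng.normal(size=50)
depths[[10, 30]] += 8.0  # simulated outliers

# Savitzky-Golay smoothing of the raw series
smoothed = savgol_filter(depths, window_length=7, polyorder=2)

# DBSCAN on the 1-D depth values: sparse points are labelled -1 (noise)
labels = DBSCAN(eps=0.5, min_samples=5).fit_predict(depths.reshape(-1, 1))
median_depth = np.median(depths[labels != -1])
```

The median of the surviving values is the kind of per-pixel depth that can then be assembled into the final bathymetric surface.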

This final surface was compared with the 2019 MDER through differences between the two surfaces (residuals) and the respective statistics were calculated (mean, median, standard deviation, and histogram). A vertical profile between 0.0 and 10.0 meters of depth was also generated. The statistical results of the differences reveal a median of 0.5 meters, a mean of 0.7 meters, and a standard deviation of 1.3 meters. The histogram of differences between the two surfaces follows a normal distribution, with its center located at the median value, which is offset from zero.

The results obtained in this study are promising for obtaining depths in coastal regions through multispectral images without the need for in-situ data. However, we are aware that improving the current model is important to reduce the median and standard deviation of the differences between the determined depth and the reference. Enhancing the model will lead to more accurate results, enabling the determination of seasonal variations and changes caused by extreme events or climate alterations without in-situ data.

How to cite: Santos, R. and Quartau, R.: Predicting bathymetry in shallow regions using a machine learning model and a time series of PlanetScope images, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-22165, 2024.

EGU24-1396 | Orals | ESSI1.4

Development of an approach based on historical Landsat data for delineating Canadian flood zones at different return periods 

Karem Chokmani, Haythem Zidi, Anas El Alem, and Jasmin Gill-Fortin

The study addresses the need for flood risk anticipation and planning through the development of a flood zone mapping approach for different return periods, in order to best prevent and protect populations. Today, traditional methods are too costly, too slow or have too many requirements to be applied over large areas. As part of a project funded by the Canadian Space Agency, Geosapiens and the Institut National de la Recherche Scientifique set themselves the goal of designing an automatic process to generate water presence maps for different return periods at a resolution of 30 m, based on the historical database of Landsat missions from 1982 to the present day. This involved the design, implementation and training of a deep learning model based on the U-Net architecture for the detection of water pixels in Landsat imagery. The resulting maps were used as the basis for a frequency analysis model fitting a probability-of-occurrence function for the presence of water at each pixel. The frequency analysis data were then used to obtain maps of water occurrence at different return periods, such as 2, 5 and 20 years.
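The per-pixel frequency-analysis step can be illustrated with a simple empirical version (a full analysis would fit a probability distribution per pixel; the water-mask stack below is synthetic):

```python
import numpy as np

rng = np.random.default_rng(7)

# toy stack: 40 annual water masks (True = water) for a 50x50 pixel tile
annual_wet_prob = rng.random((50, 50)) * 0.5
masks = rng.random((40, 50, 50)) < annual_wet_prob

# empirical annual probability of water presence per pixel
p_water = masks.mean(axis=0)

def flood_zone(p, return_period_years):
    """A pixel belongs to the T-year flood zone if it is wet at least
    once every T years on average, i.e. p >= 1/T."""
    return p >= 1.0 / return_period_years

zone_2y = flood_zone(p_water, 2)
zone_20y = flood_zone(p_water, 20)
```

By construction the 2-year zone is nested inside the 20-year zone, matching the intuition that rarer floods cover larger areas.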

How to cite: Chokmani, K., Zidi, H., El Alem, A., and Gill-Fortin, J.: Development of an approach based on historical Landsat data for delineating Canadian flood zones at different return periods, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-1396, 2024.

EGU24-2378 | Orals | ESSI1.4

Applications of GeoAI in Extracting National Value-Added Products from Historical Airborne Photography 

Mozhdeh Shahbazi, Mikhail Sokolov, Ella Mahoro, Victor Alhassan, Evangelos Bousias Alexakis, Pierre Gravel, and Mathieu Turgeon-Pelchat

The Canadian National Air Photo Library (NAPL) comprises millions of historical airborne photographs spanning more than 100 years. Historical photographs are rich chronicles of countrywide geospatial information. They can be used to create long-term time series and support various analytics, such as monitoring the expansion or shrinkage of built areas, measuring forest structure change, measuring the thinning and retreat rates of glaciers, and determining coastline erosion rates. Various technical solutions have been developed at Natural Resources Canada (NRCan) to generate analysis-ready mapping products from the NAPL.

Photogrammetric Processing with a Focus on Automated Georeferencing of Historical Photos: The main technical challenge of photogrammetric processing is identifying reference observations, such as ground control points (GCPs). Reference observations are crucial to accurately georeference historical photos and ensure the spatial alignment of historical and modern mapping products, which is critical for creating time series and performing multi-temporal change analytics. In our workflow, GCPs are identified by automatically matching historical images to modern optical satellite/airborne ortho-rectified images. In the matching process, we first use convolutional neural networks (D2Net) for joint feature detection and description in the intensity space. Then, we convert the intensity images to phase congruency maps, which are less sensitive to nonlinear radiometric differences between the images, and we extract an additional set of features using the FAST detector, describing them with the radiation-variation insensitive feature transform (RIFT). Feature-matching outliers are detected and removed via random sample consensus (RANSAC), enforcing a homographic transformation between corresponding images. The remaining control points are manually verified through a graphical interface built as a QGIS plugin. The verified control points are then used in a bundle block adjustment, where the exterior orientation parameters of the historical images and the intrinsic calibration parameters of the cameras are refined, followed by dense matching and the generation of digital elevation models and ortho-rectified mosaics using conventional photogrammetric approaches. These solutions are implemented using our in-house libraries as well as the MicMac open-source software. During the presentation, examples of the generated products and their quality will be demonstrated.
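The RANSAC outlier-rejection step can be illustrated with a minimal NumPy loop around a direct-linear-transform homography fit (a schematic of the principle only, not the production workflow, which also involves D2Net/RIFT feature matching):

```python
import numpy as np

def fit_homography(src, dst):
    """Direct linear transform: solve the homogeneous system by SVD."""
    rows = []
    for (x, y), (u, v) in zip(src, dst):
        rows.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        rows.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    return np.linalg.svd(np.asarray(rows))[2][-1].reshape(3, 3)

def apply_h(H, pts):
    p = np.c_[pts, np.ones(len(pts))] @ H.T
    return p[:, :2] / p[:, 2:3]

def ransac_homography(src, dst, n_iter=500, tol=2.0, seed=0):
    """Keep the 4-point model with the largest consensus set, then refit."""
    rng = np.random.default_rng(seed)
    best = np.zeros(len(src), dtype=bool)
    for _ in range(n_iter):
        idx = rng.choice(len(src), size=4, replace=False)
        H = fit_homography(src[idx], dst[idx])
        inliers = np.linalg.norm(apply_h(H, src) - dst, axis=1) < tol
        if inliers.sum() > best.sum():
            best = inliers
    return fit_homography(src[best], dst[best]), best

# synthetic correspondences: a known transform plus five gross mismatches
rng = np.random.default_rng(3)
src = rng.uniform(0, 100, size=(30, 2))
H_true = np.array([[0.9, 0.0, 5.0], [0.0, 0.9, -3.0], [0.0, 0.0, 1.0]])
dst = apply_h(H_true, src)
dst[:5] += 40.0  # outlier matches
H_est, inliers = ransac_homography(src, dst)
```

The surviving matches play the role of the automatically identified GCP candidates that are subsequently verified manually in the QGIS plugin.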

Deep Colourization, Super Resolution and Semantic Segmentation: Since the NAPL mostly contains grayscale photos, their visual appeal and interpretability are lower than those of modern colour images. In addition, the automated extraction of colour-sensitive features, e.g. water bodies, is more complicated than from colour images. In this regard, we have developed fully automated approaches to colourize historical ortho-rectified mosaics based on image-to-image translation models. During the presentation, the performance of a variety of solutions, such as conditional generative adversarial networks (GANs), encoder-decoder networks, vision transformers, and probabilistic diffusion models, will be compared. In addition, using a customized GAN, we improve the spatial resolution of historical images that were scanned from printed photos at low resolution (as opposed to being scanned directly from film rolls at high resolution). Our semantic segmentation models, trained initially on optical satellite and airborne imagery, are also adapted to historical air photos for extracting water bodies, road networks, building outlines, and forested areas. The performance of these models on historical photos will be demonstrated during the presentation.

How to cite: Shahbazi, M., Sokolov, M., Mahoro, E., Alhassan, V., Bousias Alexakis, E., Gravel, P., and Turgeon-Pelchat, M.: Applications of GeoAI in Extracting National Value-Added Products from Historical Airborne Photography, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-2378, 2024.

The National Institute of Geographic and Forest Information (IGN) has developed Artificial Intelligence (AI) models that describe land cover at the pixel level from IGN aerial images. This is part of the production process for the Large-Scale Land Cover and Land Use Reference (OCS GE). This contribution is threefold:

Methodology: the training strategy and the use of these models will be reviewed by focusing on i) the selection of the task performed by the models, ii) the approach for choosing and producing learning samples and iii) the training strategy to generalize to the scale of Metropolitan France. The evaluation of the models using various metrics will also be discussed. Visuals will be provided to illustrate the quality of the results. Furthermore, we will explain how AI products are incorporated into the production of the OCS GE.

Continuous improvement: the models are continuously improved, particularly through FLAIR (French Land cover from Aerospace ImageRy) challenges aimed at the scientific community. The FLAIR#1 and FLAIR#2 challenges dealt with model generalization and domain adaptation as well as data fusion, i.e., how to develop an AI model that can take as input both very high spatial resolution images (e.g., IGN aerial acquisitions) and satellite image time series (Sentinel-2 images). We will review both the implementation of the challenges and the results obtained, leveraging convolutional and attention-based models, ensembling methods and pseudo-labelling. As the AI model for land cover goes far beyond the context of OCS GE production, additional experiments outside of the challenges will be discussed, allowing the development of additional AI models to process other modalities (very high spatial resolution satellite images, historical images, etc.).

Open access: all source code and data, including AI land cover prediction maps, are openly distributed. These resources are distributed via the challenges and as products (CoSIA: Land Cover by Artificial Intelligence) through a dedicated platform, which is of interest to AI users and non-specialists, including users from the geoscience and remote sensing community.

How to cite: Garioud, A.: Artificial intelligence for country-scale land cover description, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-6477, 2024.

EGU24-7635 | Posters on site | ESSI1.4

iMagine, AI-supported imaging data and services for ocean and marine science 

Ilaria Fava, Alvaro Lopez Garcia, Dick Schaap, Tjerk Krijer, Gergely Sipos, and Valentin Kozlov

Aquatic ecosystems are vital in regulating climate and providing resources, but they face threats from global change and local stressors. Understanding their dynamics is crucial for sustainable use and conservation. The iMagine AI Platform offers a suite of AI-powered image analysis tools for researchers in aquatic sciences, facilitating a better understanding of scientific phenomena and the application of AI and ML to image data processing.

The platform supports the entire machine learning cycle, from model development to deployment, leveraging data from underwater platforms, webcams, microscopes, drones, and satellites, and utilising distributed resources across Europe. With a serverless architecture and DevOps approach, it enables easy sharing and deployment of AI models. Four providers within the pan-European EGI federation power the platform, offering substantial computational resources for image processing.

Five use cases focus on image analytics services, which will be available to external researchers through Virtual Access. Additionally, three new use cases are developing AI-based image processing services, and two external use cases are kickstarting through recent Open Calls. The iMagine Competence Centre aids use case teams in model development and deployment, resulting in various models hosted on the iMagine AI Platform, including third-party models like YOLOv8.

Operational best practices derived from the platform providers and use case developers cover data management, quality control, integration, and FAIRness. These best practices aim to harmonise approaches across Research Infrastructures and will be disseminated through various channels, benefitting the broader European and international scientific communities.

How to cite: Fava, I., Lopez Garcia, A., Schaap, D., Krijer, T., Sipos, G., and Kozlov, V.: iMagine, AI-supported imaging data and services for ocean and marine science, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-7635, 2024.

EGU24-8420 | Orals | ESSI1.4

Bayesian model averaging of AI models for the high resolution mapping of the forest canopy height 

Nikola Besic, Nicolas Picard, Cédric Vega, Jean-Pierre Renaud, Martin Schwartz, Milena Planells, and Philippe Ciais

The development of high-resolution mapping models of forest attributes employing machine or deep learning techniques has accelerated in the last couple of years. The consequence is the widespread availability of multiple sources of information, which can either lead to confusion or offer an "extended" insight into the state of our forests when these sources are interpreted jointly. This contribution aims at the latter, relying on the Bayesian model averaging (BMA) approach.

BMA is a method for building a consensus from an ensemble of different model predictions. It can be seen as a weighted mean of the different predictions, with weights reflecting the predictive performance of each model, or as a finite mixture model that estimates the probability that each observation in an independent validation dataset was generated by one of the models in the ensemble. BMA can thus be used to diagnose and understand differences in the predictions and, possibly, to interpret them.
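The finite-mixture reading of BMA can be made concrete with a small EM loop, assuming Gaussian component densities (synthetic predictions, not the five canopy-height products):

```python
import numpy as np

def bma_em(preds, y, n_iter=100):
    """preds: (K, N) model predictions; y: (N,) validation observations.
    Returns BMA weights and per-model error variances."""
    K, N = preds.shape
    w = np.full(K, 1.0 / K)
    sigma2 = np.full(K, np.var(y - preds))  # shared initial variance
    for _ in range(n_iter):
        # E-step: responsibility of model k for observation i
        dens = np.exp(-0.5 * (y - preds) ** 2 / sigma2[:, None])
        dens /= np.sqrt(2 * np.pi * sigma2[:, None])
        z = w[:, None] * dens
        z /= z.sum(axis=0, keepdims=True)
        # M-step: update weights and variances
        w = z.mean(axis=1)
        sigma2 = (z * (y - preds) ** 2).sum(axis=1) / z.sum(axis=1)
    return w, sigma2

rng = np.random.default_rng(2)
y = rng.normal(size=300)
preds = np.stack([y + 0.1 * rng.normal(size=300),   # accurate model
                  y + 2.0 * rng.normal(size=300)])  # noisy model
w, sigma2 = bma_em(preds, y)
```

Here w_k estimates the probability that a validation observation was generated by model k; evaluating the responsibilities z observation by observation yields the per-plot "dominant model" diagnostic described below.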

The predictions in our case are forest canopy height estimates for metropolitan France from 5 different AI models [1-5], while the independent validation dataset comes from the French National Forest Inventory (NFI), comprising some 6000 plots per year distributed across the territory of interest. For every plot we have several measurements/estimations of the forest canopy height, of which the following two are considered in this study: h_m, the maximum total height (from the tree's base to the terminal bud of its main stem) measured within the plot, and h_dom, the average height of the seven largest dominant trees per hectare.

In this contribution we present, for every considered plot, the dominant model with respect to both references, i.e. the model with the highest probability of having generated the NFI measurement/estimation (h_m and h_dom). We also present the respective inter-model and intra-model variance estimates, allowing us to propose a series of hypotheses about the differences between the predictions of the individual models as a function of their specificities.

[1] Schwartz, M., et al.: FORMS: Forest Multiple Source height, wood volume, and biomass maps in France at 10 to 30 m resolution based on Sentinel-1, Sentinel-2, and Global Ecosystem Dynamics Investigation (GEDI) data with a deep learning approach, Earth Syst. Sci. Data, 15, 4927–4945, 2023.

[2] Lang, N., et al.: A high-resolution canopy height model of the Earth, Nat Ecol Evol 7, 1778–1789, 2023.

[3] Morin, D. et al.: Improving Heterogeneous Forest Height Maps by Integrating GEDI-Based Forest Height Information in a Multi-Sensor Mapping Process, Remote Sens., 14, 2079, 2022.

[4] Potapov, P., et al.: Mapping global forest canopy height through integration of GEDI and Landsat data, Remote Sensing of Environment, 253, 2021.

[5] Liu, S. et al.: The overlooked contribution of trees outside forests to tree cover and woody biomass across Europe, Sci. Adv., 9, eadh4097, 2023, doi:10.1126/sciadv.adh4097.

How to cite: Besic, N., Picard, N., Vega, C., Renaud, J.-P., Schwartz, M., Planells, M., and Ciais, P.: Bayesian model averaging of AI models for the high resolution mapping of the forest canopy height, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-8420, 2024.

EGU24-9729 | ECS | Orals | ESSI1.4

Urban 3D Change Detection with Deep Learning: Custom Data Augmentation Techniques 

Riccardo Contu, Valerio Marsocci, Virginia Coletta, Roberta Ravanelli, and Simone Scardapane

The ability to detect changes occurring on the Earth's surface is essential for comprehensively monitoring and understanding evolving landscapes and environments.

To achieve a comprehensive understanding, it is imperative to employ methodologies capable of efficiently capturing and analyzing both two-dimensional (2D) and three-dimensional (3D) changes across various periods.

Artificial Intelligence (AI) stands out as a primary resource for investigating these alterations and, when combined with remote sensing (RS) data, has demonstrated superior performance compared to conventional Change Detection (CD) algorithms.

The recent introduction of the MultiTask Bitemporal Images Transformer [1] (MTBIT) network has made it possible to simultaneously solve 2D and 3D CD tasks leveraging bi-temporal optical images.

However, this network presents certain limitations that need to be considered: a tendency to overfit the training distribution and difficulties in inferring extreme values [1]. To address these shortcomings, this work introduces a series of custom augmentations, including Random Crop, Crop or Resize, Mix up, Gaussian Noise on the 3D CD maps, and Radiometric Transformation. Applied individually or in specific combinations, these augmentations aim to bolster MTBIT's ability to discern intricate geometries and subtle structures that are otherwise difficult to detect.

Furthermore, the evaluation metrics commonly used to assess MTBIT, such as the Root Mean Squared Error (RMSE) and the change RMSE (cRMSE), have their limitations. In response, we introduce the true positive RMSE (tpRMSE), which offers a more targeted evaluation of MTBIT's efficacy in the 3D CD task by considering only the pixels affected by actual elevation changes.
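As described, tpRMSE restricts the RMSE to pixels with a genuine elevation change, so the many unchanged pixels do not dilute the score. A minimal NumPy sketch of such a metric (the function name and tolerance parameter are illustrative, not the authors' implementation):

```python
import numpy as np

def tp_rmse(pred, target, tol=0.0):
    """RMSE restricted to pixels with an actual elevation change.

    Pixels where |target| <= tol are treated as unchanged and excluded,
    so the score reflects performance on true elevation changes only.
    """
    changed = np.abs(target) > tol            # mask of truly changed pixels
    if not changed.any():
        return 0.0
    diff = pred[changed] - target[changed]
    return float(np.sqrt(np.mean(diff ** 2)))
```

For a map that is mostly zeros, the plain RMSE averages the error over all pixels and can look deceptively small, while tpRMSE reports the error over the changed pixels alone.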

The implementation of custom augmentations, particularly when applied in synergy (e.g., Crop or Resize combined with Gaussian noise on the 3D map), yielded substantial improvements. With the best augmentation configuration, these interventions reduced the cRMSE to 5.88 meters and the tpRMSE to 5.34 meters, compared to the baseline (standard MTBIT) values of 6.33 meters and 5.60 meters, respectively.

The proposed augmentations significantly bolster the practical usability and reliability of MTBIT in real-world applications, effectively addressing critical challenges within the realm of Remote Sensing CD. 



  • [1] Marsocci, V., Coletta, V., Ravanelli, R., Scardapane, S., and Crespi, M., 2023. Inferring 3D change detection from bitemporal optical images. ISPRS Journal of Photogrammetry and Remote Sensing, 196, 325–339.

How to cite: Contu, R., Marsocci, V., Coletta, V., Ravanelli, R., and Scardapane, S.: Urban 3D Change Detection with Deep Learning: Custom Data Augmentation Techniques, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-9729, 2024.

EGU24-10034 | Orals | ESSI1.4

Operationalize large-scale point cloud classification: potentials and challenges 

Onur Can Bayrak, Ma Zhenyu, Elisa Mariarosaria Farella, and Fabio Remondino

Urban and natural landscapes are distinguished by different built and vegetated elements with unique features, and their proper identification is crucial for many applications, from urban planning to forestry inventory or natural resources management. With the rapid evolution and deployment of high-resolution airborne and Unmanned Aerial Vehicle (UAV) technologies, large areas can be easily surveyed to create high-density point clouds. Photogrammetric cameras and LiDAR sensors can offer unprecedented high-quality 3D data (a few cm on the ground) that allows for discriminating and mapping even small objects. However, the semantic enrichment of these 3D data is still far from being a fully reliable, accurate, unsupervised, explainable and generalizable process deployable at large scale, on data acquired with any sensor, and at any possible spatial resolution.

This work reports the state-of-the-art and recent developments in urban and natural point cloud classification, with a particular focus on the:

  • Standardization in defining the semantic classes through a multi-resolution and multi-scale approach: a multi-level concept is introduced to improve and optimize the learning process, using a hierarchy to accommodate a large number of classes.
  • Instance segmentation in very dense areas: closely located and overlapping individual objects require precise segmentation to be accurately identified and classified. We are developing a hierarchical segmentation method specifically designed for urban furniture with small samples to enhance the comprehensiveness of dense urban areas.
  • Generalization of the procedures and transferability of developed models from a fully-labelled domain to an unseen scenario.
  • Handling of under-represented objects (e.g., pole-like objects, pedestrians, and other urban furniture): classifying under-represented objects presents a unique set of challenges due to their sparse occurrence and similar geometric characteristics. We introduce a new method that specifically targets the effective identification and extraction of these objects in combination with knowledge-based methods and deep learning.
  • Available datasets and benchmarks to evaluate and compare learning-based methods and algorithms in 3D semantic segmentation: urban-level aerial 3D point cloud datasets can be classified according to the presence of color information, the number of classes, or the type of sensor used for data gathering. The ISPRS - Vaihingen, DublinCity, DALES, LASDU and CENAGIS-ALS datasets, although extensive in size, do not provide color-related information. Conversely, Campus3D, Swiss3DCities, and Hessigheim3D include color data but feature limited coverage and a few class labels. SensatUrban, STPLS3D, and HRHD-HK were collected across extensive urban regions, but they also present a reduced number of classes. YTU3D surpasses other datasets in terms of class diversity, but it encompasses less extensive areas than SensatUrban, STPLS3D, and HRHD-HK. Despite these differences, the common deficiency in all datasets is the presence of classes with under-represented objects, the limited generalization, and the low accuracy in classifying unbalanced categories, making using these models difficult for real-life scenarios.

The presentation will highlight the importance of semantic enrichment processes in the geospatial and mapping domain and for providing more understandable data to end-users and policy-makers. Available learning-based methods, open issues in point cloud classification and recent progress will be explored over urban and forestry scenarios.

How to cite: Bayrak, O. C., Zhenyu, M., Farella, E. M., and Remondino, F.: Operationalize large-scale point cloud classification: potentials and challenges, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-10034, 2024.

EGU24-11571 | Posters on site | ESSI1.4

Simple temporal domain adaptation techniques for mapping the inter-annual dynamics of smallholder-dominated croplands over large extents 

Lyndon Estes, Sam Khallaghi, Rahebe Abedi, Mary Asipunu, Nguyen Ha, Boka Luo, Cat Mai, Amos Wussah, Sitian Xiong, and Yao-Ting Yao

Tracking how agricultural systems are changing is critical to answering important questions related to socioeconomic (e.g. food security) and environmental sustainability (e.g. carbon emissions), particularly in rapidly changing regions such as Africa. Monitoring agricultural dynamics requires satellite-based approaches that can accurately map individual fields at frequent (e.g. annual) intervals over national to regional extents, yet mapping Africa's smallholder-dominated agricultural systems is difficult, as the small and indistinct nature of fields promotes mapping error, while frequent cloud cover leads to coverage gaps. Fortunately, the increasing availability of high spatio-temporal resolution imagery and the growing capabilities of deep learning models now make it possible to accurately map crop fields over large extents. However, the ability to make consistently reliable maps for more than one time point remains difficult, given the substantial domain shift between images collected in different seasons or years, which arises from variations in atmospheric and land surface conditions, and results in less accurate maps for times beyond those for which the model was trained. To cope with this domain shift, a model's parameters can be adjusted through fine-tuning on training data from the target time period, but collecting such data typically requires manual annotation of images, which is expensive and often impractical. Alternatively, the approach used to develop the model can be adjusted to improve its overall generalizability. Here we show how combining several fairly standard architectural and input techniques, including careful selection of the image normalization method, increasing the model's width, adding regularization techniques, using modern optimizers, and choosing an appropriate loss function, can significantly enhance the ability of a convolutional neural network to generalize across time, while eliminating the need to collect additional labels. 
A key component of this approach is the use of Monte Carlo dropout, a regularization technique applied during inference that provides a measure of model uncertainty while producing more robust predictions. We demonstrate this procedure by training an adapted U-Net, a widely used encoder-decoder architecture, with a relatively small number of labels (~5,000 224×224 image chips) collected from 3 countries on 3.7 m PlanetScope composite imagery collected primarily in 2018, and use the model, without fine-tuning, to make reliable maps of Ghana's (240,000 km²) annual croplands for the years 2018–2023 on 4.8 m Planet basemap mosaics. We further show how this approach helps to track agricultural dynamics by providing a country-wide overview of cropping frequency, while highlighting hotspots of cropland expansion and intensification during the 6-year time period (2018–2023).
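The Monte Carlo dropout procedure described above — dropout kept active at inference, with repeated stochastic forward passes — can be illustrated with a toy two-layer network in NumPy. The network, weights, and function names below are hypothetical stand-ins for the adapted U-Net, showing only the mechanism:

```python
import numpy as np

def mc_dropout_predict(x, w1, w2, p=0.5, n_samples=50, rng=None):
    """Monte Carlo dropout for a toy 2-layer network.

    Dropout stays active at inference; repeating the stochastic forward
    pass yields a predictive mean and a per-output uncertainty (std).
    """
    rng = rng or np.random.default_rng()
    preds = []
    for _ in range(n_samples):
        h = np.maximum(x @ w1, 0.0)          # ReLU hidden layer
        mask = rng.random(h.shape) >= p      # fresh random dropout mask
        h = h * mask / (1.0 - p)             # inverted-dropout scaling
        preds.append(h @ w2)
    preds = np.stack(preds)
    return preds.mean(axis=0), preds.std(axis=0)
```

The mean over samples serves as the prediction, while the standard deviation flags pixels (here, outputs) where the model is uncertain — useful when mapping years the model was never fine-tuned on.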

How to cite: Estes, L., Khallaghi, S., Abedi, R., Asipunu, M., Ha, N., Luo, B., Mai, C., Wussah, A., Xiong, S., and Yao, Y.-T.: Simple temporal domain adaptation techniques for mapping the inter-annual dynamics of smallholder-dominated croplands over large extents, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-11571, 2024.

EGU24-12472 | ECS | Posters on site | ESSI1.4

Building hierarchical use classification based on multiple data sources with a multi-label multimodal transformer network 

Wen Zhou, Claudio Persello, and Alfred Stein

Effective urban planning, city digital twins, and informed policy formulation rely heavily on precise building use information. While existing research often focuses on broad categories of building use, there is a noticeable gap in the classification of buildings’ detailed use. This study addresses this gap by concurrently extracting both broad and detailed hierarchical information regarding building use. Our approach involves leveraging multiple data sources, including high spatial resolution remote sensing images (RS), digital surface models (DSM), street view images (SVI), and textual information from point of interest (POI) data. Given the complexity of mixed-use buildings, where different functions coexist, we treat building hierarchical use classification as a multi-label task, determining the presence of specific categories within a building. To maximize the utility of features across diverse modalities and their interrelationships, we introduce a novel multi-label multimodal Transformer-based feature fusion network. This network can simultaneously predict four broad categories and thirteen detailed categories, representing the first instance of utilizing these four modalities for building use classification. Experimental results demonstrate the effectiveness of our model, achieving a weighted average F1 score (WAF) of 91% for broad categories, 77% for detailed categories, and 84% for hierarchical categories. The macro average F1 scores (MAF) are 81%, 48%, and 56%, respectively. Ablation experiments highlight RS data as the cornerstone for hierarchical building use classification. DSM and POI provide slight supplementary information, while SVI data may introduce more noise than effective information. Our analysis of hierarchy consistency, supplementary, and exclusiveness between broad and detailed categories shows our model can effectively learn these relations. 
We compared two ways to obtain broad categories: classifying them directly and scaling up detailed categories, associating them with their broad counterparts. Experiments show that the WAF and MAF of the former are 3.8% and 6% higher than the latter. Notably, our research visualizes attention models for different modalities, revealing the synergy among them. Despite the model’s emphasis on SVI and POI data, the critical role of RS and DSM in building hierarchical use classification is underscored. By considering hierarchical use categories and accommodating mixed-use scenarios, our method provides more accurate and comprehensive insights into land use patterns.
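The weighted (WAF) and macro (MAF) average F1 scores reported above differ only in how per-class scores are aggregated: support-weighted versus unweighted. A minimal NumPy sketch for multi-label indicator matrices (function names are our own, not from the study):

```python
import numpy as np

def f1_scores(y_true, y_pred):
    """Per-class F1 for binary indicator matrices of shape (n_samples, n_classes)."""
    tp = np.sum((y_true == 1) & (y_pred == 1), axis=0)
    fp = np.sum((y_true == 0) & (y_pred == 1), axis=0)
    fn = np.sum((y_true == 1) & (y_pred == 0), axis=0)
    denom = 2 * tp + fp + fn
    return np.where(denom > 0, 2 * tp / np.maximum(denom, 1), 0.0)

def waf_maf(y_true, y_pred):
    """Weighted-average F1 (support-weighted) and macro-average F1."""
    f1 = f1_scores(y_true, y_pred)
    support = y_true.sum(axis=0)             # positives per class
    waf = float(np.sum(f1 * support) / support.sum())
    maf = float(f1.mean())
    return waf, maf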

How to cite: Zhou, W., Persello, C., and Stein, A.: Building hierarchical use classification based on multiple data sources with a multi-label multimodal transformer network, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-12472, 2024.

EGU24-12745 | ECS | Posters on site | ESSI1.4

Mapping the extent and land use intensity of shifting cultivation with Planet Scope imagery and deep learning in the Democratic Republic of Congo  

Wanting Yang, Xiaoye Tong, Sizhuo Li, Daniel Ortiz Gonzalo, and Rasmus Fensholt

Shifting cultivation, in which primary or secondary forest plots are converted into agriculture for one to two years and then left fallow, is often deemed responsible for tropical deforestation. However, the general attribution of deforestation to areas under shifting cultivation is debatable when also considering forest regrowth during the fallow phase, which is an essential part of a mature shifting cultivation system. Yet, little is known about the extent of small cropped fields and fallow stages, information needed to characterize the temporal development between cropped fields and fallows in shifting cultivation landscapes.

The primary objective of our study is to develop a deep learning-based framework to quantify land use intensity in tropical forest nations such as the Democratic Republic of Congo (DRC), using 4.7-m multi-temporal Planet Basemaps from 2015 to 2023. Employing a convolutional neural network image classification model, we first identified the shifting cultivation landscapes. Secondly, utilizing two-phase imagery, we examined the temporal development of shifting cultivation, determining whether a landscape continues to be characterized by this practice. Thirdly, the shifting cultivation landscapes were segmented into cropped fields, young fallow, old fallow, and old-growth/primary forest. Lastly, we used a deep learning regression model to quantify the intensity of shifting cultivation within the identified areas. This last step adds depth to our analysis by offering nuanced insights into the varying practices associated with shifting cultivation. Our study in the DRC offers a detailed spatio-temporal dataset of shifting cultivation dynamics, serving as a stepping stone to better understanding its impacts on forest loss.

How to cite: Yang, W., Tong, X., Li, S., Ortiz Gonzalo, D., and Fensholt, R.: Mapping the extent and land use intensity of shifting cultivation with Planet Scope imagery and deep learning in the Democratic Republic of Congo, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-12745, 2024.

EGU24-13166 | ECS | Orals | ESSI1.4

Remote sensing techniques for habitat condition mapping: deadwood monitoring using airborne laser scanning data 

Agata Walicka, Jesper Bladt, and Jesper Erenskjold Moeslund

Deadwood is a vital part of the habitat for many threatened species of animals, plants and fungi. Thus, the presence of deadwood is an important indicator of the probability that a given site harbors threatened species. Nowadays, field work is the most common method for monitoring dead trees. However, it is time-consuming, costly and labor-intensive, so there is a need for an automatic method for mapping and monitoring deadwood. The combination of fine-resolution remote sensing and deep learning techniques has the potential to provide exactly this. Unfortunately, because lying deadwood is typically located under the canopy, this is a challenging task: the visibility of lying trees is limited, notably for optical remote sensing techniques. Laser scanning therefore seems the most appropriate data source, as it can penetrate the canopy to some extent and hence gather data from the forest floor.

In this work, we aim to develop methods for detecting lying deadwood at the national scale in protected forests, focusing on its presence within 15-meter-radius circular plots. To achieve this goal, we use Airborne Laser Scanning (ALS) data that is publicly available for the whole of Denmark and, as a reference, almost 6,000 forestry plots acquired as part of the Danish national habitat monitoring program. The binary classification into plots that contain deadwood and those that do not is performed using a SparseCNN deep neural network. We show that it is possible to detect plots containing deadwood with an overall accuracy of around 61%, although the accuracy of the classifier depends on the volume of deadwood present in a plot.

How to cite: Walicka, A., Bladt, J., and Moeslund, J. E.: Remote sensing techniques for habitat condition mapping: deadwood monitoring using airborne laser scanning data, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-13166, 2024.

EGU24-15347 | ECS | Posters on site | ESSI1.4

Estimating Crop Phenology from Satellite Data using Machine Learning 

Shahabaldin Shojaeezadeh, Abdelrazek Elnashar, and Tobias Karl David Weber

Monitoring crop growth and development is important for agricultural management and for policy interventions enhancing food security worldwide. Traditional methods of examining crop phenology (the timing of growth stages in plants) at large scales are often not sufficiently accurate to support informed decisions about crops. In this study, we propose an approach that uses satellite data fusion and a Machine Learning (ML) modeling framework to predict the phenology of eight major crops at field scale (30 m) across all of Germany. The observed phenology is based on a citizen-science dataset of phenological observations covering the whole country. By fusing optical data from Landsat and Sentinel-2 with radar data from Sentinel-1, our method effectively captures information from each publicly available remote sensing source, resulting in precise estimates of phenological timing. The fusion analysis indicates that combining optical and radar images improves the ML model's ability to predict phenology, with R² > 0.95 and a mean absolute error of less than 2 days for all crops. Further analysis of uncertainties confirmed that adding radar data to optical images improves the reliability of satellite-based predictions of crop phenology. These improvements are expected to be useful for crop model calibration, to facilitate informed agricultural decisions, and to contribute to sustainable food production in the face of increasing global food demand.

How to cite: Shojaeezadeh, S., Elnashar, A., and Weber, T. K. D.: Estimating Crop Phenology from Satellite Data using Machine Learning, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-15347, 2024.

EGU24-15565 | Orals | ESSI1.4

Identification of asbestos roofing from hyperspectral images 

Elena Viero, Donatella Gubiani, Massimiliano Basso, Marco Marin, and Giovanni Sgrazzutti

Regulations phasing out the use of asbestos were introduced in Italy with Law no. 257 of 1992, and their implementation took place over time. The Regional Asbestos Plan was put in place in 1996 and is updated periodically.

Modern remote sensing techniques are an essential tool for studies at environmental and territorial scales. These systems can record, for each pixel of the acquired image, from tens to hundreds of bands of the electromagnetic spectrum. This is useful because every material has its own characteristic spectral signature, which can be exploited for different types of investigation.

The work involved experimenting with a neural network for the classification of airborne hyperspectral images, in order to identify and map the asbestos-cement roofing existing in some Municipalities of the Autonomous Region of Friuli Venezia Giulia.

The Region covers an area of approximately 8,000 square kilometres. To survey the entire area, flights had to be carried out in different directions, on different days, and under different solar exposure conditions, so the radiometric quality of the images is not uniform. Moreover, the images have high geometric resolution (1-meter pixels) and radiometric resolution (over 180 bands), which required particular attention in their management: more than 4,000 images, for a total size of 25–30 TB.

Starting from these hyperspectral images and using the information already available relating to the mapping of the asbestos roofs of 25 Municipalities of the Region, we generated an adequate ground truth to train, test and validate a neural network implemented using the Keras library.

Given the differences among the territories of the various Municipalities, in the first processing step we computed 3 different models trained on different datasets for each considered Municipality: a total and a partial one, both independent of the considered Municipality, and a third adapted to the specific Municipality. Combining these predictions allowed us to obtain a raster result expected to better fit the characteristics of the considered Municipality.

Once these data were obtained, the raster results were converted into vector data through a zonal analysis on the buildings available in the Regional Numerical Map. An initial automatic classification, determined through the definition of adequate thresholds, was then manually refined using additional tools, such as Google Street View and the 10 cm regional orthophoto, to obtain a final refined classification.

The results obtained for the 5 pilot Municipalities give a clear indication of the presence of asbestos material on some building roofs. This work demonstrated an operational workflow using data at a regional scale that could easily be extended to other territorial entities. It has the great advantage of allowing the government authority to save at least an order of magnitude in costs with respect to traditional investigations. Finally, the automation of the neural network represents a useful tool for the programming, planning and management of the territory, also in terms of human health.

How to cite: Viero, E., Gubiani, D., Basso, M., Marin, M., and Sgrazzutti, G.: Identification of asbestos roofing from hyperspectral images, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-15565, 2024.

EGU24-18546 | Posters on site | ESSI1.4

A novel index for forest structure complexity mapping from single multispectral images 

Xin Xu, Xiaowei Tong, Martin Brandt, Yuemin Yue, Maurice Mugabowindekwe, Sizhuo Li, and Rasmus Fensholt

Because forests provide essential ecosystem goods and services, their monitoring has attracted considerable attention within the academic community. However, the majority of remote sensing studies covering large areas focus primarily on tree cover, due to resolution limitations; it is therefore necessary to integrate innovative spatial methods and tools into the monitoring of forest ecosystems. Forest structure complexity, representing the spatial heterogeneity within forest structures, plays a pivotal role in influencing ecosystem processes and functions. In this study, we use multi-spectral remote sensing imagery to extract individual tree crowns through deep learning. Subsequently, we analyze the relationship between each tree and its neighboring trees, and explore structural characteristics at the tree level. Finally, we develop a canopy structural complexity index and apply it to Nordic forests, urban areas, savanna, rainforest, and the highly complex tree plantations and natural forests of the China Karst. This study aims to deepen understanding of forest structure complexity in diverse ecosystems and to provide valuable information for sustainable forestry management and ecosystem conservation. The method developed in this study eliminates the need for additional field measurements and radar data, offering a robust tool for extensive and efficient monitoring of forest structure complexity, with broad application prospects.

How to cite: Xu, X., Tong, X., Brandt, M., Yue, Y., Mugabowindekwe, M., Li, S., and Fensholt, R.: A novel index for forest structure complexity mapping from single multispectral images, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-18546, 2024.

EGU24-18679 | ECS | Posters on site | ESSI1.4

Large-scale satellite mapping unveils uneven wetland restoration needs across Europe 

Gyula Mate Kovács, Xiaoye Tong, Dimitri Gominski, Stefan Oehmcke, Stéphanie Horion, and Rasmus Fensholt

Wetlands are crucial carbon sinks for climate change mitigation, yet historical land use changes have resulted in carbon losses and increased CO2 emissions. To combat this, the European Union aims to restore 30% of degraded wetlands in Europe by 2030. However, comprehensive continental-scale inventories are essential for prioritizing restoration and assessing high carbon stock wetlands, revealing the inadequacy of existing datasets. Leveraging 10-meter satellite data and machine learning, our study achieved 94±0.5% accuracy in mapping six wetland types across Europe in 2018. Our analysis identifies that over 40% of European wetlands experience anthropogenic disturbances, with 32.7% classified as highly disturbed due to urban and agricultural activities. Country-level assessments highlight an uneven distribution of restoration needs, emphasizing the urgent importance of data-informed approaches for meaningful restoration. This study underscores the critical need to address land use impact to preserve and enhance wetland carbon storage capabilities.

How to cite: Kovács, G. M., Tong, X., Gominski, D., Oehmcke, S., Horion, S., and Fensholt, R.: Large-scale satellite mapping unveils uneven wetland restoration needs across Europe, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-18679, 2024.

EGU24-19198 | ECS | Posters on site | ESSI1.4

Learning-Based Hyperspectral Image Compression Using A Spatio-Spectral Approach 

Niklas Sprengel, Martin Hermann Paul Fuchs, and Begüm Demir

Advances in hyperspectral imaging have led to a significant increase in the volume of hyperspectral image archives. Therefore, the development of efficient and effective hyperspectral image compression methods is an important research topic in remote sensing. Recent studies show that learning-based compression methods are able to preserve the reconstruction quality of images at lower bitrates compared to traditional methods [1]. Existing learning-based image compression methods usually employ spatial compression per image band or for all bands jointly. However, hyperspectral images contain a high amount of spectral correlations, which necessitates more complex compression architectures that can reduce both spatial and spectral correlations for a more efficient compression. To address this problem, we propose a novel Spatio-Spectral Compression Network (S2C-Net).

S2C-Net is a flexible architecture for hyperspectral image compression, exploiting both spatial and spectral dependencies of hyperspectral images. It combines different spectral and spatial autoencoders into a joint model. To this end, a learning-based pixel-wise spectral autoencoder is initially pre-trained. Then, a spatial autoencoder network is added into the bottleneck of the spectral autoencoder for further compression of the spatial correlations. This is done by applying the spatial autoencoder to the output of the spectral encoder and then applying the spectral decoder to the output of the spatial autoencoder. The model is then trained using a novel mixed loss function that combines the losses of the spectral and the spatial model. Since the spatial model is applied on the output of the spectral encoder, spatial compression methods that are optimised for 2D image compression can be used in S2C-Net in the context of hyperspectral image compression.

In the experiments, we have evaluated S2C-Net on HySpecNet-11k, a large-scale hyperspectral image dataset [2]. Experimental results show that S2C-Net outperforms both spectral and spatial state-of-the-art compression methods for bitrates lower than 1 bit per pixel per channel (bpppc). Specifically, it can achieve lower distortion for similar compression rates and offers the possibility to reach much higher compression rates with only slightly reduced reconstruction quality.
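The compression order described above (spectral encoder, spatial autoencoder in the bottleneck, spectral decoder last) can be illustrated with a toy NumPy pipeline in which linear maps and average pooling stand in for the learned autoencoders. This is a shape-level sketch under our own naming, not the S2C-Net architecture itself:

```python
import numpy as np

def spectral_encode(cube, w_enc):
    # cube: (H, W, bands) -> (H, W, k) pixel-wise spectral bottleneck
    return cube @ w_enc

def spatial_compress(feat):
    # 2x2 average pooling over the spatial dims of the spectral bottleneck
    h, w, k = feat.shape
    return feat.reshape(h // 2, 2, w // 2, 2, k).mean(axis=(1, 3))

def spatial_decompress(feat):
    # nearest-neighbour upsampling back to the original spatial size
    return feat.repeat(2, axis=0).repeat(2, axis=1)

def spectral_decode(feat, w_dec):
    # map the k-dimensional bottleneck back to the original bands
    return feat @ w_dec

def s2c_pipeline(cube, w_enc, w_dec):
    z = spectral_encode(cube, w_enc)      # reduce spectral correlations
    z = spatial_compress(z)               # then reduce spatial correlations
    z = spatial_decompress(z)             # spatial decoder
    return spectral_decode(z, w_dec)      # spectral decoder applied last
```

Because the spatial stage only ever sees the k-channel bottleneck, any 2D image compression method could be dropped in at that point, which is the flexibility the abstract highlights.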

[1] F. Zhang, C. Chen, and Y. Wan, “A survey on hyperspectral remote sensing image compression,” in IEEE IGARSS, 2023, pp. 7400–7403.
[2] M. H. P. Fuchs and B. Demir, “HySpecNet-11k: A large-scale hyperspectral dataset for benchmarking learning-based hyperspectral image compression methods,” in IEEE IGARSS, 2023, pp. 1779–1782.

How to cite: Sprengel, N., Fuchs, M. H. P., and Demir, B.: Learning-Based Hyperspectral Image Compression Using A Spatio-Spectral Approach, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-19198, 2024.

EGU24-20985 | Orals | ESSI1.4

GeoAI advances in specific landform mapping 

Samantha Arundel, Michael Von Pohle, Ata Akbari Asanjan, Nikunj Oza, and Aaron Lott

Landform mapping (also referred to as geomorphology or geomorphometry) can be divided into two domains: general and specific (Evans 2012). Whereas general landform mapping categorizes all elements of the study area into landform classes, such as ridges, valleys, peaks, and depressions, the mapping of specific landforms requires the delineation (even if fuzzy) of individual landforms. The former is mainly driven by physical properties such as elevation, slope, and curvature.  The latter, however, must consider the cognitive (human) reasoning that discriminates individual landforms in addition to these physical properties (Arundel and Sinha 2018).

Both mapping forms are important. General geomorphometry is needed to understand geological and ecological processes and as boundary layer input to climate and environmental models. Specific geomorphometry supports such activities as disaster management and recovery, emergency response, transportation, and navigation.

In the United States, individual landforms of interest are named in the U.S. Geological Survey (USGS) Geographic Names Information System, a point dataset captured specifically to digitize geographic names from the USGS Historical Topographic Map Collection (HTMC). Named landform extent is represented only by the name placement in the HTMC.

Recent work has investigated CNN-based deep learning methods to capture these extents in machine-readable form. These studies first relied on physical properties (Arundel et al. 2020) and then included the HTMC as a band in RGB images in limited testing (Arundel et al. 2023). Results from the HTMC dataset surpassed those using just physical properties. The HTMC alone performed best due to the hillshading and elevation (contour) data incorporated into the topographic maps. However, results fell short of an operational capacity to map all named landforms in the United States. Thus, our current work expands upon past research by focusing on the HTMC and physical information as inputs and the named landform label extents.

Specifically, we propose to leverage pre-trained foundation models for segmentation and optical character recognition (OCR) models to jointly map landforms in the United States. Our approach aims to bridge the disparities among independent information sources to facilitate informed decision-making. The modeling pipeline performs (1) segmentation using the physical information and (2) information extraction using OCR in parallel. Then, a computer vision approach merges the two branches into a labeled segmentation. 


Arundel, Samantha T., Wenwen Li, and Sizhe Wang. 2020. “GeoNat v1.0: A Dataset for Natural Feature Mapping with Artificial Intelligence and Supervised Learning.” Transactions in GIS 24 (3): 556–72.

Arundel, Samantha T., and Gaurav Sinha. 2018. “Validating GEOBIA Based Terrain Segmentation and Classification for Automated Delineation of Cognitively Salient Landforms.” In Proceedings of Workshops and Posters at the 13th International Conference on Spatial Information Theory (COSIT 2017), Lecture Notes in Geoinformation and Cartography, edited by Paolo Fogliaroni, Andrea Ballatore, and Eliseo Clementini, 9–14. Cham: Springer International Publishing.

Arundel, Samantha T., Gaurav Sinha, Wenwen Li, David P. Martin, Kevin G. McKeehan, and Philip T. Thiem. 2023. “Historical Maps Inform Landform Cognition in Machine Learning.” Abstracts of the ICA 6 (August): 1–2.

Evans, Ian S. 2012. “Geomorphometry and Landform Mapping: What Is a Landform?” Geomorphology 137 (1): 94–106.

How to cite: Arundel, S., Von Pohle, M., Akbari Asanjan, A., Oza, N., and Lott, A.: GeoAI advances in specific landform mapping, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-20985, 2024.

EGU24-420 | ECS | Orals | ESSI1.5

Neural Networks for Surrogate Models of the Corona and Solar Wind 

Filipa Barros, João José Graça Lima, Rui F. Pinto, and André Restivo

In previous work, an Artificial Neural Network (ANN) was developed to automate the estimation of solar wind profiles used as initial conditions in MULTI-VP simulations. This approach, coupled with profile clustering, reduced the time previously required for estimation by MULTI-VP, enhancing the efficiency of the simulation process. It was observed that generating initial estimates closer to the final simulation led to reduced computation time, with a mean speedup of 1.13. Additionally, this adjustment yielded a twofold advantage: it minimized the amplitude of spurious transients, reinforcing the numerical stability of calculations and enabling the code to maintain a more moderate integration time step.

However, upon further analysis, it became evident that the physical model inherently requires a relaxation time for the final solution to stabilize. Therefore, while refining initial conditions offered improvements, there was a limit to how much it could accelerate the process. Consequently, attention turned towards the development of a surrogate model focused on the upper corona (from 3 to 30 solar radii). This range was chosen because the model can avoid learning the initial phases of wind acceleration, which are hard to predict accurately. Moreover, for connecting the model to heliospheric models and for space weather applications, starting beyond 3 solar radii is sufficient and guarantees that the physics remains consistent within the reproducible domain.

This surrogate model aims at delivering faster forecasts, with MULTI-VP running in parallel (eventually refining the solutions). The surrogate model for MULTI-VP was tested using a heliospheric model and data from spacecraft at L1, validating its efficacy beyond Mean Squared Error (MSE) evaluations and ensuring physical conservation principles were upheld.

This work aims at simplifying and accelerating the process of establishing boundary conditions for heliospheric models without dismissing the physical models for both extreme events and for more physically accurate results. 

How to cite: Barros, F., Lima, J. J. G., F. Pinto, R., and Restivo, A.: Neural Networks for Surrogate Models of the Corona and Solar Wind, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-420, 2024.

EGU24-1494 | ESSI1.5

The coronal braiding structures detected in the machine-learning upscaled SDO/AIA images

We show the evolution of the separated strands within apparently single coronal loops observed in Atmospheric Imaging Assembly (AIA) images. The loop strands are detected in upsampled AIA 193 Å images generated using a super-resolution convolutional neural network. The architecture of the network is designed to map the AIA images to the unprecedentedly high-spatial-resolution coronal images taken by the High-resolution Coronal Imager (Hi-C) during its brief flight. At some times, pairs of individual strands appeared to braid with each other and subsequently evolved into pairs of almost parallel strands whose segments had exchanged completely. These evolutions provide morphological evidence for the occurrence of magnetic reconnection between the braiding strands, which is further confirmed by the transient hot emissions (>5 MK) located at the footpoints of the braiding structures.

How to cite: Bi, Y.: The coronal braiding structures detected in the machine-learning upscaled SDO/AIA images, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-1494, 2024.

EGU24-1604 | ECS | Orals | ESSI1.5

Machine Learning Synthesis and inversion method for Stokes Parameters in the solar context 

Juan Esteban Agudelo Ortiz, Germain Nicolás Morales Suarez, Santiago Vargas Domínguez, and Sergiy Shelyag

The arrival of new and more powerful spectropolarimetric instruments such as DKIST, the development of better magnetohydrodynamic (MHD) simulation codes, and the creation of new inversion methods come with demands for increasing amounts of computational time and power. Combined with the growing volume of generated data, this can mean years of processing that stalls scientific investigations in their mid-to-late stages. Machine learning models able to replicate patterns in data can adapt to different types of datasets, such as those for classification or for sequence generation (e.g., seq2seq models); once trained, they produce results consistent with previous methods while being orders of magnitude faster. Some work in this field has created machine learning inversion methods using data obtained from existing inversion codes applied to observational data, and synthesis methods using data from radiative transfer codes, reducing both computational demands and processing time. This work follows in these steps: using datasets obtained from simulation codes such as MURaM and the corresponding Stokes parameters obtained from non-LTE radiative transfer codes such as NICOLE, we train neural network models in both the forward (synthesis) and backward (inversion) directions to test whether they can learn the underlying physical behaviour, and at what accuracy, so that they can later be applied to data from new simulation codes and to real solar observations. This is another step towards a new paradigm for inverting and synthesizing physical quantities.

How to cite: Agudelo Ortiz, J. E., Morales Suarez, G. N., Vargas Domínguez, S., and Shelyag, S.: Machine Learning Synthesis and inversion method for Stokes Parameters in the solar context, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-1604, 2024.

EGU24-2046 | ECS | Posters on site | ESSI1.5

Comparative Analysis of Random Forest and XGBoost in Classifying Ionospheric Signal Disturbances During Solar Flares 

Filip Arnaut, Aleksandra Kolarski, and Vladimir Srećković

In our previous publication (Arnaut et al. 2023), we demonstrated the application of the Random Forest (RF) algorithm for classifying disturbances associated with solar flares (SF), erroneous signals, and measurement errors in VLF amplitude data, i.e., anomaly detection in VLF amplitude data. The RF algorithm is widely regarded as a preferred option for research in novel domains: its advantages, such as its resistance to overfitting and its simplicity, make it particularly valuable in these situations. Nevertheless, it is imperative to test and evaluate alternative algorithms and methods to ascertain their potential advantages and enhance the overall efficiency of the method. This brief communication demonstrates the application of the XGBoost (XGB) method to the same dataset previously used for the RF algorithm, along with a comparative analysis between the two algorithms. Given that the problem is framed as a machine learning (ML) problem with a focus on the minority class, the comparative analysis is conducted exclusively on the minority (anomalous) data class. The data pre-processing methodology can be found in Arnaut et al. (2023). The XGB tuning process used a grid search to optimize the hyperparameters of the model: the number of estimators (trees) was varied from 25 to 500 in increments of 25, and the learning rate from 0.02 to 0.4 in increments of 0.02. The F1-Score for the anomalous data class is similar for both models: 0.508 for the RF model and 0.51 for the XGB model. These scores were calculated using the entire test dataset, which consists of 19 transmitter-receiver pairs. Upon closer examination, the RF model exhibits higher precision (0.488) than the XGB model (0.37), while the XGB model demonstrates higher recall (0.84) than the RF model (0.53).
Upon examining each individual transmitter-receiver pair, XGB outperformed RF in terms of F1-Score in 10 out of 19 cases. The most significant disparities are observed in cases where the XGB model outperformed by a margin of 0.15 in F1-Score, but conversely performed worse by approximately 0.16 in another instance for the anomalous data class. Averaged over all 19 transmitter-receiver pairs, the XGB model outperformed the RF model by approximately 6.72% in terms of the F1-Score for the anomalous data class. When utilizing a point-based evaluation metric that assigns rewards or penalties for each entry in the confusion matrix, the RF model demonstrates an overall improvement of approximately 5% compared to the XGB model. Overall, the comparison between the RF and XGB models is ambiguous: each model has instances where it is superior to the other. Further research is necessary to fully optimize the method, which has benefits for automatically classifying anomalous VLF amplitude signals caused by SF effects, erroneous measurements, and other factors.
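As a quick consistency check, the quoted minority-class F1-Scores follow directly from the quoted precision and recall values, since F1 is their harmonic mean:

```python
def f1(precision, recall):
    # F1 is the harmonic mean of precision and recall
    return 2 * precision * recall / (precision + recall)

f1_rf = f1(0.488, 0.53)   # Random Forest, anomalous class -> ~0.508
f1_xgb = f1(0.37, 0.84)   # XGBoost, anomalous class -> ~0.514
```

This makes the trade-off visible: XGB reaches nearly the same F1 as RF with much lower precision, compensated by much higher recall.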


Arnaut, F., Kolarski, A. and Srećković, V.A., 2023. Random Forest Classification and Ionospheric Response to Solar Flares: Analysis and Validation. Universe, 9(10), p.436.

How to cite: Arnaut, F., Kolarski, A., and Srećković, V.: Comparative Analysis of Random Forest and XGBoost in Classifying Ionospheric Signal Disturbances During Solar Flares, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-2046, 2024.

EGU24-4181 | Posters on site | ESSI1.5

Prediction of sunspot number using Gaussian processes 

Everton Frigo and Italo Gonçalves

Solar activity has various direct and indirect impacts on human activities. During periods of high solar activity, the harmful effects triggered by solar variability are maximized. On a decadal to multidecadal time scale, solar variability exhibits a main cycle of around 11 years known as the Schwabe solar cycle, leading to a solar maximum approximately every 11 years. The most commonly used variable for measuring solar activity is the sunspot number. Over the last few decades, numerous techniques have been employed to predict the time evolution of the solar cycle for subsequent years. Recently, a growing number of studies have utilized machine learning methods to predict solar cycles. One such method is the Gaussian process, which is well-suited to working with small amounts of data and can also provide an uncertainty measure for predictions. In this study, the Gaussian process technique is employed to predict the sunspot number between 2024 and 2050. The dataset used to train and validate the model comprises monthly averages of the sunspot number over the period 1700-2023. According to the results, the current solar cycle, now at its maximum, is anticipated to last until 2030. The subsequent solar maximum is projected to occur around the end of 2033, with an estimated maximum sunspot number of approximately 150. If this prediction holds true, the next solar cycle's maximum will resemble that observed in the current one.
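The mechanics of Gaussian-process prediction with a periodic kernel can be sketched in plain numpy. This is an illustrative stand-in, not the study's model: the data here are synthetic (a noisy 11-year sinusoid on a yearly grid), and the exponential-sine-squared kernel and its hyperparameters are assumptions chosen for the example.

```python
import numpy as np

# Minimal Gaussian-process regression with an 11-year periodic kernel on
# synthetic sunspot-like data (kernel choice and hyperparameters are
# illustrative assumptions, not the study's values)
def periodic_kernel(x1, x2, period=11.0, length=2.0, amp=60.0):
    d = np.subtract.outer(x1, x2)
    return amp**2 * np.exp(-2.0 * np.sin(np.pi * np.abs(d) / period)**2 / length**2)

def gp_predict(x_train, y_train, x_test, noise=5.0):
    K = periodic_kernel(x_train, x_train) + noise**2 * np.eye(len(x_train))
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))
    Ks = periodic_kernel(x_test, x_train)
    mean = Ks @ alpha
    v = np.linalg.solve(L, Ks.T)
    var = np.diag(periodic_kernel(x_test, x_test)) - np.sum(v**2, axis=0)
    return mean, np.sqrt(np.maximum(var, 0.0))   # predictive mean and 1-sigma

rng = np.random.default_rng(0)
t = np.arange(1700.0, 2024.0)                     # yearly grid, 1700-2023
y = 75.0 + 70.0 * np.sin(2 * np.pi * t / 11.0) + rng.normal(0.0, 5.0, t.size)
t_future = np.arange(2024.0, 2051.0)              # forecast horizon 2024-2050
mean, std = gp_predict(t, y - y.mean(), t_future)
mean += y.mean()
```

The predictive standard deviation `std` is exactly the per-prediction uncertainty measure that makes Gaussian processes attractive for this task.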

How to cite: Frigo, E. and Gonçalves, I.: Prediction of sunspot number using Gaussian processes, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-4181, 2024.

EGU24-4471 | ECS | Orals | ESSI1.5

Solar Wind Speed Estimation via Symbolic Knowledge Extraction from Opaque Models 

Federico Sabbatini and Catia Grimani

The unprecedented predictive capabilities of machine learning models make them inestimable tools for data forecasting and other complex tasks. The benefits of these predictors are even greater when unavailable data must be surrogated due to the lack of dedicated instrumentation on board space missions. For instance, the future ESA space interferometer LISA for low-frequency gravitational wave detection will host, as part of its diagnostics subsystem, particle detectors to measure the galactic cosmic-ray flux and magnetometers to monitor the magnetic field intensity in the region of the interferometer mirrors. No instrumentation dedicated to interplanetary medium parameter monitoring will be placed on the three spacecraft constituting the LISA constellation. However, important lessons about the correlation between galactic cosmic-ray flux short-term variations and the solar wind speed profile have been learned with the ESA LISA precursor mission, LISA Pathfinder, orbiting around the L1 Lagrange point. In a previous work, we demonstrated that for LISA Pathfinder it was possible to reconstruct the interplanetary magnetic field intensity with an uncertainty of 2 nT for interplanetary structure transit monitoring. Machine learning models are proposed here to infer the solar wind speed, which is not measured on the three LISA spacecraft, from galactic cosmic-ray measurements. This work is both valuable and necessary since LISA, scheduled for launch in 2035, will trail Earth on the ecliptic at a distance of 50 million km, too far from the orbits of other space missions dedicated to interplanetary medium monitoring to benefit from their observations.

We built an interpretable machine learning predictor based on galactic cosmic-ray and interplanetary magnetic field observations to reconstruct the solar wind speed with an uncertainty of ±65 km s-1. Interpretability is achieved by applying the CReEPy symbolic knowledge extractor to the outcomes of a k-NN regressor. The extracted knowledge consists of linear equations describing the solar wind speed in terms of four statistical indices calculated for the input variables.

Details of the model workflow, performance and validation will be presented at the conference, together with the advantages, drawbacks and possible future enhancements. The aim is to demonstrate that our model may provide the LISA mission with an effective and human-interpretable tool to carry out reliable solar wind speed estimates and to recognise the transit of interplanetary structures near the LISA spacecraft, in support of the data analysis activity for monitoring the external forces acting on the interferometer mirrors.

How to cite: Sabbatini, F. and Grimani, C.: Solar Wind Speed Estimation via Symbolic Knowledge Extraction from Opaque Models, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-4471, 2024.

EGU24-6558 | ECS | Orals | ESSI1.5

A New Machine Learning Approach for Predicting Extreme Space Weather 

Andong Hu and Enrico Camporeale

We present an innovative method, ProBoost (Probabilistic Boosting), for forecasting extreme space weather events using ensemble machine learning (ML). Ensembles enhance prediction accuracy, but applying them to ML faces challenges, as ML models often lack well-calibrated uncertainty estimates. Moreover, space weather problems are typically affected by very imbalanced datasets (i.e., extreme and rare events). To overcome these difficulties, we developed a method that incorporates uncertainty quantification (UQ) in neural networks, enabling simultaneous forecasting of prediction uncertainty.
Our study applies ProBoost to the following space weather applications:
• One-to-Six-Hour Lead-Time Model: Predicting Disturbance Storm Time (Dst) values using solar wind data.
• Two-Day Lead-Time Model: Forecasting Dst probability using solar images.
• Geoelectric Field Model: Multi-hour lead time, incorporating solar wind and SuperMAG data.
• Ambient Solar Wind Velocity Forecast: Up to 5 days ahead.
ProBoost is model-agnostic, making it adaptable to various forecasting applications beyond space weather.

How to cite: Hu, A. and Camporeale, E.: A New Machine Learning Approach for Predicting Extreme Space Weather, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-6558, 2024.

EGU24-6899 | ESSI1.5

Data-Driven Discovery of Fokker-Planck Equation for the Earth's Radiation Belts Electrons Using Physics-Informed Neural Networks

Enrico Camporeale

We use the framework of Physics-Informed Neural Network (PINN) to solve the inverse problem associated with the Fokker-Planck equation for radiation belts' electron transport, using 4 years of Van Allen Probes data. Traditionally, reduced models have employed a diffusion equation based on the quasilinear approximation. We show that the dynamics of “killer electrons” is described more accurately by a drift-diffusion equation, and that drift is as important as diffusion for nearly-equatorially trapped ∼1 MeV electrons in the inner part of the belt. Moreover, we present a recipe for gleaning physical insight from solving the ill-posed inverse problem of inferring model coefficients from data using PINNs. Furthermore, we derive a parameterization for the diffusion and drift coefficients as a function of L only, which is both simpler and more accurate than earlier models. Finally, we use the PINN technique to develop an automatic event identification method that allows identifying times at which the radial transport assumption is inadequate to describe all the physics of interest.
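The distinction between drift and diffusion can be illustrated with a minimal 1-D explicit finite-difference integration of a drift-diffusion equation. This is a toy stand-in: the grid, the constant coefficients D and C, and the initial profile are all arbitrary illustrative choices, whereas the study infers L-dependent coefficients from Van Allen Probes data with a PINN.

```python
import numpy as np

# Toy 1-D drift-diffusion model of radial transport,
#   df/dt = d/dL ( D df/dL ) - d/dL ( C f ),
# with constant illustrative coefficients D (diffusion) and C (drift)
L = np.linspace(1.0, 7.0, 121)           # L-shell grid
dL = L[1] - L[0]
D, C = 0.05, 0.02
dt = 0.4 * dL**2 / D                     # within the explicit stability limit

f = np.exp(-(L - 4.0)**2 / 0.5)          # initial phase-space density: a bump

def step(f):
    diffusion = D * (np.roll(f, -1) - 2.0 * f + np.roll(f, 1)) / dL**2
    drift = -C * (np.roll(f, -1) - np.roll(f, 1)) / (2.0 * dL)
    f_new = f + dt * (diffusion + drift)
    f_new[0] = f_new[-1] = 0.0           # absorbing boundaries
    return f_new

for _ in range(200):
    f = step(f)
```

Diffusion broadens the bump while drift displaces its center of mass toward larger L, which is why the two terms leave distinguishable signatures in data.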

How to cite: Camporeale, E.: Data-Driven Discovery of Fokker-Planck Equation for the Earth's Radiation Belts Electrons Using Physics-Informed Neural Networks, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-6899, 2024.

EGU24-8018 | ESSI1.5

Enhancement of the NEARBY automated asteroid detection platform with a machine learning-based classifier

The detection of asteroids involves the processing of sequences of astronomical images. The main challenge arises from the huge volume of data that must be processed in a reasonable amount of time. To address this, we developed the NEARBY platform [1], [2] for efficient automatic detection of asteroids in sequences of astronomical images. This platform encompasses multidimensional data processing capabilities, human-verified visual analysis, and cloud-based adaptability. This paper outlines the enhancements we have made to this automated asteroid detection system by integrating a machine learning-based classifier known as the CERES module. The integration of the CERES module [3] into the NEARBY platform substantially enhances its performance by automatically reducing the number of false positive detections. Consequently, this leads to a more reliable and efficient system for asteroid identification, while also reducing the time and effort required by human experts to validate detected candidates (asteroids). The experiments highlight these improvements and their significance in advancing the field of asteroid tracking. Additionally, we explore the applicability of the asteroid classification model, initially trained using images from a specific telescope, across different telescopes.


  • This work was supported by a grant of the Romanian Ministry of Education and Research, CCCDI - UEFISCDI, project number PN-III-P2-2.1-PED-2019-0796, within PNCDI III. (the development of the dataset and CNN models)
  • This research was partially supported by the project 38 PFE in the frame of the programme PDI-PFE-CDI 2021.


  • Bacu, V., Sabou, A., Stefanut, T., Gorgan, D., Vaduvescu, O., NEARBY platform for detecting asteroids in astronomical images using cloud-based containerized applications, 2018 IEEE 14th International Conference on Intelligent Computer Communication and Processing (ICCP), pp. 371-376
  • Stefanut, T., Bacu, V., Nandra, C., Balasz, D., Gorgan, D., Vaduvescu, O., NEARBY Platform: Algorithm for automated asteroids detection in astronomical images, 2018 IEEE 14th International Conference on Intelligent Computer Communication and Processing (ICCP), pp. 365-369
  • Bacu, V.; Nandra, C.; Sabou, A.; Stefanut, T.; Gorgan, D. Assessment of Asteroid Classification Using Deep Convolutional Neural Networks. Aerospace 2023, 10, 752.


How to cite: Bacu, V.: Enhancement of the NEARBY automated asteroid detection platform with a machine learning-based classifier, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-8018, 2024.

EGU24-9174 | ECS | Orals | ESSI1.5

Enhancing Space Mission Return through On-Board Data Reduction using Unsupervised Machine Learning 

Salome Gruchola, Peter Keresztes Schmidt, Marek Tulej, Andreas Riedo, Klaus Mezger, and Peter Wurz

The efficient use of the provided downlink capacity for scientific data is a fundamental aspect of space exploration. The use thereof can be optimised through sophisticated data reduction techniques and automation of processes on board that otherwise require interaction with the operations centres on Earth. Machine learning-based autonomous methods serve both purposes; yet space-based ML applications remain relatively rare compared to the application of ML on Earth to data acquired in space.

In this contribution, we present a potential application of unsupervised machine learning to cluster mass spectrometric data on-board a spacecraft. Data were acquired from a phoscorite rock [1] using a prototype of a laser ablation ionisation mass spectrometer (LIMS) for space research [2]. Two unsupervised dimensionality reduction algorithms, UMAP and densMAP [3,4], were employed to construct low-dimensional representations of the data. Clusters corresponding to different mineral phases within these embeddings were found using HDBSCAN [5]. The impact of data pre-processing and model parameter selection on the classification outcome was investigated through varying levels of pre-processing and extensive grid searches.

Both UMAP and densMAP effectively isolated major mineral phases present within the rock sample, but densMAP additionally found minor inclusions present only in a small number of mass spectra. However, densMAP exhibited higher sensitivity to data pre-processing, yielding lower scores for minimally treated data compared to UMAP. For highly processed data, both UMAP and densMAP exhibited high stability across a broad model parameter space.

Given that the data were recorded using a miniature mass spectrometric instrument designed for space flight, these methods demonstrate effective strategies for the substantial reduction of data, similar to what is anticipated for future space missions. Autonomous clustering of data into groups of different chemical composition, followed by the downlink of a representative mass spectrum of each cluster, aids in identifying relevant data. Mission return can therefore be enhanced through the selective downlink of data of interest. As both UMAP and densMAP, coupled with HDBSCAN, are relatively complex algorithms compared to more traditional techniques, such as k-means, it is important to evaluate the benefits and drawbacks of using simpler methods on board spacecraft.
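The shape of the pipeline (reduce dimensionality, density-cluster, keep one representative spectrum per cluster for downlink) can be sketched without the UMAP/densMAP and HDBSCAN packages themselves. In this numpy-only stand-in, PCA replaces the nonlinear embeddings, a radius-graph clustering replaces HDBSCAN, and the two-phase "spectra" are synthetic with assumed peak positions.

```python
import numpy as np

# Synthetic two-phase "mass spectra" (peak channels are assumed, illustrative)
rng = np.random.default_rng(1)
n_channels = 100
phase_a = np.zeros(n_channels); phase_a[[10, 40, 70]] = 1.0
phase_b = np.zeros(n_channels); phase_b[[20, 55, 85]] = 1.0
spectra = np.vstack([
    phase_a + rng.normal(0.0, 0.05, (60, n_channels)),
    phase_b + rng.normal(0.0, 0.05, (40, n_channels)),
])

# 2-D embedding via PCA (SVD on centered data) -- linear stand-in for densMAP
X = spectra - spectra.mean(axis=0)
U, S, _ = np.linalg.svd(X, full_matrices=False)
embedding = U[:, :2] * S[:2]

def radius_clusters(pts, eps=0.6):
    # connected components of the eps-radius graph (crude HDBSCAN stand-in)
    labels = np.full(len(pts), -1)
    current = 0
    for i in range(len(pts)):
        if labels[i] >= 0:
            continue
        stack, labels[i] = [i], current
        while stack:
            j = stack.pop()
            for k in np.where(np.linalg.norm(pts - pts[j], axis=1) < eps)[0]:
                if labels[k] < 0:
                    labels[k] = current
                    stack.append(k)
        current += 1
    return labels

labels = radius_clusters(embedding)
# one representative (mean) spectrum per cluster, the candidate for downlink
reps = [spectra[labels == c].mean(axis=0) for c in range(labels.max() + 1)]
```

Downlinking only `reps` instead of all spectra is the data-reduction step the abstract describes; the real pipeline's nonlinear embeddings can additionally separate minor inclusions that a linear projection may miss.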


[1] Tulej, M. et al., 2022,

[2] Riedo, A. et al., 2012,

[3] McInnes, L. et al., 2018,

[4] Narayan, A., et al., 2021,

[5] McInnes, L., et al., 2017,

How to cite: Gruchola, S., Keresztes Schmidt, P., Tulej, M., Riedo, A., Mezger, K., and Wurz, P.: Enhancing Space Mission Return through On-Board Data Reduction using Unsupervised Machine Learning, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-9174, 2024.

EGU24-10715 | ECS | Posters on site | ESSI1.5

Physics-driven feature combination for an explainable AI approach to flare forecasting 

Margherita Lampani, Sabrina Guastavino, Michele Piana, Federico Benvenuto, and Anna Maria Massone

Typical supervised feature-based machine learning approaches to flare forecasting rely on descriptors extracted from magnetograms, such as Helioseismic and Magnetic Imager (HMI) images, which are standardized before being used in the training phase of the machine learning pipeline. However, such artificial intelligence (AI) models do not take into account the physical nature of the features and their role in the plasma physics equations. This talk proposes to generate novel features from simple physics-driven combinations of the original descriptors, and to show whether this physically explainable AI model leads to more predictive solar flare forecasting.

How to cite: Lampani, M., Guastavino, S., Piana, M., Benvenuto, F., and Massone, A. M.: Physics-driven feature combination for an explainable AI approach to flare forecasting, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-10715, 2024.

EGU24-12885 | ECS | Posters on site | ESSI1.5

Finding Hidden Conjunctions in the Solar Wind 

Zoe Faes, Laura Hayes, Daniel Müller, and Andrew Walsh

This study uses ensemble machine learning methods to identify sets of in-situ measurements of the solar wind which sample the same volume of plasma at different times and locations as it travels through the heliosphere. Multiple observations of a single volume of plasma by different spacecraft - referred to here as conjunctions - are becoming more frequent in the current “golden age of heliophysics research” and are key to characterizing the expansion of the solar wind. Specifically, identifying these related observations will enable us to test the current understanding of solar wind acceleration from the corona to the inner heliosphere with a more comprehensive set of measurements than has been used in previous analyses.

Using in-situ measurements of the background solar wind from Solar Orbiter, Parker Solar Probe, STEREO-A, Wind and BepiColombo, we identify a set of criteria based on features of magnetic field, velocity, density and temperature timeseries of known conjunctions and search for other instances for which the criteria are satisfied, to find previously unknown conjunctions. We use an ensemble of models, including random forests and recurrent neural networks with long short-term memory trained on synthetic observations obtained from magnetohydrodynamic simulations, to identify candidate conjunctions solely from kinetic properties of the solar wind. Initial results show a previously unidentified set of conjunctions between the spacecraft considered in this study. While this analysis has thus far only been performed on observations obtained since 2021 (start of Solar Orbiter science operations), the methods used here can be applied to other datasets to increase the potential for scientific return of existing and future heliophysics missions.

The modular scientific software built over the course of this research includes methods for the retrieval, processing, visualisation, and analysis of observational and synthetic timeseries of solar wind properties. It also includes methods for feature engineering and integration with widely used machine learning libraries. The software is available as an open-source Python package to ensure results can be easily reproduced and to facilitate further investigation of coordinated in-situ data in heliophysics.

How to cite: Faes, Z., Hayes, L., Müller, D., and Walsh, A.: Finding Hidden Conjunctions in the Solar Wind, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-12885, 2024.

EGU24-12961 | ECS | Posters on site | ESSI1.5

Physics-informed neural networks for advanced solar magnetic field extrapolations 

Robert Jarolim, Benoit Tremblay, Matthias Rempel, Julia Thalmann, Astrid Veronig, Momchil Molnar, and Tatiana Podladchikova

Physics-informed neural networks (PINNs) provide a novel approach for data-driven numerical simulations, tackling challenges of discretization and enabling seamless integration of noisy data and physical models (e.g., partial differential equations). In this presentation, we discuss the results of our recent studies where we apply PINNs for coronal magnetic field extrapolations of the solar atmosphere, which are essential to understand the genesis and initiation of solar eruptions and to predict the occurrence of high-energy events from our Sun.
We utilize our PINN to estimate the 3D coronal magnetic fields based on photospheric vector magnetograms and the force-free physical model. This approach provides state-of-the-art coronal magnetic field extrapolations in quasi real-time. We simulate the evolution of Active Region NOAA 11158 over 5 continuous days, where the derived time profile of the free magnetic energy unambiguously relates to the observed flare activity.
We extend this approach by utilizing multi-height magnetic field measurements and combine them in a single magnetic field model. Our evaluation shows that the additional chromospheric field information leads to a more realistic approximation of the solar coronal magnetic field. In addition, our method intrinsically provides an estimate of the height corrugation of the observed magnetograms.
We provide an outlook on our ongoing work where we use PINNs for global force-free magnetic field extrapolations. This approach enables a novel understanding of the global magnetic topology with a realistic treatment of current carrying fields.
In summary, PINNs have the potential to greatly advance the field of numerical simulations, accelerate scientific research, and enable advanced space weather monitoring.
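The physics constraints entering such a PINN loss are the force-free condition (∇×B)×B = 0 and the solenoidal condition ∇·B = 0. A numpy finite-difference sketch of these residuals follows; it is a stand-in for the automatic differentiation a PINN actually uses, and the test field is an assumed analytic potential field, not solar data.

```python
import numpy as np

def force_free_residuals(Bx, By, Bz, d=1.0):
    # residuals of the force-free model: (curl B) x B and div B,
    # with derivatives taken by finite differences on a regular grid
    dBx, dBy, dBz = (np.gradient(B, d) for B in (Bx, By, Bz))
    Jx = dBz[1] - dBy[2]                  # (curl B)_x
    Jy = dBx[2] - dBz[0]                  # (curl B)_y
    Jz = dBy[0] - dBx[1]                  # (curl B)_z
    force = np.stack([Jy * Bz - Jz * By,  # (J x B)_x
                      Jz * Bx - Jx * Bz,  # (J x B)_y
                      Jx * By - Jy * Bx]) # (J x B)_z
    div = dBx[0] + dBy[1] + dBz[2]
    return force, div

# potential (current-free, hence trivially force-free) test field:
# B = grad(x*y) + (0, 0, 1) = (y, x, 1)
x, y, z = np.meshgrid(*[np.linspace(0.0, 1.0, 8)] * 3, indexing="ij")
force, div = force_free_residuals(y, x, np.ones_like(x))
```

In a PINN, the squared norms of `force` and `div`, evaluated at sampled coordinates, are added to the data-fidelity term at the photospheric (and chromospheric) boundary; minimizing the sum yields the extrapolated field.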

How to cite: Jarolim, R., Tremblay, B., Rempel, M., Thalmann, J., Veronig, A., Molnar, M., and Podladchikova, T.: Physics-informed neural networks for advanced solar magnetic field extrapolations, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-12961, 2024.

EGU24-14186 | Posters on site | ESSI1.5

Near real-time construction of Solar Coronal Parameters based on MAS simulation by Deep Learning  

Sumiaya Rahman, Hyun-Jin Jeong, Ashraf Siddique, and Yong-Jae Moon

Magnetohydrodynamic (MHD) models provide a quantitative 3D distribution of the solar corona parameters (density, radial velocity, and temperature). However, this process is expensive and time-consuming. To address this, we apply deep learning models to reproduce the 3D distribution of solar coronal parameters from 2D synoptic photospheric magnetic fields. We use synoptic photospheric magnetic fields as input to obtain the 3D solar coronal parameters simulated by the MHD Algorithm outside a Sphere (MAS) from June 2010 to January 2023. Each parameter is individually trained using 150 deep learning models, corresponding to 150 solar radial distances ranging from 1 to 30 solar radii. Our study yields significant findings. Firstly, our model accurately reproduces 3D coronal parameter structures across the 1 to 30 solar radii range, with an average correlation coefficient of approximately 0.96. Secondly, the 150 deep-learning models exhibit a remarkably short runtime (about 16 seconds per parameter on an NVIDIA Titan XP GPU) in comparison to the conventional MAS simulation time. As the MAS simulation is a regularization model, we may significantly reduce the simulation time by using our results as an initial magnetic configuration to obtain an equilibrium condition. In the future, we hope that the generated solar coronal parameters can be used for near real-time forecasting of the heliospheric propagation of solar eruptions.

How to cite: Rahman, S., Jeong, H.-J., Siddique, A., and Moon, Y.-J.: Near real-time construction of Solar Coronal Parameters based on MAS simulation by Deep Learning, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-14186, 2024.

EGU24-15813 | ECS | Orals | ESSI1.5

Instrument-to-Instrument translation: An AI tool to intercalibrate, enhance and super-resolve solar observations 

Christoph Schirninger, Astrid Veronig, Robert Jarolim, J. Emmanuel Johnson, Anna Jungbluth, Richard Galvez, Lilli Freischem, and Anne Spalding

Various instruments are used to study the Sun, including ground-based observatories and space telescopes. These data products are constantly changing due to technological improvements, different instrumentation, or atmospheric effects. However, for certain applications such as ground-based solar image reconstruction or solar cycle studies, enhanced and combined data products are necessary.

We present a general AI tool called Instrument-to-Instrument (ITI; Jarolim et al. 2023) translation, which is capable of translating datasets between two different image domains. This approach enables instrument intercalibration, image enhancement, mitigation of quality degradations, and super-resolution across multiple wavelength bands. The tool is built on unpaired image-to-image translation, which enables a wide range of applications because no spatial or temporal overlap is required between the considered datasets.

In this presentation, we highlight ITI as a general tool for heliospheric applications and demonstrate its capabilities by applying it to data from Solar Orbiter/EUI, PROBA2/SWAP, and the Solar Dynamics Observatory/AIA in order to achieve a homogeneous, machine-learning-ready dataset that combines three different EUV imagers.

The direct comparison of aligned observations shows the close relation of ITI-enhanced and real high-quality observations. The evaluation of light-curves demonstrates an improved inter-calibration.

ITI is provided open-source to the community and can be easily applied to novel datasets and various research applications.

This research is funded through a NASA 22-MDRAIT22-0018 award (No 80NSSC23K1045) and managed by Trillium Technologies, Inc.

How to cite: Schirninger, C., Veronig, A., Jarolim, R., Johnson, J. E., Jungbluth, A., Galvez, R., Freischem, L., and Spalding, A.: Instrument-to-Instrument translation: An AI tool to intercalibrate, enhance and super-resolve solar observations, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-15813, 2024.

EGU24-15981 | ECS | Posters on site | ESSI1.5

Addressing the closure problem using supervised Machine Learning 

Sophia Köhne, Brecht Laperre, Jorge Amaya, Sara Jamal, Simon Lautenbach, Rainer Grauer, Giovanni Lapenta, and Maria Elena Innocenti

When deriving fluid equations from the Vlasov equation for collisionless plasmas, one runs into the so-called closure problem: each equation for the temporal evolution of one particle moment (density, current, pressure, heat flux, …) includes terms depending on the next-order moment. Therefore, when choosing to truncate the description at the nth order, one must approximate the terms related to the (n+1)th-order moment included in the evolution equation for the nth-order moment. The order at which the hierarchy is closed and the assumptions behind the approximations used determine how accurately a fluid description can reproduce kinetic processes.
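The moment hierarchy behind the closure problem can be illustrated numerically: each fluid quantity below is a velocity moment of a synthetic particle distribution, and the evolution equation for each moment involves the next one. All numbers here are illustrative:

```python
import numpy as np

rng = np.random.default_rng(1)

# Sample particle velocities for one cell of a collisionless plasma (1D, synthetic).
v = rng.normal(loc=0.5, scale=1.2, size=100_000)

# Moment hierarchy: each fluid quantity is a velocity moment of the distribution.
n = v.size                      # 0th moment ~ density (particle count here)
u = v.mean()                    # 1st moment: bulk velocity
p = ((v - u) ** 2).mean()       # 2nd central moment: pressure (per unit mass)
q = ((v - u) ** 3).mean()       # 3rd central moment: heat flux (per unit mass)

# The evolution equation for p contains q; truncating at 2nd order requires
# approximating q from the lower moments -- the closure problem that the
# supervised network in this work learns from kinetic simulations.
```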

In this work, we aim to reconstruct specific particle moments from kinetic simulations, using the electric and magnetic fields and the lower moments as input. We use fully kinetic Particle-in-Cell simulations, where all physical information is available, as the ground truth. The approach we present here uses supervised machine learning to enable a neural network to learn how to reconstruct higher moments from lower moments and fields.

Starting from the work of Laperre et al. (2022), we built a framework which makes it possible to train feedforward multilayer perceptrons on kinetic simulations to learn to predict the higher moments of the Vlasov equation from the lower moments, which would also be available in fluid simulations. We train on simulations of magnetic reconnection in a double Harris current sheet with varying background guide field obtained with the semi-implicit Particle-in-Cell code iPiC3D (Markidis et al., 2010). We test the influence of data preprocessing techniques, of (hyper-)parameter variations, and of different neural network architectures on the quality of the predictions that are produced. Furthermore, we investigate which metrics are most useful to evaluate the quality of the outcome.

How to cite: Köhne, S., Laperre, B., Amaya, J., Jamal, S., Lautenbach, S., Grauer, R., Lapenta, G., and Innocenti, M. E.: Addressing the closure problem using supervised Machine Learning, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-15981, 2024.

EGU24-18534 | ECS | Posters on site | ESSI1.5

Visualizing three years of STIX X-ray flare observations using self-supervised learning 

Mariia Drozdova, Vitaliy Kinakh, Francesco Ramunno, Erica Lastufka, and Slava Voloshynovskiy

Operating continuously for over three years, Solar Orbiter's STIX has observed more than 43 thousand X-ray flares. This study presents a compelling visualization of this publicly available database, using self-supervised learning to organize reconstructed flare images by their visual properties. Networks designed for self-supervised learning, such as Masked Siamese Networks or Autoencoders, are able to learn latent space embeddings which encode core characteristics of the data. We investigate the effectiveness of various pre-trained vision models, fine-tuning strategies, and image preparation. This visual representation offers a valuable starting point for identifying interesting events and grouping flares based on shared morphological characteristics, useful for conducting statistical studies or finding unique flares in this rich set of observations.

How to cite: Drozdova, M., Kinakh, V., Ramunno, F., Lastufka, E., and Voloshynovskiy, S.: Visualizing three years of STIX X-ray flare observations using self-supervised learning, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-18534, 2024.

EGU24-19248 | Posters on site | ESSI1.5

Segmentation and Tracking of Solar Eruptive Phenomena with Convolutional Neural Networks (CNN) 

Oleg Stepanyuk and Kamen Kozarev

Solar eruptive events are complex phenomena, which most often include coronal mass ejections (CMEs), flares, compressive/shock waves, and filament eruptions. CMEs are large eruptions of magnetized plasma from the Sun's outer atmosphere, or corona, that propagate outward into interplanetary space. Solar Energetic Particles (SEPs) are produced through particle acceleration in flares or CME-driven shocks. The exact mechanisms behind SEP production are yet to be understood, but it is thought that most of their acceleration occurs in shocks starting in the low corona. Over the last several decades, a large amount of remote solar eruption observations has become available from ground-based and space-borne instruments. This has required the development of software approaches for automated characterization of eruptive features. Most solar feature detection and tracking algorithms currently in use have restricted applicability and complicated processing chains, while the complexities in engineering machine learning (ML) training sets limit the use of data-driven approaches for tracking solar eruption-related phenomena. Recently, we introduced a hybrid algorithmic/data-driven approach for characterization and tracking of solar eruptive features with the improved wavelet-based, multi-instrument Wavetrack package (Stepanyuk, J. Space Weather Space Clim. (2024)), which was used to produce training datasets for data-driven image segmentation with convolutional neural networks (CNNs). Its performance was demonstrated on a limited set of SDO AIA 193 Å data, performing segmentation of EUV and shock waves. Here we extend this approach and present an ensemble of more general CNN models for data-driven segmentation of various eruptive phenomena across a set of ground-based and remote instrument data. We discuss our approach to engineering training sets and data augmentation, CNN topology, and training techniques.

How to cite: Stepanyuk, O. and Kozarev, K.: Segmentation and Tracking of Solar Eruptive Phenomena with Convolutional Neural Networks (CNN), EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-19248, 2024.

EGU24-19558 | Orals | ESSI1.5

Comparative Analysis of Data Preprocessing Methods for Precise Orbit Determination 

Tom Andert, Benedikt Aigner, Fabian Dallinger, Benjamin Haser, Martin Pätzold, and Matthias Hahn

In Precise Orbit Determination (POD), employing proper methods for pre-processing tracking data is crucial not only to mitigate data noise but also to identify potential unmodeled effects that may elude the prediction model of the POD algorithm. Unaccounted effects can skew parameter estimation, causing certain parameters to assimilate the unmodeled effects and deviate from their true values. Therefore, enhancing the pre-processing of tracking data ultimately contributes to refining the prediction model.

The Rosetta spacecraft, during its two-year mission alongside comet 67P/Churyumov-Gerasimenko, collected a substantial dataset of tracking data. In addition, tracking data from the Mars Express spacecraft, orbiting Mars since 2004, will serve as a further use case to assess and compare diverse data pre-processing methods. Both traditional and AI-based techniques are explored to examine the impact of various strategies on the accuracy of orbit determination. This aims to enhance POD, thereby yielding a more robust scientific outcome.

How to cite: Andert, T., Aigner, B., Dallinger, F., Haser, B., Pätzold, M., and Hahn, M.: Comparative Analysis of Data Preprocessing Methods for Precise Orbit Determination, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-19558, 2024.

EGU24-21463 | ECS | Posters on site | ESSI1.5

A machine learning approach to meteor light curve analysis 

Lucas Mandl, Apostolous Christou, and Andreas Windisch

In this work we conduct a thorough examination of utilizing machine learning and computer vision techniques for classifying meteors based on their characteristics. The focus of the research is the analysis of light curves emitted by meteors as they pass through the Earth's atmosphere, including aspects such as luminosity, duration, and shape. By extracting features from these light curves and comparing them to established meteor orbits, we seek valuable information about a meteor's origin and chemical composition. A significant contribution of this work is the development of methods for classifying meteors by extracting features from the light curve shape using unsupervised classification algorithms. This approach allows for the automatic classification of meteors into various groups based on their properties. Data for the research is collected by a three-camera setup at the Armagh Observatory, comprising one medium-angle camera and two wide-angle cameras. This setup enables the capture of detailed images of meteor light curves, as well as various other observations such as coordinate and angular data. The research also involves the use of machine learning algorithms for data reduction and classification tasks. By applying these techniques to the data collected from the camera setup, we facilitate the identification of parent objects based on chemical composition and meteor path, along with the acquisition of other valuable information about the meteors.

How to cite: Mandl, L., Christou, A., and Windisch, A.: A machine learning approach to meteor light curve analysis, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-21463, 2024.

EGU24-566 | ECS | Orals | HS3.4

A Transformer-Based Data-Driven Model for Real-Time Spatio-Temporal Flood Prediction 

Matteo Pianforini, Susanna Dazzi, Andrea Pilzer, and Renato Vacondio

Among the non-structural strategies for mitigating the huge economic losses and casualties caused by floods, the implementation of early-warning systems based on real-time forecasting of flood maps is one of the most effective. The high computational cost associated with two-dimensional (2D) hydrodynamic models, however, prevents their practical application in this context. To overcome this drawback, “data-driven” models are gaining considerable popularity due to their high computational efficiency for predictions. In this work, we introduce a novel surrogate model based on the Transformer architecture, named FloodSformer (FS), that efficiently predicts the temporal evolution of inundation maps, with the aim of providing real-time flood forecasts. The FS model combines an encoder-decoder (2D Convolutional Neural Network) with a Transformer block that handles temporal information. This architecture extracts the spatiotemporal information from a sequence of consecutive water depth maps and predicts the water depth map at one subsequent instant. An autoregressive procedure, based on the trained surrogate model, is employed to forecast tens of future maps.
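The autoregressive procedure described above can be sketched as a simple rollout loop. Here `predict_next` is a hypothetical stand-in for the trained FloodSformer surrogate, and the map sizes and context length are illustrative:

```python
import numpy as np

rng = np.random.default_rng(2)
H, W, context = 32, 32, 4      # map size and input sequence length (illustrative)

# Hypothetical stand-in for the trained surrogate: predicts the next water
# depth map from the last `context` maps (here, a damped weighted average).
def predict_next(maps):
    weights = np.array([0.1, 0.2, 0.3, 0.4])[:, None, None]
    return 0.95 * (weights * maps).sum(axis=0)

# Seed the rollout with `context` observed maps, then forecast 90 steps ahead,
# feeding each prediction back in as input (the autoregressive procedure).
maps = list(rng.random((context, H, W)))
forecast = []
for _ in range(90):
    nxt = predict_next(np.stack(maps[-context:]))
    forecast.append(nxt)
    maps.append(nxt)

forecast = np.stack(forecast)
```

The 90-step rollout mirrors the 3-hour horizon in the abstract (maps sampled at 2-minute intervals).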

As a case study, we investigated the hypothetical inundation due to the collapse of the flood-control dam on the Parma River (Italy). Due to the absence of real inundation maps, the training/testing dataset for the FS model was generated from numerical simulations performed through a 2D shallow‐water code (PARFLOOD). Results show that the FS model is able to recursively forecast the next 90 water depth maps (corresponding to 3 hours for this case study, in which maps are sampled at 2-minute intervals) in less than 1 minute. This is achieved while maintaining an accuracy deemed entirely acceptable for real-time applications: the average Root Mean Square Error (RMSE) is about 10 cm, and the differences between ground-truth and predicted maps are generally lower than 25 cm in the floodable area for the first 60 predicted frames. In conclusion, the short computational time and the good accuracy ensured by the autoregressive procedure make the FS model suitable for early-warning systems.

How to cite: Pianforini, M., Dazzi, S., Pilzer, A., and Vacondio, R.: A Transformer-Based Data-Driven Model for Real-Time Spatio-Temporal Flood Prediction, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-566, 2024.

EGU24-571 | ECS | Posters on site | HS3.4

Hydrological Significance of Input Sequence Lengths in LSTM-Based Streamflow Prediction 

Farzad Hosseini Hossein Abadi, Cristina Prieto Sierra, Grey Nearing, Cesar Alvarez Diaz, and Martin Gauch


Hydrological modeling of flashy catchments, susceptible to floods, represents a significant practical challenge. Recent applications of deep learning, specifically Long Short-Term Memory networks (LSTMs), have demonstrated notable capability in delivering accurate hydrological predictions at daily and hourly time intervals (Gauch et al., 2021; Kratzert et al., 2018).

In this study, we leverage a multi-timescale LSTM (MTS-LSTM (Gauch et al., 2021)) model to predict hydrographs in flashy catchments at hourly time scales. Our primary focus is to investigate the influence of model hyperparameters on the performance of regional streamflow models. We present methodological advancements using a practical application to predict streamflow in 40 catchments within the Basque Country (North of Spain).

Our findings show that 1) hourly and daily streamflow predictions exhibit high accuracy, with the Nash-Sutcliffe Efficiency (NSE) reaching values as high as 0.941 and 0.966 for daily and hourly data, respectively; and 2) hyperparameters associated with the length of the input sequence exert a substantial influence on the performance of a regional model. Following systematic hyperparameter tuning, consistently optimal regional values were identified as 3 years for daily data and 12 weeks for hourly data. Principal component analysis (PCA) shows that the first principal component (PC1) explains 12.36% of the variance among the 12 hyperparameters. Within this set of hyperparameters, the input sequence length for hourly data exhibits the highest loading on PC1 (−0.523); the loading of the input sequence length for daily data is also high (−0.36). This suggests that these hyperparameters strongly contribute to the model performance.
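The PCA step (variance explained by the first component and the loadings of individual hyperparameters) can be reproduced in outline with an SVD on a hyperparameter matrix. The matrix below is synthetic, not the study's data:

```python
import numpy as np

rng = np.random.default_rng(3)

# Synthetic stand-in: 40 catchments x 12 hyperparameter values (illustrative).
X = rng.standard_normal((40, 12))
X[:, 0] = 2.0 * X[:, 1] + 0.1 * X[:, 0]    # make two columns covary

# PCA via SVD of the centered matrix.
Xc = X - X.mean(axis=0)
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
explained = s**2 / (s**2).sum()    # fraction of variance per principal component
loadings = Vt[0]                   # loading of each hyperparameter on PC1
```

A large-magnitude entry in `loadings` (such as the −0.523 reported for the hourly sequence length) marks a hyperparameter that dominates the leading mode of variation.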

Furthermore, when utilizing a catchment-scale magnifier to determine optimal hyperparameter settings for each catchment, distinctive sequence lengths emerge for individual basins. This underscores the necessity of customizing input sequence lengths based on the “uniqueness of the place” (Beven, 2020), suggesting that each catchment may demand specific hydrologically meaningful daily and hourly input sequence lengths tailored to its unique characteristics. In essence, the true input sequence length of a catchment may encapsulate hydrological information pertaining to water transit over short and long-term periods within the basin. Notably, the regional daily sequence length aligns with the highest local daily sequence values across all catchments.

In summary, our investigation stresses the critical role of the input sequence length as a hyperparameter in LSTM networks. More broadly, this work is a step towards a better understanding and achieving accurate hourly predictions using deep learning models.



Keywords: hydrological modeling; streamflow prediction; LSTM networks; hyperparameter configurations; input sequence lengths



Beven, K. (2020). Deep learning, hydrological processes and the uniqueness of place. Hydrological Processes, 34(16), 3608–3613. doi:10.1002/hyp.13805

Gauch, M., Kratzert, F., Klotz, D., Nearing, G., Lin, J., and Hochreiter, S. (2021). Rainfall–runoff prediction at multiple timescales with a single Long Short-Term Memory network. Hydrology and Earth System Sciences, 25, 2045–2062. doi:10.5194/hess-25-2045-2021

Kratzert, F., Klotz, D., Brenner, C., Schulz, K., and Herrnegger, M. (2018). Rainfall–runoff modelling using Long Short-Term Memory (LSTM) networks. Hydrology and Earth System Sciences, 22(11), 6005–6022. doi:10.5194/hess-22-6005-2018


How to cite: Hosseini Hossein Abadi, F., Prieto Sierra, C., Nearing, G., Alvarez Diaz, C., and Gauch, M.: Hydrological Significance of Input Sequence Lengths in LSTM-Based Streamflow Prediction, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-571, 2024.

EGU24-811 | ECS | Orals | HS3.4

A deep learning approach for spatio-temporal prediction of stable water isotopes in soil moisture 

Hyekyeng Jung, Chris Soulsby, and Dörthe Tetzlaff

Water flows and related mixing dynamics in the unsaturated zone are difficult to measure directly, so stable water isotope tracers have been used successfully to quantify flux and storage dynamics and to constrain process-based hydrological models as proxy data. In this study, a data-driven model based on deep learning was adapted to interpolate and extrapolate spatio-temporal isotope signals of δ18O and δ2H in soil water in three dimensions. This was also used to help quantify evapotranspiration and groundwater recharge processes in the unsaturated zone. To account for both spatial and temporal dependencies of the water isotope signals in the model design, the output space was decomposed into temporal basis functions and spatial coefficients using singular value decomposition. The temporal functions and spatial coefficients were then predicted separately by deep learning models specialized for the interdependencies in the target data, such as an LSTM model and a convolutional neural network. Finally, the predictions of the models were integrated and analyzed post hoc using XAI tools.
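The decomposition described above can be sketched with a plain SVD: the output space (time × space) splits into temporal basis functions and spatial coefficients, each of which could then be modelled by a separate network. The field below is synthetic and low-rank by construction:

```python
import numpy as np

rng = np.random.default_rng(4)
T, S = 200, 50    # time steps x spatial locations (synthetic)

# Build a rank-3 spatio-temporal field: sum of temporal patterns x spatial maps.
field = sum(np.outer(np.sin((k + 1) * np.linspace(0, 6, T)),
                     rng.standard_normal(S)) for k in range(3))

# Decompose: columns of U*s are temporal basis functions, rows of Vt are
# spatial coefficients; each part can then be predicted separately.
U, s, Vt = np.linalg.svd(field, full_matrices=False)
rank = 3
temporal_basis = U[:, :rank] * s[:rank]    # (T, rank)
spatial_coeffs = Vt[:rank]                 # (rank, S)

reconstruction = temporal_basis @ spatial_coeffs
err = np.abs(reconstruction - field).max()
```

In the study the temporal part would go to an LSTM and the spatial part to a CNN; here the split itself is the point.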

Such an integrated framework has the potential to improve understanding of model behavior based on features (e.g., climate, hydrological component) connected to either temporal or spatial information. Furthermore, the model can serve as a surrogate model for process-based hydrological models, improving the use of process-based models as learning tools.

How to cite: Jung, H., Soulsby, C., and Tetzlaff, D.: A deep learning approach for spatio-temporal prediction of stable water isotopes in soil moisture, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-811, 2024.

EGU24-2872 | ECS | Posters on site | HS3.4

Runoff coefficient modelling using Long Short-Term Memory (LSTM) in the Rur catchment, Germany 

Arash Rahi, Mehdi Rahmati, Jacopo Dari, and Renato Morbidelli

This research examines the effectiveness of Long Short-Term Memory (LSTM) models in predicting the runoff coefficient (Rc) within the Rur basin at the Stah outlet (Germany) during the period from 1961 to 2021; monthly data of temperature (T), precipitation (P), soil water storage (SWS), and total evaporation (ETA) are used as inputs. Because noise makes the undecomposed Rc time series difficult to predict, a novel approach is proposed that uses the discrete wavelet transform (DWT) to decompose the original Rc at five levels.

The investigation identifies overfitting challenges at level 1, gradually mitigated in subsequent decomposition levels, particularly at level 2, while the other levels remain well tuned. Reconstructing Rc from the modelled decomposition coefficients yielded Nash-Sutcliffe efficiency (NSE) values of 0.88, 0.79, and 0.74 for the training, validation, and test sets, respectively. Comparative analysis shows that modelling the undecomposed Rc with LSTM yields lower accuracy, emphasizing the pivotal role of decomposition techniques in tandem with LSTM for enhanced model performance.
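The decompose-model-reconstruct idea can be illustrated with a one-level Haar DWT, a minimal stand-in for the five-level DWT used in the study (a library such as PyWavelets would be used in practice). The Rc series below is synthetic:

```python
import numpy as np

# One-level Haar DWT: split a series into low- and high-frequency bands.
def haar_dwt(x):
    x = np.asarray(x, dtype=float)
    approx = (x[0::2] + x[1::2]) / np.sqrt(2)   # low-frequency content
    detail = (x[0::2] - x[1::2]) / np.sqrt(2)   # high-frequency (noise-like) content
    return approx, detail

# Inverse transform: recombine the bands into the original series.
def haar_idwt(approx, detail):
    x = np.empty(2 * approx.size)
    x[0::2] = (approx + detail) / np.sqrt(2)
    x[1::2] = (approx - detail) / np.sqrt(2)
    return x

# Decompose a synthetic monthly Rc series; in the study each band is modelled
# by an LSTM, and the modelled coefficients are recombined to reconstruct Rc.
rng = np.random.default_rng(5)
rc = np.sin(np.linspace(0, 12 * np.pi, 128)) + 0.1 * rng.standard_normal(128)
approx, detail = haar_dwt(rc)
rc_back = haar_idwt(approx, detail)
```

Perfect reconstruction of the untouched coefficients is what makes the per-band modelling strategy well defined.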

This study provides novel insights into addressing challenges related to noise effects and temporal dependencies in Rc modelling; through a comprehensive analysis of the interplay between atmospheric conditions and observed data, the research contributes to advancing predictive modelling in hydrology.

How to cite: Rahi, A., Rahmati, M., Dari, J., and Morbidelli, R.: Runoff coefficient modelling using Long Short-Term Memory (LSTM) in the Rur catchment, Germany, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-2872, 2024.

EGU24-2939 | ECS | Orals | HS3.4

Probabilistic streamflow forecasting using generative deep learning models 

Mohammad Sina Jahangir and John Quilty

The significance of probabilistic hydrological forecasting has grown in recent years, offering crucial insights for risk-based decision-making and effective flood management. This study explores generative deep learning models, specifically the conditional variational autoencoder (CVAE), for probabilistic streamflow forecasting. This approach is applied to forecast streamflow one to seven days ahead in 75 Canadian basins included in the open-source Canadian model parameter experiment (CANOPEX) database. CVAE is compared against two benchmark quantile-based deep learning models: the quantile-based encoder-decoder (ED) and the quantile-based CVAE (QCVAE).

Over 9000 deep learning models are developed based on different input variables, basin characteristics, and model structures and evaluated regarding point forecast accuracy and forecast reliability. Results highlight CVAE's superior reliability, showing a median reliability of 92.49% compared to 87.35% for ED and 84.59% for QCVAE (considering a desired 90% confidence level). However, quantile-based forecast models exhibit marginally better point forecasts, as evidenced by Kling-Gupta efficiency (KGE), with a median KGE of 0.90 for ED and QCVAE (compared to 0.88 for CVAE). Notably, the CVAE model provides reliable probabilistic forecasts in basins with low point forecast accuracy.

The developed generative deep learning models can be used as a benchmark for probabilistic streamflow forecasting, given the use of the open-source CANOPEX dataset. Overall, the results of this study contribute to the expanding field of generative deep learning models in hydrological forecasting, offering a general framework that also applies to forecasting other hydrological variables (e.g., precipitation and soil moisture).
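The reliability measure used above (how often the central 90% forecast interval covers the observation) can be sketched on a synthetic ensemble forecast; the Gaussian predictive distribution is an assumption for illustration only:

```python
import numpy as np

rng = np.random.default_rng(6)
n_times, n_samples = 2000, 500

# Synthetic probabilistic forecast: the model predicts a mean per time step,
# observations are drawn from the predictive distribution, and the generative
# model supplies an ensemble of samples (as a CVAE would).
mean = rng.standard_normal(n_times)
obs = mean + rng.standard_normal(n_times)
ensemble = mean[:, None] + rng.standard_normal((n_times, n_samples))

# Reliability at the 90% level: coverage of the central 90% interval.
lo = np.quantile(ensemble, 0.05, axis=1)
hi = np.quantile(ensemble, 0.95, axis=1)
coverage = np.mean((obs >= lo) & (obs <= hi))
```

For a well-calibrated forecast the coverage should sit near the nominal 90%; the median reliabilities quoted in the abstract are this quantity computed per basin.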

How to cite: Jahangir, M. S. and Quilty, J.: Probabilistic streamflow forecasting using generative deep learning models, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-2939, 2024.

EGU24-6432 | ECS | Orals | HS3.4

A Step Towards Global Hydrologic Modelling: Accurate Streamflow Predictions In Pseudo-Ungauged Basins of Japan 

Hemant Servia, Frauke Albrecht, Samuel Saxe, Nicolas Bierti, Masatoshi Kawasaki, and Shun Kurihara

In addressing the challenge of streamflow prediction in ungauged basins, this study leveraged deep learning (DL) models, especially long short-term memory (LSTM) networks, to predict streamflow for pseudo-ungauged basins in Japan. The motivation stems from the recognized limitations of traditional hydrological models in transferring their performance beyond the calibrated basins. Recent research suggests that DL models, especially those trained on multiple catchments, demonstrate improved predictive capabilities utilizing the concept of streamflow regionalization. However, the majority of these studies were confined to geographic regions within the United States.

For this study, a total of 211 catchments were delineated and investigated, distributed across all four primary islands of Japan (Kyushu - 32, Shikoku - 13, Honshu - 127, and Hokkaido - 39), encompassing a comprehensive sample of hydrological systems within the region. The catchments were delineated around streamflow observation points, and their combined area represented more than 43% of Japan's total land area after accounting for overlaps. After cleaning and refining the streamflow dataset, the remaining 198 catchments were divided into training (~70%), validation (~20%), and holdout test (~10%) sets. A combination of dynamic (time-varying) and static (constant) variables was obtained on a daily basis, matching the daily streamflow data, and provided to the models as input features. However, the final model accorded higher significance to dynamic features than to static ones. Although the models were trained on daily time steps, the results were aggregated to a monthly timescale. The main evaluation metrics included the Nash-Sutcliffe Efficiency (NSE) and Pearson's correlation coefficient (r). The final model achieved a median NSE of 0.96, 0.83, and 0.78, and a median correlation of 0.98, 0.92, and 0.91 for the training, validation, and test catchments, respectively. For the validation catchments, 90% exhibited NSE values greater than 0.50, and 97% demonstrated a correlation surpassing 0.70. The corresponding proportions were 77% and 91% for the test catchments.

The results presented in this study demonstrate the feasibility and efficacy of developing a data-driven model for streamflow prediction in ungauged basins utilizing streamflow regionalization. The final model exhibits commendable performance, as evidenced by high NSE and correlation coefficients across the majority of the catchments. Importantly, the model's ability to generalize to unseen data is highlighted by its remarkable performance on the holdout test set, with only a few instances of lower NSE values (< 0.50) and correlation coefficients (< 0.70).
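The evaluation metrics used above (NSE and Pearson's r) can be written down directly; the streamflow series below is synthetic:

```python
import numpy as np

def nse(obs, sim):
    """Nash-Sutcliffe Efficiency: 1 is perfect; 0 matches the mean of obs."""
    obs, sim = np.asarray(obs, float), np.asarray(sim, float)
    return 1.0 - np.sum((obs - sim) ** 2) / np.sum((obs - obs.mean()) ** 2)

def pearson_r(obs, sim):
    """Pearson's linear correlation coefficient."""
    return np.corrcoef(obs, sim)[0, 1]

rng = np.random.default_rng(7)
obs = rng.gamma(2.0, 5.0, size=365)        # synthetic daily streamflow
sim = obs + rng.normal(0, 1.0, size=365)   # imperfect simulation

score_nse = nse(obs, sim)
score_r = pearson_r(obs, sim)
```

Thresholds such as NSE > 0.50 and r > 0.70, as used in the abstract, are common minimum-skill cutoffs for streamflow models.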

How to cite: Servia, H., Albrecht, F., Saxe, S., Bierti, N., Kawasaki, M., and Kurihara, S.: A Step Towards Global Hydrologic Modelling: Accurate Streamflow Predictions In Pseudo-Ungauged Basins of Japan, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-6432, 2024.

EGU24-6846 | ECS | Orals | HS3.4

Towards Fully Distributed Rainfall-Runoff Modelling with Graph Neural Networks 

Peter Nelemans, Roberto Bentivoglio, Joost Buitink, Ali Meshgi, Markus Hrachowitz, Ruben Dahm, and Riccardo Taormina

Fully distributed hydrological models account for the spatial variability of a catchment, allowing for a more accurate representation of its heterogeneity and an assessment of its hydrological response at multiple locations. However, physics-based fully distributed models can be time-consuming to run and calibrate, especially for large-scale catchments. On the other hand, deep learning models have shown great potential in the field of hydrological modelling, outperforming lumped conceptual rainfall-runoff models and improving prediction in ungauged basins via catchment transferability. Despite these advances, the field still lacks a multivariable, fully distributed hydrological deep learning model capable of generalizing to unseen catchments. To address the aforementioned challenges associated with physics-based distributed models and deep learning models, we explore the possibility of developing a fully distributed deep learning model using Graph Neural Networks (GNNs), an extension of deep learning methods to non-Euclidean topologies, including graphs and meshes.

We develop a surrogate model of wflow_sbm, a fully distributed, physics-based hydrological model, by exploiting the similarities between its underlying functioning and GNNs. The GNN uses the same input as wflow_sbm: distributed static parameters based on physical characteristics of the catchment and gridded dynamic forcings. The GNN is trained to produce the same output as wflow_sbm, predicting multiple gridded variables related to rainfall-runoff, such as streamflow, actual evapotranspiration, subsurface flow, saturated and unsaturated groundwater storage, snow storage, and runoff. We show that our GNN model achieves high performance in unseen catchments, indicating that GNNs are a viable option for fully distributed multivariable hydrological models capable of generalizing to unseen regions. Furthermore, the GNN model achieves a significant computational speedup compared to wflow_sbm. We will continue this research, using the GNN-based surrogate models as pre-trained backbones to be fine-tuned with measured data, ensuring accurate model adaptation, and enhancing their practical applicability in diverse hydrological scenarios.
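One message-passing round of a GNN over a gridded catchment can be sketched as neighbour aggregation plus a learned mixing step. The weight matrices below are random stand-ins for trained parameters, and the zero-padded edges are a simplification:

```python
import numpy as np

H, W, F = 8, 8, 3    # grid size and feature channels (illustrative)

rng = np.random.default_rng(8)
x = rng.standard_normal((H, W, F))    # node features: one vector per grid cell

# Aggregate each cell's 4-neighbour features (mean); zero-padding at the
# boundary is a simplification of proper edge handling.
padded = np.pad(x, ((1, 1), (1, 1), (0, 0)))
neigh = (padded[:-2, 1:-1] + padded[2:, 1:-1] +
         padded[1:-1, :-2] + padded[1:-1, 2:]) / 4.0

# Mix own state and aggregated messages through (stand-in) learned weights.
W_self = rng.standard_normal((F, F)) * 0.1
W_neigh = rng.standard_normal((F, F)) * 0.1
x_next = np.tanh(x @ W_self + neigh @ W_neigh)
```

Stacking such rounds lets information propagate across the grid, which is the structural analogy to lateral flow routing that a GNN surrogate of wflow_sbm exploits.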

How to cite: Nelemans, P., Bentivoglio, R., Buitink, J., Meshgi, A., Hrachowitz, M., Dahm, R., and Taormina, R.: Towards Fully Distributed Rainfall-Runoff Modelling with Graph Neural Networks, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-6846, 2024.

EGU24-7186 | HS3.4

Exploring How Machines Model Water Flow: Predicting Small-Scale Watershed Behavior in a Distributed Setting 

D. Kim

This research created a deep neural network (DNN)-based hydrologic model for an urban watershed in South Korea using multiple LSTM (long short-term memory) units and a fully connected layer. The model utilized 10-minute intervals of radar-gauge composite precipitation and temperature data across 239 grid cells, each 1 km in resolution, to simulate watershed flow discharge every 10 minutes. It showed high accuracy during both the calibration (2013–2016) and validation (2017–2019) periods, with Nash–Sutcliffe efficiency coefficient values of 0.99 and 0.67, respectively. Key findings include: 1) the DNN model's runoff–precipitation ratio map closely matched the imperviousness ratio map from land cover data, demonstrating the model's ability to learn precipitation partitioning without prior hydrological information; 2) it effectively mimicked soil moisture-dependent runoff processes, crucial for continuous hydrologic models; and 3) the LSTM units displayed varying temporal responses to precipitation, with units near the watershed outlet responding faster, indicating the model's capability to differentiate between hydrological components like direct runoff and groundwater-driven baseflow.

How to cite: Kim, D.: Exploring How Machines Model Water Flow: Predicting Small-Scale Watershed Behavior in a Distributed Setting, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-7186, 2024.

EGU24-8102 | ECS | Orals | HS3.4

Improving the Generalizability of Urban Pluvial Flood Emulators by Contextualizing High-Resolution Patches 

Tabea Cache, Milton S. Gomez, Jovan Blagojević, Tom Beucler, João P. Leitão, and Nadav Peleg

Predicting future flood hazards in a changing climate requires adopting a stochastic framework due to the multiple sources of uncertainty (e.g., climate change scenarios, climate models, or natural variability). This requires performing multiple flood inundation simulations, which are computationally costly. Data-driven models can help overcome this issue, as they can emulate urban flood maps considerably faster than traditional flood simulation models. However, their lack of generalizability to both terrain and rainfall events still limits their application. Additionally, these models face the challenge of not having sufficient training data. This led state-of-the-art models to adopt a patch-based framework, in which the study area is first divided into local patches (i.e., broken into smaller terrain images) that are subsequently merged to reconstruct the prediction for the whole study area. The main drawback of this method is that the model is blind to the surroundings of the local patch. To overcome this bottleneck, we developed a new deep learning model that includes patches' contextual information while keeping high-resolution information of the local patch. We trained and tested the model in the city of Zurich at a spatial resolution of 1 m. The evaluation focused on 1-hour rainfall events at 5-minute temporal resolution, encompassing extreme precipitation return periods from 2 to 100 years. The results show that the proposed CNN-attention model outperforms the state-of-the-art patch-based urban flood emulator. First, our model can faithfully represent flood depths for a wide range of extreme rainfall events (peak rainfall intensities ranging from 42.5 mm/h to 161.4 mm/h). Second, the model's terrain generalizability was assessed in distinct urban settings, namely Luzern and Singapore. Our model accurately identifies water accumulation locations, which constitutes an improvement over current models. 
Using transfer learning, the model was successfully retrained in the new cities, requiring only a single rainfall event to adapt the model to new terrains while preserving adaptability across diverse rainfall conditions. Our results suggest that by integrating contextual terrain information with local terrain patches, our proposed model effectively generates high-resolution urban pluvial flood maps, demonstrating applicability across varied terrains and rainfall events.
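The central design idea, a high-resolution local branch fused with a coarsened context branch, can be illustrated with a minimal NumPy sketch. The tile sizes, pooling factors, and random "terrain" below are illustrative stand-ins, and simple block-averaging replaces the learned CNN-attention encoders:

```python
import numpy as np

rng = np.random.default_rng(0)

def avg_pool(x, k):
    """Downsample a square 2-D array by block-averaging k x k blocks."""
    h, w = x.shape
    return x[: h - h % k, : w - w % k].reshape(h // k, k, w // k, k).mean(axis=(1, 3))

# Hypothetical 1 m DEM tile: a 256 m context window around a 64 m local patch.
context = rng.normal(size=(256, 256))          # coarse surroundings
local = context[96:160, 96:160].copy()         # high-resolution local patch

# Contextual branch: aggressive pooling keeps only large-scale terrain structure.
context_feat = avg_pool(context, 32)           # -> (8, 8)
# Local branch: light pooling preserves fine detail.
local_feat = avg_pool(local, 8)                # -> (8, 8)

# Fuse both branches into one feature stack for the downstream emulator head.
fused = np.stack([local_feat, context_feat])   # -> (2, 8, 8)
print(fused.shape)
```

In the actual model, both branches would be learned convolutional encoders feeding an attention-based head; the sketch only shows how the context branch supplies information from beyond the local patch boundary at low cost.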

How to cite: Cache, T., Gomez, M. S., Blagojević, J., Beucler, T., Leitão, J. P., and Peleg, N.: Improving the Generalizability of Urban Pluvial Flood Emulators by Contextualizing High-Resolution Patches, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-8102, 2024.

EGU24-9190 | ECS | Orals | HS3.4

Learning Catchment Features with Autoencoders 

Alberto Bassi, Antonietta Mira, Marvin Höge, Fabrizio Fenicia, and Carlo Albert

By employing Machine Learning techniques on the US-CAMELS dataset, we discern a minimal number of streamflow features. Together with meteorological forcing, these features enable an approximate reconstruction of the entire streamflow time-series. This task is achieved through the application of an explicit noise conditional autoencoder, wherein the meteorological forcing is fed to the decoder to encourage the encoder to learn streamflow features exclusively related to landscape properties. The optimal number of encoded features is determined with an intrinsic dimension estimator. The reconstruction accuracy is then compared with that of models that take a subset of static catchment attributes (both climate and landscape attributes) in addition to meteorological forcing variables. Our findings suggest that attributes gathered by experts encompass nearly all pertinent information regarding the input/output relationship. This information can be succinctly summarized with merely three independent streamflow features. These features exhibit a strong correlation with the baseflow index and aridity indicators, aligning with the observation that predicting streamflow in dry catchments or with a high baseflow index is more challenging. Furthermore, correlation analysis underscores the significance of soil-related and vegetation attributes. These learned features can also be associated with parameters in conceptual hydrological models such as the GR model family.
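The conditioning structure can be sketched in a few lines of NumPy. Random weights stand in for the trained network and the array sizes are illustrative: the encoder sees only the streamflow series, while the decoder receives the meteorological forcing alongside the encoded features, so the weather signal need not pass through the bottleneck.

```python
import numpy as np

rng = np.random.default_rng(1)
T, n_feat = 365, 3                      # one year of daily data, 3 latent features

streamflow = rng.random(T)              # hypothetical streamflow series
forcing = rng.random((T, 2))            # e.g. precipitation and temperature

# Encoder: compress the whole streamflow series into a few static features
# (stand-in for the trained encoder; weights here are random, not fitted).
W_enc = rng.normal(size=(T, n_feat))
features = np.tanh(streamflow @ W_enc)                     # -> (3,)

# Decoder: reconstruct streamflow at each time step from the forcing at that
# step plus the static features, so the features only need to carry
# landscape-like information, not the weather signal itself.
dec_in = np.hstack([forcing, np.tile(features, (T, 1))])   # -> (365, 5)
W_dec = rng.normal(size=(dec_in.shape[1], 1))
reconstruction = (dec_in @ W_dec).ravel()                  # -> (365,)
print(features.shape, reconstruction.shape)
```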

How to cite: Bassi, A., Mira, A., Höge, M., Fenicia, F., and Albert, C.: Learning Catchment Features with Autoencoders, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-9190, 2024.

EGU24-9446 | ECS | Orals | HS3.4

Skilful prediction of mid-term sea surface temperature using 3D self-attention-based neural network 

Longhao Wang, Yongqiang Zhang, and Xuanze Zhang

Sea surface temperature (SST) is a critical parameter in the global ocean-atmospheric system, exerting a substantial impact on climate change and extreme weather events like droughts and floods. The precise forecasting of future SSTs is thus vital for identifying such weather anomalies. Here we present a novel three-dimensional (3D) neural network model based on self-attention mechanisms and Swin-Transformer for mid-term SST predictions. This model, integrating both climatic and temporal features, employs self-attention to proficiently capture the temporal dynamics and global patterns in SST. This approach significantly enhances the model's capability to detect and analyze spatiotemporal changes, offering a more nuanced understanding of SST variations. Trained on 59 years of global monthly ERA5-Land reanalysis data, our model demonstrates strong deterministic forecast capabilities in the test period. It employs a convolution strategy and global attention mechanism, resulting in faster and more accurate training compared to traditional methods, such as Convolutional Neural Network with Long short-term memory (CNN-LSTM). The effectiveness of this SST prediction model highlights its potential for extensive multidimensional modelling applications in geosciences.
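The self-attention mechanism at the heart of such models can be sketched compactly. This is a generic single-head scaled dot-product attention over a toy token sequence, not the 3D Swin-Transformer architecture itself; the token sizes and random weights are illustrative:

```python
import numpy as np

def self_attention(x, Wq, Wk, Wv):
    """Scaled dot-product self-attention over a sequence of tokens."""
    q, k, v = x @ Wq, x @ Wk, x @ Wv
    scores = q @ k.T / np.sqrt(k.shape[-1])
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)   # softmax over keys
    return weights @ v

rng = np.random.default_rng(2)
# Hypothetical input: 12 monthly SST fields, each flattened to a 16-value patch token.
tokens = rng.normal(size=(12, 16))
d = 8
out = self_attention(tokens, *(rng.normal(size=(16, d)) for _ in range(3)))
print(out.shape)   # every month's output attends to every other month
```

This global weighting is what lets attention-based models capture long-range temporal dependencies that a fixed-size convolution kernel cannot.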

How to cite: Wang, L., Zhang, Y., and Zhang, X.: Skilful prediction of mid-term sea surface temperature using 3D self-attention-based neural network, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-9446, 2024.

Traditional hydrological models have long served as the standard for predicting streamflow across temporal and spatial domains. However, a persistent challenge in modelling lies in mitigating bias inherent in streamflow estimation due to both random and systematic errors in the employed model. Removal of this bias is pivotal for effective water resources management and resilience against extreme events, especially amidst evolving climate conditions. An innovative solution to address this challenge involves the integration of hydrological models with deep learning methods, known as hybridisation. Long Short-Term Memory networks (LSTMs) have emerged as a promising and efficient approach to enhancing streamflow estimation. This study focuses on coupling an LSTM with a physically distributed model, Wflow_sbm, to serve as a post-processor aimed at reducing modelling errors. The coupled Wflow_sbm-LSTM model was applied to the Boyne catchment in Ireland, utilising a dataset spanning two decades, divided into training, validation, and testing sets to ensure robust model evaluation. Predictive performance was rigorously assessed using metrics like Modified Kling-Gupta Efficiency (MKGE) and Nash-Sutcliffe Efficiency (NSE), with observed streamflow discharges as the target variable. Results demonstrated that the coupled model outperformed the best-calibrated Wflow_sbm model in the study catchment based on the performance measures. The enhanced prediction of extreme events by the coupled Wflow_sbm-LSTM model strengthens the case for its integration into an operational river flow forecasting framework. Significantly, Wflow is endorsed by the National Flood Forecast Warning Service (NFFWS) in Ireland as a recommended model for streamflow simulations, specifically designed for fluvial flood forecasting. Consequently, our proposed Wflow_sbm-LSTM coupled model presents a compelling opportunity for integration into the NFFWS. 
With demonstrated potential to achieve precise streamflow estimations, this integration holds promise for significantly enhancing the accuracy and effectiveness of flood predictions in Ireland.
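The post-processing idea can be sketched with synthetic data. Here an ordinary least-squares correction stands in for the LSTM, and the bias magnitudes are invented for illustration; the point is only that a model trained on (simulation, observation) pairs can remove a systematic error that the physics-based model leaves behind:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 500

# Hypothetical data: observed discharge and a biased physics-model simulation.
observed = 10 + 5 * np.sin(np.linspace(0, 20, n)) + rng.normal(0, 0.5, n)
simulated = 0.8 * observed + 2 + rng.normal(0, 0.5, n)   # systematic bias + noise

# Post-processor: learn a correction from the simulation to the observation.
# (A linear least-squares fit stands in for the LSTM for brevity.)
X = np.column_stack([simulated, np.ones(n)])
coef, *_ = np.linalg.lstsq(X, observed, rcond=None)
corrected = X @ coef

def nse(obs, sim):
    """Nash-Sutcliffe efficiency: 1 is perfect, 0 matches the mean of obs."""
    return 1 - np.sum((obs - sim) ** 2) / np.sum((obs - np.mean(obs)) ** 2)

print(f"NSE raw: {nse(observed, simulated):.2f}, corrected: {nse(observed, corrected):.2f}")
```

Replacing the linear map with an LSTM lets the correction depend on the recent hydrograph rather than only on the current simulated value, which is what makes the hybrid useful for extreme events.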

How to cite: Mohammed, S. and Nasr, A.: Advancing Streamflow Modelling: Bias Removal in Physically-Based Models with the Long Short-Term Memory Networks (LSTM) Algorithm, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-9573, 2024.

EGU24-10506 | Posters on site | HS3.4

Enhancing Hydrological Predictions: Feature-Driven Streamflow Forecasting with Sparse Autoencoder-based Long Short-Term Memory Networks 

Neha Vinod, Arathy Nair Geetha Raveendran, Adarsh Sankaran, and Anandu Kochukattil Ajith

In response to the critical demand for accurate streamflow predictions in hydrology, this study introduces a Sparse Autoencoder-based Long Short-Term Memory (SA-LSTM) framework applied to daily streamflow data from three gauge stations within the Greater Pamba River Basin of Kerala, India, which was the region worst affected by the devastating floods of 2018. The SA-LSTM model addresses the challenge of feature selection from an extensive set of climatic variables lagged by 1 to 7 days, such as precipitation and maximum and minimum temperatures, by incorporating a sparsity constraint. This constraint strategically guides the autoencoder to focus on the most influential features for the prediction analysis. The prediction process involves training the SA-LSTM model on historical streamflow data and climatic variables, allowing the model to learn intricate patterns and relationships. Furthermore, this study includes a comparative analysis featuring the Random Forest (RF)-LSTM model, where the RF model is employed for feature extraction and a separate LSTM model is used for streamflow prediction. While the RF-LSTM combination demonstrates competitive performance, the SA-LSTM model consistently outperforms it in predictive accuracy. Rigorous evaluation metrics, including the coefficient of determination (R²), Root Mean Square Error (RMSE), Mean Square Error (MSE), and Mean Absolute Error (MAE), highlight the SA-LSTM's forecasting accuracy across the three stations. Notably, the R² values surpass 0.85, RMSE values remain under 12 cubic meters per second (m³/s), MSE values are below 70 (m³/s)², and MAE values approach 8 m³/s. The detailed comparison between the above models underscores the superior capabilities of the SA-LSTM framework in capturing complex temporal patterns, emphasizing its potential for advancing hydrological modeling and flood risk management in flood-prone regions.


Keywords: Streamflow, LSTM, Sparse Autoencoder, Flood, Greater Pamba
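The sparsity constraint mentioned above can be sketched minimally: an L1 term added to the reconstruction loss pushes most hidden activations toward zero, so only a few influential features survive. The data and weights below are random stand-ins for the lagged climatic inputs and the fitted network:

```python
import numpy as np

rng = np.random.default_rng(4)
n_samples, n_inputs, n_hidden = 200, 21, 4   # e.g. 3 climate variables x 7 lags

X = rng.normal(size=(n_samples, n_inputs))   # hypothetical lagged predictors
W1 = rng.normal(scale=0.1, size=(n_inputs, n_hidden))
W2 = rng.normal(scale=0.1, size=(n_hidden, n_inputs))

def sae_loss(X, W1, W2, lam=0.01):
    """Reconstruction error plus an L1 sparsity penalty on the hidden code."""
    code = np.tanh(X @ W1)
    recon = code @ W2
    mse = np.mean((X - recon) ** 2)
    sparsity = lam * np.mean(np.abs(code))   # drives most activations toward 0
    return mse + sparsity, code

loss, code = sae_loss(X, W1, W2)
print(round(loss, 3), code.shape)
```

The compact `code` is then what a downstream LSTM would consume in place of the full lagged input set.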

How to cite: Vinod, N., Geetha Raveendran, A. N., Sankaran, A., and Kochukattil Ajith, A.: Enhancing Hydrological Predictions: Feature-Driven Streamflow Forecasting with Sparse Autoencoder-based Long Short-Term Memory Networks, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-10506, 2024.

EGU24-11506 | ECS | Posters on site | HS3.4

Forecasting reservoir inflows with Long Short-Term Memory models 

Laura Soncin, Claudia Bertini, Schalk Jan van Andel, Elena Ridolfi, Francesco Napolitano, Fabio Russo, and Celia Ramos Sánchez

The increased variability of water resources and the escalating water consumption contribute to the risk of stress and water scarcity in reservoirs that are typically designed based on historical conditions. Therefore, it is relevant to provide accurate forecasts of reservoir inflow to optimize sustainable water management as conditions change, especially during extreme events, such as flooding and drought. However, accurately forecasting the inflow is not straightforward, due to the uncertainty of the hydrological inputs and the strong non-linearity of the system. Numerous recent studies have employed approaches based on Machine Learning (ML) techniques, such as Artificial Neural Networks (ANN), Long Short-Term Memory (LSTM), and Random Forest (RF), with successful examples of providing skilful site-specific predictions. In particular, LSTMs have emerged among the pool of ML models for their performance in simulating rainfall-runoff processes, thanks to their ability to learn long-term dependencies from time series. 
Here we propose an LSTM-based approach for inflow prediction in the Barrios de Luna reservoir, located in the Spanish part of the Douro River Basin. The reservoir has a dual role, as its water is used for irrigation during dry summer periods, and its storage volume is used to mitigate floods. Therefore, in order to operate the reservoir in the short term, Barrios de Luna reservoir operators need accurate forecasts to support water management decisions in the daily and weekly time horizons. In our work, we explore the potential of an LSTM model to predict inflow in the reservoir at varying lead times, ranging from 1 day up to 4 weeks. Initially, we use as inputs past inflow, precipitation and temperature observations, and then we include meteorological forecasts of precipitation and temperature from ECMWF Extended Range. For the latter experiments, different configurations of the LSTM are tested, i.e. training the model with observations and forecasts together, or training it with observations only and fine-tuning it with forecasts.
Our preliminary results show that precipitation, temperature and inflow observations are all crucial inputs to the LSTM for predicting inflow, and meteorological forecast inputs seem to improve performance for the longer lead-times of one week up to a month.
Predictions developed will contribute to the Douro case study of the CLImate INTelligence (CLINT) H2020 project.
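The second configuration, pre-training on observations and briefly fine-tuning on noisier forecast inputs, can be sketched with a toy linear model in place of the LSTM. All data, coefficients, and noise levels here are synthetic and purely illustrative:

```python
import numpy as np

rng = np.random.default_rng(10)

# Hypothetical inputs: [past inflow, precipitation, temperature] -> next inflow.
def make_data(noise, n=300, rng=rng):
    X = rng.normal(size=(n, 3))
    y = X @ np.array([0.6, 0.3, -0.1]) + noise * rng.normal(size=n)
    return X, y

X_obs, y_obs = make_data(noise=0.05)    # observed forcings (accurate)
X_fc, y_fc = make_data(noise=0.3)       # forecast forcings (noisier)

def fit(X, y, w=None, lr=0.05, steps=200):
    """Plain gradient descent on the MSE; stands in for LSTM training."""
    w = np.zeros(X.shape[1]) if w is None else w.copy()
    for _ in range(steps):
        w -= lr * 2 * X.T @ (X @ w - y) / len(y)
    return w

w_obs = fit(X_obs, y_obs)                      # pre-train on observations only
w_tuned = fit(X_fc, y_fc, w=w_obs, steps=20)   # brief fine-tuning on forecasts
print(np.round(w_tuned, 2))
```

The fine-tuning stage starts from the pre-trained weights and takes only a few steps, so the model adapts to the forecast error characteristics without forgetting the dynamics learned from observations.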

How to cite: Soncin, L., Bertini, C., van Andel, S. J., Ridolfi, E., Napolitano, F., Russo, F., and Ramos Sánchez, C.: Forecasting reservoir inflows with Long Short-Term Memory models, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-11506, 2024.

EGU24-11768 | ECS | Posters on site | HS3.4

High-Efficiency Rainfall Data Compression Using Binarized Convolutional Autoencoder 

Manuel Traub, Fedor Scholz, Thomas Scholten, Christiane Zarfl, and Martin V. Butz

In the era of big data, managing and storing large-scale meteorological datasets is a critical challenge. We focus on high-resolution rainfall data, which is crucial to atmospheric sciences, climate research, and real-time weather forecasting. This study introduces a deep learning-based approach to compress the German Radar-Online-Aneichung (RADOLAN) rainfall dataset. We achieve a compression ratio of 200:1 while maintaining a minimal mean squared reconstruction error (MSE). Our method combines a convolutional autoencoder with a novel binarization mechanism to compress data from a resolution of 900x900 pixels at 32-bit depth to 180x180 pixels at 4-bit depth. Leveraging the ConvNeXt architecture (Zhuang Liu et al., 'A ConvNet for the 2020s'), our method learns a convolutional autoencoder for enhanced meteorological data compression. ConvNeXt introduces key architectural modifications, such as revised layer normalization and expanded receptive fields, taking inspiration from Vision Transformers to form a modern ConvNet. Our binarization mechanism, pivotal for achieving the high compression ratio, operates by dynamically quantizing the latent-space representations using a novel magnitude-specific noise injection technique. This quantization not only reduces the data size but also preserves crucial meteorological information, as our low reconstruction MSE demonstrates. Beyond rainfall data, our approach shows promise for other types of high-resolution meteorological datasets, such as temperature and humidity. Adapting our method to these modalities could further streamline data management processes in meteorological deep learning scenarios and thus facilitate efficient storage and processing of diverse meteorological datasets.
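The quantization step can be sketched as follows. Note that this sketch uses plain uniform noise of one quantization step as the differentiable training-time surrogate, a simplification of the magnitude-specific noise injection described in the abstract; the latent map is random rather than a learned representation:

```python
import numpy as np

rng = np.random.default_rng(5)

def quantize_4bit(z, train=True, rng=rng):
    """Map latents in [0, 1] onto 16 discrete levels (4 bit).

    During training, additive uniform noise of one quantization step acts as a
    differentiable surrogate for the hard rounding used at inference time."""
    step = 1.0 / 15                        # 16 levels span [0, 1] in 15 steps
    if train:
        return np.clip(z + rng.uniform(-step / 2, step / 2, z.shape), 0.0, 1.0)
    return np.round(z / step) * step       # hard 4-bit quantization

z = rng.random((180, 180))                 # hypothetical 180x180 latent map
z_soft = quantize_4bit(z)                  # training-time surrogate
z_hard = quantize_4bit(z, train=False)     # inference-time 4-bit code
print(len(np.unique(np.round(z_hard * 15))))  # at most 16 distinct levels
```

Because the noisy surrogate deviates from the input by at most half a quantization step, the network learns latents that are robust to the information loss of hard rounding.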

How to cite: Traub, M., Scholz, F., Scholten, T., Zarfl, C., and Butz, M. V.: High-Efficiency Rainfall Data Compression Using Binarized Convolutional Autoencoder, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-11768, 2024.

Machine learning has been applied extensively to flow forecasting in gauged basins. Increasingly, models generating forecasts in some basin(s) of interest are trained using data from beyond the study region. With increasingly large hydrological datasets, a new challenge emerges: given some region of interest, how do you select which basins to include in the training dataset?

There is currently little guidance on selecting data from outside the basin(s) under study. An intuitive approach might be to select data from neighbouring basins, or basins with similar hydrological characteristics. However, a growing body of research suggests that including hydrologically dissimilar basins can in fact produce greater improvements to model generalisation. In this study, we use clustering as a simple yet effective method for identifying temporal and spatial hydrological diversity within a large hydrological dataset. The clustering results are used to generate information-rich subsets of data that are used for model training. We compare the effects that basin subsets representing various hydrological characteristics have on model generalisation.
Our study shows that data within individual basins, and between hydrologically similar basins, contain high degrees of redundancy. In such cases, training data can be heavily undersampled with no adverse effects – or even moderate improvements to model performance. We also show that spatial hydrological diversity can hugely benefit model training, providing improved generalisation and a regularisation effect.
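A minimal version of such a selection procedure is shown below, with random standardized attributes in place of real catchment descriptors and a plain k-means in place of whichever clustering variant the study uses. Sampling from every cluster, rather than only from basins similar to the target, is what builds hydrological diversity into the training subset:

```python
import numpy as np

rng = np.random.default_rng(6)

# Hypothetical catchment attributes (rows: basins; columns: e.g. aridity,
# baseflow index, mean slope), already standardized.
X = rng.normal(size=(120, 3))

def kmeans(X, k=4, iters=20, rng=rng):
    """Plain k-means; each cluster groups hydrologically similar basins."""
    centers = X[rng.choice(len(X), k, replace=False)]
    for _ in range(iters):
        labels = np.argmin(((X[:, None] - centers) ** 2).sum(-1), axis=1)
        centers = np.array([
            X[labels == j].mean(0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
    return labels

labels = kmeans(X)
# Diversity-aware subset: take a few basins from *every* cluster, rather than
# only from the neighbourhood of the target basin.
subset = np.concatenate([np.flatnonzero(labels == j)[:5] for j in range(4)])
print(subset.size)
```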

How to cite: Snieder, E. and Khan, U.: Towards improved spatio-temporal selection of training data for LSTM-based flow forecasting models in Canadian basins, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-12293, 2024.

We propose a hybrid deep learning model that uses long short-term memory networks (LSTMs) to capture both spatial and temporal dependencies in the river system. The LSTM component processes spatial information derived from topographical data and river network characteristics, allowing the model to understand the physical layout of the river basin. Simultaneously, the LSTM component exploits temporal patterns in historical dam release and rainfall data, enabling the model to discern the dynamics of flood propagation. Whereas previous studies relied only on hydrological models such as HECRAS, FLDWAV, and FLUMEN, this study combines the HECRAS hydrological model with a deep learning algorithm, LSTM. The goal of this study is to predict the highest river level and flood travel time following dam release 3 to 6 hours in advance throughout the Seomjin river basin. To achieve this, we conducted hydrological modeling (HECRAS) and developed a deep learning algorithm (LSTM). Afterward, the developed model combining HECRAS and LSTM was verified at six flood alert stations. Finally, the models will provide the highest river level and travel time information up to 6 hours in advance at the six flood alert stations. To train and validate the model, we compiled a comprehensive dataset of historical dam release events and corresponding flood travel times from a range of river basins. The dataset includes various hydrological and meteorological features to ensure the model's robustness in handling diverse scenarios. The deep learning model is then trained using a subset of the data and validated against unseen events to assess its generalization capabilities. Preliminary results indicate that the hybrid HECRAS-LSTM model outperforms traditional hydrological models in predicting flood travel times. The model exhibits improved accuracy, particularly in cases of complex river geometries and extreme weather events. 
Additionally, the model demonstrates its potential for real-time forecasting, as it can efficiently process and assimilate incoming data. In conclusion, our study showcases the effectiveness of using a hybrid HECRAS-LSTM model for forecasting flood travel time by dam release. By leveraging the power of deep learning, we pave the way for more precise and reliable flood predictions, contributing to the overall resilience and safety of communities located downstream of dam-controlled river systems.

How to cite: Kang, J., Lee, G., Park, S., Jung, C., and Yu, J.: The Development of Forecasting System Flood Travel Time by Dam Release for Supplying Flood Information Using Deep Learning at Flood Alert Stations in the Seomjin River Basin, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-13848, 2024.

EGU24-14765 | ECS | Posters virtual | HS3.4

Optimizing Groundwater Forecasting: Comparative Analysis of MLP Models Using Global and Regional Precipitation Data 

Akanksha Soni, Surajit Deb Barma, and Amai Mahesha

This study investigates the efficacy of Multi-Layer Perceptron (MLP) models in groundwater level modeling, specifically emphasizing the pivotal role of input data quality, particularly precipitation data. Unlike prior research that primarily focused on regional datasets like those from the India Meteorological Department (IMD), our research explores the integration of global precipitation data, specifically leveraging the Integrated Multi-satellitE Retrievals for Global Precipitation Measurement (IMERG) dataset for MLP-based modeling. The assessment was conducted using two wells in Dakshina Kannada, evaluating four MLP models (GA-MLP, EFO-MLP, PSO-MLP, AAEO-MLP) with IMERG and IMD precipitation data. Performance metrics were employed, including mean absolute error, root mean square error, normalized Nash-Sutcliffe efficiency, and Pearson's correlation coefficient. The study also includes convergence analysis and stability assessments, revealing the significant impact of the precipitation dataset on model performance. Noteworthy findings include the superior performance of the AAEO-MLP model in training with IMD data and the superior test performance of the GA-MLP model at the Bajpe well with both datasets. The stability of the GA-MLP model, indicated by the lowest standard deviation values in convergence analysis, underscores its reliability. Moreover, transitioning to the IMERG dataset improved model performance and reduced variability, providing valuable insights into the strengths and limitations of MLP models in groundwater-level modeling. These results advance the precision and dependability of groundwater level forecasts, thereby supporting more effective strategies for international groundwater resource management.
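The evaluation metrics named above are standard; a compact reference implementation, applied to invented groundwater levels, is:

```python
import numpy as np

def metrics(obs, sim):
    """Common groundwater-level evaluation metrics."""
    err = sim - obs
    mae = np.mean(np.abs(err))
    rmse = np.sqrt(np.mean(err ** 2))
    nse = 1 - np.sum(err ** 2) / np.sum((obs - obs.mean()) ** 2)
    r = np.corrcoef(obs, sim)[0, 1]                 # Pearson's correlation
    return mae, rmse, nse, r

obs = np.array([4.2, 4.0, 3.8, 3.9, 4.5, 4.8])      # hypothetical levels (m)
sim = np.array([4.1, 4.1, 3.7, 4.0, 4.4, 4.7])
mae, rmse, nse, r = metrics(obs, sim)
print(f"MAE={mae:.2f} RMSE={rmse:.2f} NSE={nse:.2f} r={r:.2f}")
```

Note that NSE and Pearson's r answer different questions: a simulation can correlate highly with the observations while still carrying a bias that depresses the NSE.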

How to cite: Soni, A., Barma, S. D., and Mahesha, A.: Optimizing Groundwater Forecasting: Comparative Analysis of MLP Models Using Global and Regional Precipitation Data, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-14765, 2024.

EGU24-15248 | ECS | Orals | HS3.4

Estimation of Small Stream Water Surface Elevation Using UAV Photogrammetry and Deep Learning 

Radosław Szostak, Mirosław Zimnoch, Przemysław Wachniew, Marcin Pietroń, and Paweł Ćwiąkała

Unmanned aerial vehicle (UAV) photogrammetry allows the generation of orthophoto and digital surface model (DSM) rasters of a terrain. However, DSMs of water bodies mapped using this technique often reveal distortions in the water surface, thereby impeding the accurate sampling of water surface elevation (WSE) from DSMs. This study investigates the capability of deep neural networks to accommodate the aforementioned perturbations and effectively estimate WSE from photogrammetric rasters. Convolutional neural networks (CNNs) were employed for this purpose. Three regression approaches utilizing CNNs were explored: i) direct regression employing an encoder, ii) prediction of the weight mask using an encoder-decoder architecture, subsequently used to sample values from the photogrammetric DSM, and iii) a solution based on the fusion of the two approaches. The dataset employed in this study comprises data collected from five case studies of small lowland streams in Poland and Denmark, consisting of 322 DSM and orthophoto raster samples. Each sample corresponds to a 10 by 10 meter area of the stream channel and adjacent land. A grid search was employed to identify the optimal combination of encoder, mask generation architecture, and batch size among multiple candidates. Solutions were evaluated using two cross-validation methods: stratified k-fold cross-validation, where validation subsets maintained the same proportion of samples from all case studies, and leave-one-case-out cross-validation, where the validation dataset originates entirely from a single case study, and the training set consists of samples from other case studies. The proposed solution was compared with existing methods for measuring water levels in small streams using a drone. The results indicate that the solution outperforms previous photogrammetry-based methods and is second only to the radar-based method, which is considered the most accurate method available.
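Approach (ii), sampling the WSE from the photogrammetric DSM through a predicted weight mask, reduces to a mask-weighted average of DSM elevations. The sketch below uses a hand-built mask and a synthetic DSM in place of the encoder-decoder output; all elevations and noise levels are invented:

```python
import numpy as np

rng = np.random.default_rng(7)

# Hypothetical 10 m x 10 m DSM patch at 1 m resolution: noisy banks at ~104 m
# and a water surface at ~102 m in the channel (columns 4-6).
dsm = np.full((10, 10), 104.0) + rng.normal(0, 0.2, (10, 10))   # banks
dsm[:, 4:7] = 102.0 + rng.normal(0, 0.05, (10, 3))              # water

# Stand-in for the predicted weight mask: confident on water pixels only.
logits = np.full((10, 10), -4.0)
logits[:, 4:7] = 2.0
weights = np.exp(logits) / np.exp(logits).sum()   # softmax over the patch

wse = float((weights * dsm).sum())   # mask-weighted sample of the DSM
print(round(wse, 2))
```

Because the mask concentrates nearly all of its weight on water pixels, the weighted average recovers the water level even though most of the patch is land; learning the mask lets the network also down-weight distorted water pixels.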

This research was funded by National Science Centre, Poland, project WATERLINE (2020/02/Y/ST10/00065), under the CHISTERA IV programme of the EU Horizon 2020 (Grant no 857925).

How to cite: Szostak, R., Zimnoch, M., Wachniew, P., Pietroń, M., and Ćwiąkała, P.: Estimation of Small Stream Water Surface Elevation Using UAV Photogrammetry and Deep Learning, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-15248, 2024.

EGU24-16234 | ECS | Posters on site | HS3.4

A bottom-up approach to identify important hydrological processes by evaluating a national scale EA-LSTM model for Denmark 

Grith Martinsen, Niels Agertoft, and Phillip Aarestrup

The utilization of data-driven models in hydrology has witnessed a significant increase in recent years. The open-source philosophy underpinning much of the code developed and research being conducted has given the hydrological community widespread access to sophisticated machine learning models and technology (Reichstein et al., 2019). These data-driven approaches to hydrological modelling have attracted growing interest after multiple studies showed that machine-learning models can outperform traditional nationwide physics-based hydrological models (Kratzert et al., 2019). The latter often demand substantial man-hours for development, calibration and fine-tuning to accurately represent relevant hydrological processes.

In this national-scale explorative study we undertake an in-depth examination of Danish catchment hydrology. Our objective is to understand which processes and dynamics are well captured by a purely data-driven model without physical constraints, namely the Entity-Aware Long Short-Term Memory model (EA-LSTM). The model code was developed by Kratzert et al. (2019), and the analysis builds on a newly published national CAMELS dataset covering 301 catchments in Denmark (Koch and Schneider, 2022), with an average resolution of 130 km².

Denmark, spanning an area of around 43,000 km², has relatively high data coverage. Presently, more than 400 stations record water level measurements in the Danish stream network, while a network of 243 stations has collected meteorological data since 2011. These datasets are maintained by the Danish Environmental Protection Agency and the Danish Meteorological Institute, respectively, and are publicly available.

Despite Denmark’s data abundance, Koch and Schneider (2022) demonstrated that the data-driven EA-LSTM model, trained with the CAMELS dataset for Denmark (from now on referred to as the DK-LSTM), was not able to outperform the traditional physics-based hydrological model against which it was benchmarked. However, the performance of the DK-LSTM could be increased by pre-training it with simulations from a national physics-based model, indicating that dominant hydrological processes are not captured by the readily available input data in the CAMELS dataset.

This study conducts a comprehensive analysis of Danish catchment hydrology aiming to explore three aspects: 1) the common characteristics of the catchments where the DK-LSTM performs well or encounters challenges, 2) the identification of hydrological characteristics that exhibit improvement when informing the data-driven model with physics-based model simulations, and 3) an exploration of whether the aforementioned findings can guide us in determining necessary physical constraints and/or input variables that explain the hydrological processes for the data-driven model approach at a national scale, using the example of the DK-LSTM.


Koch, J., and Schneider, R.: Long short-term memory networks enhance rainfall-runoff modelling at the national scale of Denmark. GEUS Bulletin 49, 2022.

Kratzert, F., Klotz, D., Shalev, G., Klambauer, G., Hochreiter, S., and Nearing, G.: Towards learning universal, regional, and local hydrological behaviors via machine learning applied to large-sample datasets, Hydrol. Earth Syst. Sci., 23, 5089–5110, 2019.

Reichstein, M., Camps-Valls, G., Stevens, B. et al. Deep learning and process understanding for data-driven Earth system science. Nature 566, 195–204., 2019.

How to cite: Martinsen, G., Agertoft, N., and Aarestrup, P.: A bottom-up approach to identify important hydrological processes by evaluating a national scale EA-LSTM model for Denmark, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-16234, 2024.

EGU24-16474 | ECS | Orals | HS3.4

Short- and mid-term discharge forecasts combining machine learning and data assimilation for operational purpose 

Bob E Saint Fleur, Eric Gaume, Michaël Savary, Nicolas Akil, and Dominique Theriez

In recent years, machine learning models, particularly Long Short-Term Memory (LSTM) networks, have proven to be effective alternatives for rainfall-runoff modeling, surpassing traditional hydrological modeling approaches [1]. These models have predominantly been implemented and evaluated for rainfall-runoff simulations. However, operational hydrology often requires short- and mid-term forecasts. To be effective, such forecasts must consider past observed values of the predicted variables, requiring a data assimilation procedure [2-4]. This presentation will evaluate several approaches based on the combination of open-source machine learning tools and data assimilation strategies for short- and mid-term discharge forecasting of flood and/or drought events. The evaluation is based on the rich and well-documented CAMELS dataset [5-7]. The tested approaches include: (1) coupling pre-trained LSTMs on the CAMELS database with a Multilayer Perceptron (MLP) for prediction error corrections, (2) direct discharge MLP forecasting models specific to each lead time, including past observed discharges as input variables, and (3) option 2, including the LSTM-predicted discharges as input variables. In the absence of historical archives of weather forecasts (rainfall, temperatures, etc.), the different forecasting approaches will be tested in two configurations: (1) weather forecasts assumed to be perfect (using observed meteorological variables over the forecast horizon in place of predicted variables or ensembles) and (2) use of ensembles reflecting climatological variability over the forecast horizons for the meteorological variables (ensembles made up of time series randomly selected from the past). The forecast horizons considered range from 1 to 10 days, and the results are analyzed in light of the time of concentration of the watersheds.
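Approach (2), a separate model per lead time with past observed discharges among the inputs, can be sketched with a linear fit standing in for each MLP. The lag count, lead times, and synthetic discharge series below are illustrative choices, not the study's configuration:

```python
import numpy as np

rng = np.random.default_rng(8)
q = np.sin(np.linspace(0, 40, 400)) + 0.05 * rng.normal(size=400)  # discharge proxy

def make_xy(q, lead, n_lags=5):
    """Features: the last n_lags observed discharges; target: q at t + lead."""
    T = len(q)
    X = np.column_stack([q[i : T - lead - n_lags + 1 + i] for i in range(n_lags)])
    y = q[n_lags - 1 + lead : T]
    return X, y

# One model per lead time (a linear fit stands in for the per-lead-time MLP).
models = {}
for lead in (1, 5, 10):
    X, y = make_xy(q, lead)
    models[lead], *_ = np.linalg.lstsq(
        np.column_stack([X, np.ones(len(X))]), y, rcond=None
    )
print(sorted(models))
```

Assimilating the most recent observations is implicit here: each model's inputs always end at the latest observed discharge, so the forecast is anchored to the current state of the catchment.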



1. Kratzert F, Klotz D, Brenner C, Schulz K, Herrnegger M. Rainfall–runoff modelling using Long Short-Term Memory (LSTM) networks. Hydrol Earth Syst Sci. 2018;22(11):6005-6022. doi:10.5194/hess-22-6005-2018

2. Bourgin F, Ramos MH, Thirel G, Andréassian V. Investigating the interactions between data assimilation and post-processing in hydrological ensemble forecasting. J Hydrol (Amst). 2014;519:2775-2784. doi:10.1016/j.jhydrol.2014.07.054

3. Boucher M-A, Quilty J, Adamowski J. Data Assimilation for Streamflow Forecasting Using Extreme Learning Machines and Multilayer Perceptrons. Water Resour Res. 2020;56(6). doi:10.1029/2019WR026226

4. Piazzi G, Thirel G, Perrin C, Delaigue O. Sequential Data Assimilation for Streamflow Forecasting: Assessing the Sensitivity to Uncertainties and Updated Variables of a Conceptual Hydrological Model at Basin Scale. Water Resour Res. 2021;57(4). doi:10.1029/2020WR028390

5. Newman AJ, Clark MP, Sampson K, et al. Development of a large-sample watershed-scale hydrometeorological data set for the contiguous USA: data set characteristics and assessment of regional variability in hydrologic model performance. Hydrol Earth Syst Sci. 2015;19(1):209-223. doi:10.5194/hess-19-209-2015

6. Kratzert, F. (2019). Pretrained models + simulations for our HESSD submission "Towards learning universal, regional, and local hydrological behaviors via machine learning applied to large-sample datasets", HydroShare,

7. Kratzert, F. (2019). CAMELS Extended Maurer Forcing Data, HydroShare,

How to cite: Saint Fleur, B. E., Gaume, E., Savary, M., Akil, N., and Theriez, D.: Short- and mid-term discharge forecasts combining machine learning and data assimilation for operational purpose, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-16474, 2024.

EGU24-17502 | ECS | Posters on site | HS3.4

Towards improved Water Quality Modelling using Neural ODE models 

Marvin Höge, Florian Wenk, Andreas Scheidegger, Carlo Albert, and Andreas Frömelt

Neural Ordinary Differential Equations (ODEs) fuse neural networks with a mechanistic equation framework. This hybrid structure offers both traceability of model states and processes, as is typical for physics-based models, and the ability of machine learning to encode new functional relations. Neural ODE models have demonstrated high potential in hydrologic predictions and in scientific investigation of the related processes in the hydrologic cycle, i.e. tasks of water quantity estimation (Höge et al., 2022).

This explicit representation of state variables is key to water quality modelling, where we typically have several interrelated state variables such as nitrate, nitrite, phosphorus, and organic matter. Traditionally, these states are modelled with mechanistic kinetic rate expressions that are often only rough approximations of the underlying dynamics. At the same time, this domain of water research suffers from data scarcity, so purely data-driven methods struggle to provide reliably accurate predictions. We show how Neural ODEs can be used to improve predictions of state dynamics and to foster knowledge gain about the processes in such interrelated multi-state systems.
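
As a minimal sketch of the hybrid structure (not the authors' model), the example below embeds a tiny fixed-weight neural correction term inside a mechanistic first-order decay equation for a single water quality state and integrates it with an explicit Euler scheme; all weights and rate constants are arbitrary illustrative values:

```python
import math

# Tiny fixed-weight "neural" correction term (one tanh hidden layer).
# All weights are arbitrary illustrative values, not trained.
W1, B1 = [0.8, -0.5], [0.1, 0.2]   # hidden layer
W2, B2 = [0.3, -0.4], 0.05         # output layer

def nn_correction(s):
    """Learned part g(S) of the state equation."""
    hidden = [math.tanh(w * s + b) for w, b in zip(W1, B1)]
    return sum(w * h for w, h in zip(W2, hidden)) + B2

def simulate(s0, k=0.8, dt=0.1, n_steps=100):
    """Explicit-Euler integration of dS/dt = -k*S + g(S):
    mechanistic first-order decay plus a neural correction."""
    traj = [s0]
    for _ in range(n_steps):
        s = traj[-1]
        traj.append(s + dt * (-k * s + nn_correction(s)))
    return traj

traj = simulate(1.0)   # e.g. a normalized nitrate-like concentration
```

In an actual Neural ODE, the network weights would be trained by backpropagating through the ODE solver against observed state time series, so the learned term encodes the functional relations the kinetic rate expressions only approximate.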

Höge, M., Scheidegger, A., Baity-Jesi, M., Albert, C., & Fenicia, F.: Improving hydrologic models for predictions and process understanding using Neural ODEs. Hydrol. Earth Syst. Sci., 26, 5085-5102, 2022.

How to cite: Höge, M., Wenk, F., Scheidegger, A., Albert, C., and Frömelt, A.: Towards improved Water Quality Modelling using Neural ODE models, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-17502, 2024.

EGU24-17543 | Orals | HS3.4

Deep-learning-based prediction of damages related to surface water floods for impact-based warning 

Pascal Horton, Markus Mosimann, Severin Kaderli, Olivia Martius, Andreas Paul Zischg, and Daniel Steinfeld

Surface water floods are responsible for a substantial amount of damage to buildings, yet they have received less attention than fluvial floods. Nowadays, both research and insurance companies are increasingly focusing on these phenomena to enhance knowledge and prevention efforts. This study builds upon pluvial-related damage data provided by the Swiss Mobiliar Insurance Company and the Building Insurance of Canton Zurich (GVZ) with the goal of developing a data-driven model for predicting potential damages in future precipitation events.

This work continues a previous method applied to Swiss data that relied on thresholds based on quantiles of precipitation intensity and event volume, which, however, resulted in an excessive number of false alarms. First, a logistic regression was assessed using different characteristics of the precipitation event. Subsequently, a random forest was established, incorporating terrain attributes to better characterize local conditions. Finally, a deep learning model was developed to account for the spatio-temporal properties of the precipitation fields on a domain larger than the targeted 1 km cell. The deep learning model comprises a convolutional neural network (CNN) for 4D precipitation data and subsequent dense layers incorporating static attributes. The model has been applied to predict the probability of damage occurrence, as well as the damage degree quantified by the number of claims relative to the number of insured buildings.

How to cite: Horton, P., Mosimann, M., Kaderli, S., Martius, O., Zischg, A. P., and Steinfeld, D.: Deep-learning-based prediction of damages related to surface water floods for impact-based warning, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-17543, 2024.

EGU24-18073 | ECS | Orals | HS3.4

Operational stream water temperature forecasting with a temporal fusion transformer model 

Ryan S. Padrón, Massimiliano Zappa, and Konrad Bogner

Stream water temperatures influence aquatic biodiversity, agriculture, tourism, electricity production, and water quality. Therefore, stakeholders would benefit from an operational forecasting service that would support timely action. Deep Learning methods are well-suited for this task as they can provide probabilistic forecasts at individual stations of a monitoring network. Here we train and evaluate several state-of-the-art models using 10 years of data from 55 stations across Switzerland. Static features (e.g. station coordinates, catchment mean elevation, area, and glacierized fraction), time indices, meteorological and/or hydrological observations from the past 64 days, and their ensemble forecasts for the following 32 days are included as predictors in the models to estimate daily maximum water temperature for the next 32 days. We find that the Temporal Fusion Transformer (TFT) model performs best for all lead times with a continuous ranked probability score (CRPS) of 0.73 ºC averaged over all stations, lead times and 90 forecasts distributed over 1 full year. The TFT is followed by the Recurrent Neural Network (CRPS = 0.77 ºC), Neural Hierarchical Interpolation for Time Series (CRPS = 0.80 ºC), and Multi-layer Perceptron (CRPS = 0.85 ºC). All models outperform the benchmark ARX model. When factoring out the uncertainty stemming from the meteorological ensemble forecasts by using observations instead, the TFT improves to a CRPS of 0.43 ºC, and it remains the best of all models. In addition, the TFT model identifies air temperature and time of the year as the most relevant predictors. Furthermore, its attention feature suggests a dominant response to more recent information in the summer, and to information from the previous month during spring and autumn. Currently, daily maximum water temperature probabilistic forecasts are produced twice per week and made available at 
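
The CRPS used above has a standard empirical estimator for ensemble forecasts, CRPS = E|X - y| - 0.5 E|X - X'|, with expectations taken over ensemble members X, X' and observation y. A minimal sketch (illustrative only, not the operational code; all numbers hypothetical):

```python
def crps_ensemble(members, obs):
    """Empirical CRPS of an ensemble forecast against one observation:
    CRPS = E|X - y| - 0.5 * E|X - X'|, estimated over ensemble members."""
    m = len(members)
    term1 = sum(abs(x - obs) for x in members) / m
    term2 = sum(abs(a - b) for a in members for b in members) / (2 * m * m)
    return term1 - term2

# A toy ensemble forecast of daily maximum water temperature (ºC)
# scored against the observed value:
score = crps_ensemble([14.2, 15.1, 15.8, 16.4], 15.0)
```

Averaging such scores over stations, lead times, and forecast dates yields summary values like those reported above; for a single-member "ensemble" the CRPS reduces to the absolute error.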

How to cite: Padrón, R. S., Zappa, M., and Bogner, K.: Operational stream water temperature forecasting with a temporal fusion transformer model, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-18073, 2024.

EGU24-18154 | ECS | Orals | HS3.4

Can Blended Model Improve Streamflow Simulation In Diverse Catchments? 

Daneti Arun Sourya and Maheswaran Rathinasamy

Streamflow simulation, or rainfall-runoff modelling, has been a topic of research for the past few decades, resulting in a plethora of modelling approaches ranging from physics-based to empirical or data-driven ones. Many physics-based (PB) models are available to estimate streamflow, but uncertainty remains in model outputs due to incomplete representations of physical processes. Further, with advancements in machine learning (ML), there have been several attempts to model streamflow, but with little or no physical consistency. As a result, models based on ML algorithms may be unreliable if applied to provide future hydroclimate projections where climates and land-use patterns are outside the range of training data.

Here we test blended models built by combining PB model state variables (specifically soil moisture) with ML algorithms on their ability to simulate streamflow in 671 catchments representing diverse conditions across the conterminous United States.

For this purpose, we develop a suite of blended hydrological models by pairing different PB models (Catchment Wetness Index, Catchment Moisture Deficit, GR4J, Australian Water Balance, Single-bucket Soil Moisture Accounting, and Sacramento Soil Moisture Accounting models) with different ML methods such as Long Short Term Memory network (LSTM), eXtreme Gradient Boosting (XGB).

The results indicate that the blended models provide significant improvement in catchments where PB models are underperforming. Furthermore, the accuracy of streamflow estimation is improved in catchments where the ML models failed to estimate streamflow accurately.
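
The blending idea can be sketched with a toy single-bucket model and a plain least-squares regression standing in for the PB models and the LSTM/XGBoost components actually used; all numbers are illustrative:

```python
# Toy blended model: a single-bucket PB model supplies a soil-moisture
# state, and a simple least-squares regression stands in for the ML
# component (LSTM/XGBoost in the study). All numbers are illustrative.

def bucket_soil_moisture(rain, k_drain=0.2, s0=10.0):
    """Single-bucket soil moisture accounting: S <- S + P - k*S per step."""
    states, s = [], s0
    for p in rain:
        s = max(0.0, s + p - k_drain * s)
        states.append(s)
    return states

def fit_linear(x, y):
    """Closed-form simple linear regression y ~ a + b*x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    b = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / \
        sum((xi - mx) ** 2 for xi in x)
    return my - b * mx, b

rain = [0, 5, 0, 10, 2, 0, 0, 8, 1, 0]   # daily rainfall (mm)
sm = bucket_soil_moisture(rain)           # PB state variable
q_obs = [0.25 * s for s in sm]            # synthetic "observed" flow
a, b = fit_linear(sm, q_obs)              # ML stand-in learns the mapping
q_sim = [a + b * s for s in sm]
```

In the study itself, the soil moisture state would come from PB models such as GR4J or Sacramento, and the ML component would be an LSTM or XGBoost model trained across catchments; the point of the sketch is only that the PB state enters the ML model as an input feature.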

How to cite: Sourya, D. A. and Rathinasamy, M.: Can Blended Model Improve Streamflow Simulation In Diverse Catchments?, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-18154, 2024.

EGU24-18762 | Orals | HS3.4

Benchmarking hydrological models for national scale climate impact assessment 

Elizabeth Lewis, Ben Smith, Stephen Birkinshaw, Helen He, and David Pritchard

National scale hydrological models are required for many types of water sector applications, for example water resources planning. Existing UK national-scale model frameworks are based on conceptual numerical schemes, with an emerging trend towards incorporating deep learning models. Existing literature has shown that groundwater/surface water interactions are key for accurately representing future flows, and these processes are most accurately represented with physically-based hydrological models.

In response to this, our study undertakes a comparative analysis of three national model frameworks (Neural Hydrology, HBV, SHETRAN) to investigate the necessity for physically-based hydrological modelling. The models were run with the full ensemble of bias-corrected UKCP18 12km RCM data which enabled a direct comparison of future flow projections. We show that whilst many national frameworks perform well for the historical period, physically-based models can give substantially different projections of future flows, particularly low flows. Moreover, our study illustrates that the physically-based model exhibits a consistent trajectory in Budyko space between the baseline and future simulations, a characteristic not shared by conceptual and deep learning models. To provide context for these results, we incorporate insights from other national model frameworks, including the eFlag project.

How to cite: Lewis, E., Smith, B., Birkinshaw, S., He, H., and Pritchard, D.: Benchmarking hydrological models for national scale climate impact assessment, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-18762, 2024.

EGU24-20636 | ECS | Orals | HS3.4

Can Attention Models Surpass LSTM in Hydrology? 

Jiangtao Liu, Chaopeng Shen, and Tadd Bindas

Accurate modeling of various hydrological variables is important for water resource management, flood forecasting, and pest control. Deep learning models, especially Long Short-Term Memory (LSTM) models based on the Recurrent Neural Network (RNN) structure, have shown significant success in simulating streamflow and soil moisture and in model parameter assessment. With the development of large language models (LLMs) based on attention mechanisms, such as ChatGPT and Bard, we have observed significant advancements in fields like natural language processing (NLP), computer vision (CV), and time series prediction. Despite these advancements across various domains, the application of attention-based models in hydrology remains relatively limited, with LSTM models maintaining a dominant position in the field. This study evaluates the performance of 18 state-of-the-art attention-based models and their variants in hydrology. We focus on their performance on streamflow, soil moisture, snowmelt, and dissolved oxygen (DO) datasets, comparing them to LSTM models in both long-term and short-term regression and forecasting. We also examine these models' performance in spatial cross-validation. Our findings indicate that while LSTM models remain strongly competitive across various hydrological datasets, attention-based models offer potential advantages for specific metrics and time lengths, providing valuable insights into applying attention-based models in hydrology. Finally, we discuss the potential applications of foundation models and how these methods can contribute to the sustainable use of water resources and to addressing the challenges of climate change.

How to cite: Liu, J., Shen, C., and Bindas, T.: Can Attention Models Surpass LSTM in Hydrology?, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-20636, 2024.

EGU24-20907 | Orals | HS3.4

Revolutionizing Flood Forecasting with a Generalized Deep Learning Model 

Julian Hofmann and Adrian Holt

The domain of spatial flood prediction is dominated by hydrodynamic models, which, while robust and adaptable, are often constrained by computational requirements and slow processing times. To address these limitations, the integration of Deep Learning (DL) models has emerged as a promising solution, offering the potential for rapid prediction capabilities, while maintaining a high output quality. However, a critical challenge with DL models lies in their requirement for retraining for each new domain area, based on the outputs of hydrodynamic simulations generated for that specific region. This need for domain-specific retraining hampers the scalability and quick deployment of DL models in diverse settings. Our research focuses on bridging this gap by developing a fully generalized DL model for flood prediction.

FloodWaive's approach pivots on creating a DL model that can predict flood events rapidly and accurately across various regions without requiring retraining for each new domain area. The model is trained on a rich dataset derived from numerous hydrodynamic simulations, encompassing a wide spectrum of topographical conditions. This training is designed to enable the model to generalize its predictive capabilities across different domains and weather patterns, thus overcoming the traditional limitation of DL models in this field.

Initial findings from the development phase are promising, showcasing the model's capability to process complex data and provide quick, accurate flood predictions. The success of this fully generalized DL modeling approach could revolutionize applications of flood predictions such as flood forecasting and risk analysis. Regarding the latter, real-time evaluation of flood protection measures could become a reality. This would empower urban planners, emergency response teams, and environmental agencies with the ability to make informed decisions quickly, potentially saving lives and reducing economic losses.

While this project is still in its developmental stages, the preliminary results point towards a significant leap in flood forecasting technology. The ultimate goal is to offer a universally deployable, real-time flood prediction tool, significantly enhancing our ability to mitigate the impact of floods worldwide.


How to cite: Hofmann, J. and Holt, A.: Revolutionizing Flood Forecasting with a Generalized Deep Learning Model, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-20907, 2024.

One of the latent difficulties in the fields of climatology, meteorology, and hydrology is the scarce rainfall information available due to the limited or nonexistent instrumentation of river basins, especially in developing countries where the establishment and maintenance of equipment entail high costs relative to the available budget. Hence, the importance of generating alternatives that improve spatial precipitation estimation has been increasing, given the advances in the implementation of computational algorithms involving machine learning techniques. In this study, a multitask convolutional neural network was implemented, composed of an encoder-decoder architecture (U-Net), which simultaneously estimates the probability of rain through a classification model and the precipitation rate through a regression model at a spatial resolution of 2 km² and a temporal resolution of 10 minutes. The input modalities included data from rain gauge stations, weather radar, and satellite information (GOES-16). For model training, validation, and testing, a dataset was consolidated with 3 months of information (February to April 2021) with a 70/15/15 percent split, covering the effective coverage range of the Munchique weather radar located in the Andean region of Colombia. The obtained results show a Probability of Detection (POD) of 0.59 and a False Alarm Rate (FAR) of 0.39. The precipitation rate estimation has a Root Mean Square Error (RMSE) of 1.13 mm/10 min. This research highlights the significant capability of deep learning algorithms to reconstruct and reproduce the spatial pattern of rainfall in tropical regions with limited instrumentation. However, there is a need to continue strengthening climatological monitoring networks to achieve significant spatial representativeness, thereby reducing potential biases in model estimations.

How to cite: Barrios, M., Rubiano, H., and Guevara-Ochoa, C.: Implementation of deep learning algorithms in the sub-hourly rainfall fields estimation from remote sensors and rainfall gauge information in the tropical Andes, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-21431, 2024.

EGU24-539 | ECS | Orals | HS3.9

Revisiting the common approaches for hydrological model calibration with high-dimensional parameters and objectives  

Songjun Wu, Doerthe Tetzlaff, Keith Beven, and Chris Soulsby

Successful calibration of distributed hydrological models is often hindered by complex model structures, incommensurability between observed and modelled variables, and the complex nature of many hydrological processes. Many approaches have been proposed and compared for calibration, but the comparisons were generally based on parsimonious models with limited objectives. The conclusions could change when more parameters are to be calibrated against multiple objectives with increasing data availability. In this study, four different approaches (random sampling, DREAM, NSGA-II, and GLUE Limits of Acceptability) were tested for a complex application: calibrating 58 parameters of a hydrological model against 24 objectives (soil moisture and isotopes at 3 depths under different vegetation covers). By comparing the simulation performance of parameter sets selected by the different approaches, we conclude that random sampling is still usable in a high-dimensional parameter space, providing performance comparable to the other approaches despite poor parameter identifiability. DREAM provided better simulation performance and parameter convergence with informal likelihood functions; however, the difficulty of describing the model residual distribution can result in inappropriate formal likelihood functions and thus poor simulations. Multi-criteria calibration, taking NSGA-II as an example, gave good model performance and parameter identifiability and explicitly unravelled the trade-offs between objectives after aggregating them (into 2 or 4); but calibrating against all 24 objectives was hindered by the "curse of dimensionality", as the increasing dimension exponentially expanded the Pareto front and made it harder to differentiate parameter sets. Finally, the Limits of Acceptability approach also provided comparable simulations; moreover, it can be regarded as a learning tool, because detailed information about model failures is available for each objective at each timestep. Its limitation, however, is the insufficient exploration of the high-dimensional parameter space due to the use of Latin hypercube sampling.

Overall, all approaches showed benefits and limitations, and a general approach that can easily be used for such complex calibration cases without trial-and-error is still lacking. By comparing these common approaches, we realised the difficulty of defining a proper objective function for many-objective optimisation, whether as an aggregated scalar function (due to the difficulty of assigning weights or assuming a form for the residual distribution) or as a vector function (due to the expansion of the Pareto front). In this context, the Limits of Acceptability approach provided a more flexible way to define the "objective function" for each timestep, though it introduces extra demands in understanding data uncertainties and deciding what should be considered acceptable. Moreover, in such many-objective optimisation it is possible that no single parameter set captures all the objectives satisfactorily (none did in 8 million runs in this study). The non-existence of a global optimum in the sample suggests that the concept of equifinality should be embraced, using an ensemble of comparable parameter sets to represent such complex systems.

How to cite: Wu, S., Tetzlaff, D., Beven, K., and Soulsby, C.: Revisiting the common approaches for hydrological model calibration with high-dimensional parameters and objectives, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-539, 2024.

EGU24-1745 | Posters on site | HS3.9

Predictive uncertainty analysis using null-space Monte Carlo  

Husam Baalousha, Marwan Fahs, and Anis Younes

The inverse problem in hydrogeology poses a significant challenge for modelers due to its ill-posed nature and the non-uniqueness of solutions. This challenge is compounded by the substantial computational efforts required for calibrating highly parameterized aquifers, particularly those with significant heterogeneity, such as karst limestone aquifers. While stochastic methods like Monte Carlo simulations are commonly used to assess uncertainty, their extensive computational requirements often limit their practicality.

The Null Space Monte Carlo (NSMC) method provides a parameter-constrained approach to address these challenges in inverse problems, allowing for the quantification of uncertainty in calibrated parameters. This method was applied to the northern aquifer of Qatar, which is characterized by high heterogeneity. The calibration of the model utilized the pilot point approach, and the calibrated results were spatially interpolated across the aquifer area using kriging.

NSMC was then employed to generate 100 sets of parameter-constrained random variables representing hydraulic conductivities. The null space vectors of these random solutions were incorporated into the parameter space derived from the calibrated model. Statistical analysis of the resulting calibrated hydraulic conductivities revealed a wide range, varying from 0.1 to 350 m/d, illustrating the significant variability inherent in the karstic nature of the aquifer.

Areas with high hydraulic conductivity were identified in the middle and eastern parts of the aquifer. These regions of elevated hydraulic conductivity also exhibited high standard deviations, further emphasizing the heterogeneity and complex nature of the aquifer's hydraulic properties.

How to cite: Baalousha, H., Fahs, M., and Younes, A.: Predictive uncertainty analysis using null-space Monte Carlo, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-1745, 2024.

Remote sensing observations hold useful prior information about the terrestrial water cycle. However, combining remote sensing products for each hydrological variable does not close the water balance due to the associated uncertainties. Therefore, there is a need to quantify bias and random errors in the data. This study presents an extended version of the data-driven probabilistic data fusion for closing the water balance at a basin scale. In this version, we implement a monthly 250-m grid-based Bayesian hierarchical model leveraging multiple open-source data of precipitation, evaporation, and storage in an ensemble approach that fully exploits and maximizes the prior information content of the data. The model relates each variable in the water balance to its “true” value using bias and random error parameters with physical nonnegativity constraints. The water balance variables and error parameters are treated as unknown random variables with specified prior distributions. Given an independent set of ground-truth data on water imports and river discharge along with all monthly gridded water balance data, the model is solved using a combination of Markov Chain Monte Carlo sampling and iterative smoothing to compute posterior distributions of all unknowns. The approach is applied to the Hindon Basin, a tributary of the Ganges River, that suffers from groundwater overexploitation and depends on surface water imports. Results provide spatially distributed (i) hydrologically consistent water balance estimates and (ii) statistically consistent error estimates of the water balance data. 
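
A heavily simplified, hypothetical sketch of the Bayesian idea: a single multiplicative precipitation bias, a Gaussian closure error, and a plain Metropolis sampler standing in for the full hierarchical model with its smoother and nonnegativity constraints. All data values are invented for illustration.

```python
import math
import random

random.seed(0)

# Hypothetical monthly data (mm): remote-sensing precipitation with an
# unknown multiplicative bias beta; evaporation and discharge are taken
# as unbiased here for simplicity. Closure: beta*P - E - Q ~ N(0, SIGMA).
p_rs  = [100.0, 80.0, 120.0, 60.0]
e_obs = [45.0, 36.0, 54.0, 27.0]
q_obs = [45.0, 36.0, 54.0, 27.0]   # data constructed so the true beta = 0.9
SIGMA = 5.0                         # assumed closure-error std. dev. (mm)

def log_post(beta):
    """Log-posterior: flat prior on (0, 2), Gaussian water-balance residual."""
    if not 0.0 < beta < 2.0:
        return -math.inf
    resid = [beta * p - e - q for p, e, q in zip(p_rs, e_obs, q_obs)]
    return -0.5 * sum((r / SIGMA) ** 2 for r in resid)

def metropolis(n=20000, step=0.05):
    """Random-walk Metropolis sampler for the bias parameter."""
    beta, lp = 1.0, log_post(1.0)
    chain = []
    for _ in range(n):
        prop = beta + random.gauss(0.0, step)
        lp_prop = log_post(prop)
        if lp_prop >= lp or random.random() < math.exp(lp_prop - lp):
            beta, lp = prop, lp_prop
        chain.append(beta)
    return chain

chain = metropolis()
beta_hat = sum(chain[5000:]) / len(chain[5000:])   # posterior-mean bias
```

The actual model additionally treats biases and random errors for every gridded variable, their spatial structure, and the independent discharge and import data; the sketch only shows how closure residuals turn into a likelihood that an MCMC sampler can explore.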

How to cite: Mourad, R., Schoups, G., and Bastiaanssen, W.: A grid-based data-driven ensemble probabilistic data fusion: a water balance closure approach applied to the irrigated Hindon River Basin, India, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-2267, 2024.

EGU24-2300 | ECS | Posters on site | HS3.9

Representing systematic and random errors of eddy covariance measurements in suitable likelihood models for robust model selection  

Tobias Karl David Weber, Alexander Schade, Robert Rauch, Sebastian Gayler, Joachim Ingwersen, Wolfgang Nowak, Efstathios Diamantopoulos, and Thilo Streck

The importance of evapotranspiration (ET) fluxes for the terrestrial water cycle is demonstrated by an overwhelming body of literature. Unfortunately, errors in their measurement contribute significantly to (model) uncertainties in quantifying and understanding ecohydrological systems. For measuring surface-atmosphere water fluxes at the ecosystem scale, the eddy covariance method is a powerful technique and an important tool to validate ET models, as spatially averaged fluxes over several hundred square metres may be obtained. While the eddy covariance technique has become a routine method to estimate the turbulent energy fluxes at the soil-atmosphere boundary, it is not error-free. Some of the inherent errors are quantifiable and may be partitioned into systematic and stochastic errors. For model-data comparison, the nature of the measurement error needs to be known to derive knowledge about model adequacy. To this end, in this study we compare several assumptions found in the literature for describing the statistical properties of the error with newly derived descriptions. We show how sensitive the model selection process is to the assumptions about the error. We demonstrate this by comparing daily agro-ecosystem ET fluxes simulated with the detailed agro-hydrological model Expert-N to data gathered using the eddy covariance technique.

How to cite: Weber, T. K. D., Schade, A., Rauch, R., Gayler, S., Ingwersen, J., Nowak, W., Diamantopoulos, E., and Streck, T.: Representing systematic and random errors of eddy covariance measurements in suitable likelihood models for robust model selection, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-2300, 2024.

EGU24-4140 | ECS | Orals | HS3.9

Integrating Deterministic and Probabilistic Approaches for Improved Hydrological Predictions: Insights from Multi-model Assessment in the Great Lakes Watersheds 

Jonathan Romero-Cuellar, Rezgar Arabzadeh, James Craig, Bryan Tolson, and Juliane Mai

The utilization of probabilistic streamflow predictions holds considerable value in the domains of predictive uncertainty estimation, hydrologic risk management, and decision support in water resources. Typically, the quantification of predictive uncertainty is formulated and evaluated using a solitary hydrological model, posing challenges in extrapolating findings to diverse model configurations. To address this limitation, this study examines variations in the performance ranking of various streamflow models through the application of a residual error model post-processing approach across multiple basins and models. The assessment encompasses 141 basins within the Great Lakes watershed, spanning the USA and Canada, and involves the evaluation of 13 diverse streamflow models using deterministic and probabilistic performance metrics. This investigation scrutinizes the interdependence between the quality of probabilistic streamflow estimation and the underlying model quality. The results underscore that the selection of a streamflow model significantly influences the robustness of probabilistic predictions. Notably, transitioning from deterministic to probabilistic predictions, facilitated by a post-processing approach, maintains the performance ranking consistency for the best and worst deterministic models. However, models of intermediate rank in deterministic evaluation exhibit inconsistent rankings when evaluated in probabilistic mode. Furthermore, the study reveals that post-processing residual errors of long short-term memory (LSTM) network models consistently outperform other models in both deterministic and probabilistic metrics. This research emphasizes the importance of integrating deterministic streamflow model predictions with residual error models to enhance the quality and utility of hydrological predictions. It elucidates the extent to which the efficacy of probabilistic predictions is contingent upon the sound performance of the underlying model and its potential to compensate for deficiencies in model performance. Ultimately, these findings underscore the significance of combining deterministic and probabilistic approaches for improving hydrological predictions, quantifying uncertainty, and supporting decision-making in operational water management.

How to cite: Romero-Cuellar, J., Arabzadeh, R., Craig, J., Tolson, B., and Mai, J.: Integrating Deterministic and Probabilistic Approaches for Improved Hydrological Predictions: Insights from Multi-model Assessment in the Great Lakes Watersheds, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-4140, 2024.

EGU24-5219 | ECS | Posters on site | HS3.9

Quantifying Uncertainty in Surrogate-based Bayesian Inference 

Anneli Guthke, Philipp Reiser, and Paul-Christian Bürkner

Proper sensitivity and uncertainty analysis for complex Earth and environmental systems models may become computationally prohibitive. Surrogate models can be an alternative to enable such analyses: they are cheap-to-run statistical approximations to the simulation results of the original expensive model. Several approaches to surrogate modelling exist, all with their own challenges and uncertainties. It is crucial to correctly propagate the uncertainties related to surrogate modelling to predictions, inference and derived quantities in order to draw the right conclusions from using the surrogate model.

While the uncertainty in surrogate model parameters due to limited training data (expensive simulation runs) is often accounted for, what is typically ignored is the approximation error due to the surrogate’s structure (bias in reproducing the original model predictions). Reasons are that such a full uncertainty analysis is computationally costly even for surrogates (or limited to oversimplified analytic cases), and that a comprehensive framework for uncertainty propagation with surrogate models was missing.

With this contribution, we propose a fully Bayesian approach to surrogate modelling, uncertainty propagation, parameter inference, and uncertainty validation. We illustrate the utility of our approach with two synthetic case studies of parameter inference and validate our inferred posterior distributions by simulation-based calibration. For Bayesian inference, the correct propagation of surrogate uncertainty is especially relevant, because failing to account for it may lead to biased and/or overconfident parameter estimates and will spoil further interpretation in the physics’ context or application of the expensive simulation model.

Consistent and comprehensive uncertainty propagation in surrogate models enables more reliable approximation of expensive simulations and will therefore be useful in various fields of applications, such as surface or subsurface hydrology, fluid dynamics, or soil hydraulics.
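
A toy sketch of the core idea (a cheap linear surrogate for a hypothetical expensive simulator, with the structural approximation error estimated from training residuals and carried through the propagation; illustrative only, not the fully Bayesian treatment of the study):

```python
import math
import random
import statistics

random.seed(42)

def expensive_model(x):
    """Hypothetical stand-in for an expensive simulator."""
    return math.sin(2.0 * x)

# A few training runs (the computationally costly part).
xs = [0.0, 0.25, 0.5, 0.75, 1.0]
ys = [expensive_model(x) for x in xs]

# Cheap linear surrogate fitted by least squares.
n = len(xs)
mx, my = sum(xs) / n, sum(ys) / n
b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / \
    sum((x - mx) ** 2 for x in xs)
a = my - b * mx

# Structural (approximation) error estimated from the training residuals.
resid = [y - (a + b * x) for x, y in zip(xs, ys)]
sigma_struct = statistics.pstdev(resid)

def surrogate(x, with_structural_error=True):
    """Surrogate prediction; optionally adds the approximation-error term."""
    noise = random.gauss(0.0, sigma_struct) if with_structural_error else 0.0
    return a + b * x + noise

# Propagate input uncertainty x ~ N(0.5, 0.1) through the surrogate:
# ignoring the structural term would understate the output spread.
draws = [surrogate(random.gauss(0.5, 0.1)) for _ in range(5000)]
```

In the fully Bayesian version proposed above, the surrogate coefficients and the error variance would themselves carry posterior uncertainty (and be validated by simulation-based calibration) rather than being point estimates as in this sketch.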

How to cite: Guthke, A., Reiser, P., and Bürkner, P.-C.: Quantifying Uncertainty in Surrogate-based Bayesian Inference, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-5219, 2024.

EGU24-6157 | ECS | Orals | HS3.9

Analyzing Groundwater Hazards with Sequential Monte Carlo  

Lea Friedli and Niklas Linde

Analyzing groundwater hazards frequently involves utilizing Bayesian inversions and estimating probabilities associated with rare events. A concrete example concerns the potential contamination of an aquifer, a process influenced by the unknown hydraulic properties of the subsurface. In this context, the emphasis shifts from the posterior distribution of model parameters to the distribution of a particular quantity of interest dependent on these parameters. To tackle the methodological hurdles at hand, we propose a Sequential Monte Carlo approach in two stages. The initial phase involves generating particles to approximate the posterior distribution, while the subsequent phase utilizes subset sampling techniques to evaluate the probability of the specific rare event of interest. Exploring a two-dimensional flow and transport example, we demonstrate the efficiency and accuracy of the developed PostRisk-SMC method in estimating rare event probabilities associated with groundwater hazards.
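
The two-stage logic can be illustrated with a standard-normal toy problem, using plain rejection sampling for the conditional stage (the actual PostRisk-SMC method uses posterior particles and MCMC-based subset sampling instead):

```python
import random

random.seed(7)

def indicator_fraction(samples, thresh):
    """Fraction of samples exceeding a threshold."""
    return sum(1 for s in samples if s > thresh) / len(samples)

# Stage 1: plain Monte Carlo for the intermediate (not-so-rare) event Z > 2.
stage1 = [random.gauss(0.0, 1.0) for _ in range(20000)]
p1 = indicator_fraction(stage1, 2.0)

# Stage 2: sample the conditional distribution Z | Z > 2 by rejection,
# then estimate P(Z > 3 | Z > 2).
def sample_above(thresh):
    while True:
        z = random.gauss(0.0, 1.0)
        if z > thresh:
            return z

stage2 = [sample_above(2.0) for _ in range(2000)]
p2 = indicator_fraction(stage2, 3.0)

# Product of the two not-so-small factors estimates the rare-event
# probability P(Z > 3), whose exact value is about 1.35e-3.
p_rare = p1 * p2
```

Splitting the rare event into a chain of conditional, more probable events is what keeps the variance manageable; crude Monte Carlo would need far more model runs to see the same number of exceedances.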

How to cite: Friedli, L. and Linde, N.: Analyzing Groundwater Hazards with Sequential Monte Carlo, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-6157, 2024.

EGU24-7610 | Posters on site | HS3.9

Parameter estimation of heterogeneous field in basin scale based on signal analysis and river stage tomography 

Bo-Tsen Wang, Chia-Hao Chang, and Jui-Pin Tsai

Understanding the spatial distribution of the aquifer parameters is crucial to evaluating the groundwater resources on a basin scale. River stage tomography (RST) is one of the potential methods to estimate the aquifer parameter fields. Utilizing the head variations caused by the river stage to conduct RST is essential to successfully delineate the regional aquifer's spatial features. However, the two external stimuli of the aquifer system, rainfall and river stage, are usually highly correlated, resulting in mixed features in the head observations, which may cause unreasonable estimates of parameter fields. Thus, separating the head variations sourced from rainfall and river stage is essential to developing the reference heads for RST. To solve this issue, we propose a systematic approach to extracting and reconstructing the head variations of river features from the original head observations during flood periods and conducting RST. We examined the developed method with a real case study, using groundwater level data, rainfall data, and river stage data from the Zhuoshui River alluvial fan in 2006. The hydraulic diffusivity (D) values of five observation wells were used as the reference for parameter estimation. The results show that the RMSE of the D value is 0.027 m²/s. The other three observation wells were selected for validation purposes, and the derived RMSE is 0.85 m²/s. The low RMSE reveals that the estimated D field can capture the characteristics of the regional aquifer. The results also indicate that the estimated D values derived from the developed method are consistent with the sampled D values from the pumping tests in both the calibration and validation processes of the real case study. The results demonstrate that the proposed method can successfully extract and reconstruct the head variations of river features from the original head observations and can delineate the features of the regional parameter field.
The proposed method can benefit RST studies and provide an alternative mixed-feature signal decomposition and reconstruction method.

How to cite: Wang, B.-T., Chang, C.-H., and Tsai, J.-P.: Parameter estimation of heterogeneous field in basin scale based on signal analysis and river stage tomography, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-7610, 2024.

EGU24-7820 | Orals | HS3.9

Data-driven surrogate-based Bayesian model calibration for predicting vadose zone temperatures in drinking water supply pipes 

Ilja Kröker, Elisabeth Nißler, Sergey Oladyshkin, Wolfgang Nowak, and Claus Haslauer

Soil temperature and soil moisture in the unsaturated zone depend on each other and are influenced by non-stationary hydro-meteorological forcing factors that are subject to climate change. 

The transport of both heat and moisture is crucial for predicting temperatures in the shallow subsurface and, as a consequence, around and in drinking water supply pipes. Elevated temperatures in water supply pipes (even up to 25°C and above) pose a risk to human health due to the increased likelihood of microbial contamination. 

To model variably saturated flow and heat transport, a partial differential equation (PDE)-based integrated hydrogeological model has been developed and implemented in the DuMuX simulation framework. This model integrates the hydrometeorological forcing functions via a novel interface condition at the atmosphere-subsurface boundary. Relevant soil properties and their dependency on temperature have been measured in detail as time series at a pilot site at the University of Stuttgart since 2020. 

Despite these efforts on measurements and model enhancement, some uncertainties remain. These include capillary-saturation relationships in materials where they are difficult to measure, especially in the gravel-type materials that are commonly used above drinking water pipes. 

To enhance our understanding of the underlying physical processes, we employ Bayesian inference, a well-established approach to estimating uncertain or unknown model parameters. Computationally cheap surrogate models make it possible to overcome the limitations of Bayesian methods for computationally intensive models, when such surrogate models are used in lieu of the physical (PDE)-based model. Here, we use the arbitrary polynomial chaos expansion equipped with Bayesian regularization (BaPC). The BaPC allows us to exploit the latest (Bayesian) active-learning strategies to reduce the number of model runs that are necessary for constructing the surrogate model.  
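The surrogate idea itself can be sketched without the BaPC machinery: fit a polynomial chaos expansion (here a probabilists' Hermite basis with a plain least-squares fit) to a modest number of runs of an "expensive" model, then evaluate the cheap polynomial instead. The toy model and expansion degree below are illustrative:

```python
import numpy as np
from numpy.polynomial import hermite_e as He

rng = np.random.default_rng(2)

def expensive_model(x):          # stand-in for a PDE-based simulator
    return np.sin(x) + 0.1 * x**2

# training runs at inputs drawn from the (standard normal) input distribution
x_train = rng.normal(size=60)
y_train = expensive_model(x_train)

# least-squares fit of a degree-8 probabilists' Hermite expansion
coeffs = He.hermefit(x_train, y_train, deg=8)

# the surrogate is now a cheap polynomial evaluation
x_test = rng.uniform(-2.0, 2.0, size=200)
y_surr = He.hermeval(x_test, coeffs)
rmse = np.sqrt(np.mean((y_surr - expensive_model(x_test))**2))
```

In an actual calibration, the thousands of model evaluations demanded by the Bayesian sampler would then hit the polynomial rather than the PDE solver.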

In the present work, we demonstrate the calibration of a PDE-based integrated hydrogeological model using Bayesian inference on a BaPC-based surrogate.  The accuracy of the calibrated and predicted temperatures in the shallow subsurface is then assessed against real-world measurement data. 

How to cite: Kröker, I., Nißler, E., Oladyshkin, S., Nowak, W., and Haslauer, C.: Data-driven surrogate-based Bayesian model calibration for predicting vadose zone temperatures in drinking water supply pipes, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-7820, 2024.

EGU24-8007 | ECS | Orals | HS3.9

Investigating the divide and measure nonconformity  

Daniel Klotz, Martin Gauch, Frederik Kratzert, Grey Nearing, and Jakob Zscheischler

This contribution presents a diagnostic approach to investigate unexpected side effects that can occur during the evaluation of rainfall–runoff models.

The diagnostic technique that we use is based on the idea that one can use gradient descent to modify the runoff observations/simulations to obtain warranted observations/simulations. Specifically, we show how to use this concept to manipulate any hydrograph (e.g., a copy of the observations) so that it approximates specific NSE values for individual parts of the data. In short, we use the following recipe to generate the synthetic simulations: (1) copy the observations, (2) add noise, (3) clip the modified discharge to zero, and (4) optimise the obtained simulation values by gradient descent until a desired NSE value is reached.

To show how this diagnostic technique can be used, we demonstrate a behaviour of the Nash–Sutcliffe Efficiency (NSE) that appears when evaluating a model over subsets of the data: if models perform poorly for certain situations, this lack of performance is not necessarily reflected in the NSE of the overall data. This behaviour follows from the definition of the NSE and is therefore 100% explainable. However, in our experience it can be unexpected for many modellers. Our results also show that subdividing the data and evaluating over the resulting partitions yields different information regarding model deficiencies than an overall evaluation. We call this phenomenon the Divide And Measure Nonconformity, or DAMN.
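The effect is easy to reproduce: a simulation that is poor during low flows can still score a near-perfect overall NSE, because the flood peak dominates the variance term in the NSE denominator. A minimal sketch with an invented hydrograph (not the authors' gradient-descent construction):

```python
import numpy as np

def nse(sim, obs):
    return 1.0 - np.sum((sim - obs)**2) / np.sum((obs - obs.mean())**2)

rng = np.random.default_rng(3)
# synthetic hydrograph: one flood peak followed by a slow recession
obs = np.concatenate([50 + 40 * np.sin(np.linspace(0, np.pi, 50)),
                      np.linspace(8.0, 2.0, 150)])
# "simulation": perfect on the peak, noisy during the recession
sim = obs.copy()
sim[50:] += 2.0 * rng.standard_normal(150)
sim = np.clip(sim, 0.0, None)            # discharge cannot be negative

overall = nse(sim, obs)                  # dominated by the peak
low_flow = nse(sim[50:], obs[50:])       # the poor fit only shows up here
```

The overall score stays close to 1 while the low-flow subset score is poor, which is the DAMN effect in miniature.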

How to cite: Klotz, D., Gauch, M., Kratzert, F., Nearing, G., and Zscheischler, J.: Investigating the divide and measure nonconformity, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-8007, 2024.

Groundwater heads are commonly used to monitor storage of aquifers and as decision variables for groundwater management. Alluvial gravel aquifers are often characterized by high transmissivities and a corresponding strong seasonal and inter-annual variability of storage. The sustainable management of such aquifers is challenging, particularly for already tightly allocated aquifers and in increasingly extreme and potentially drier climates, and might require the restriction of groundwater abstraction for periods of time. Stakeholders require lead-in time to prepare for potential restrictions of their consented takes.

Groundwater models have been used in the past to support groundwater decision making and to provide the corresponding predictions of groundwater levels for operational forecasting and management. In this study, we benchmark and compare different model classes to perform this task: (i) a spatially explicit 3D groundwater flow model (MODFLOW), (ii) a conceptual, bucket-type Eigenmodel, (iii) a transfer-function model (TFN), and (iv) three machine learning (ML) techniques, namely Multi-Layer Perceptron (MLP), Long Short-Term Memory (LSTM), and Random Forest (RF) models. The model classes differ widely in their complexity, input requirements, calibration effort, and run-times. The different model classes are tested on four groundwater head time series taken from the Wairau Aquifer in New Zealand (Wöhling et al., 2020). Posterior parameter ensembles of MODFLOW (Wöhling et al., 2018) and the Eigenmodel (Wöhling & Burbery, 2020) were combined with TFN and ML variants with different input features to form a (prior) multi-model ensemble. Model classes are ranked with posterior model weights derived from Bayesian model selection (BMS) and averaging (BMA) techniques.

Our results demonstrate that no “model that fits all” exists in our model set. The more physics-based MODFLOW model does not necessarily provide the most accurate predictions, but it can provide physical meaning and interpretation for the entire model region and outputs at locations where no data are available. ML techniques have generally much lower input requirements and short run-times. They prove to be competitive candidates for groundwater head predictions where observations are available, even for system states that lie outside the calibration data range.

Because the performance of model types is site-specific, we advocate the use of multi-model ensemble forecasting wherever feasible. The benefit is illustrated by our case study, with BMA uncertainty bounds providing a better coverage of the data and the BMA mean performing well for all tested sites. Redundant ensemble members (with BMA weights of zero) are easily filtered out to obtain efficient ensembles for operational forecasting.
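The ensemble-weighting logic can be sketched with a simple likelihood-based approximation of BMA weights (uniform model prior, fixed observation-error standard deviation). The toy data and the approximation are illustrative, not the study's actual BMS/BMA computation:

```python
import numpy as np

rng = np.random.default_rng(4)
t = np.linspace(0, 6, 80)
obs = np.sin(t)                          # synthetic groundwater head record

# three hypothetical ensemble members of very different quality
preds = np.stack([obs + 0.1 * rng.standard_normal(80),   # good model
                  obs + 0.5 * rng.standard_normal(80),   # mediocre model
                  rng.standard_normal(80)])              # uninformative model

sigma = 0.3                              # assumed observation-error std
log_lik = -0.5 * np.sum((preds - obs)**2, axis=1) / sigma**2
w = np.exp(log_lik - log_lik.max())
w /= w.sum()                             # posterior model weights (uniform prior)

bma_mean = w @ preds
# BMA predictive variance: within-model noise + between-model spread
bma_var = sigma**2 + w @ (preds - bma_mean)**2
```

Members that receive negligible weight, like the uninformative third model here, are exactly the redundant ensemble members that can be filtered out of an operational forecasting ensemble.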



Wöhling T, Burbery L (2020). Eigenmodels to forecast groundwater levels in unconfined river-fed aquifers during flow recession. Science of the Total Environment, 747, 141220, doi: 10.1016/j.scitotenv.2020.141220.

Wöhling, T., Gosses, M., Wilson, S., Wadsworth, V., Davidson, P. (2018). Quantifying river-groundwater interactions of New Zealand's gravel-bed rivers: The Wairau Plain. Groundwater, doi: 10.1111/gwat.12625.

Wöhling T, Wilson SR, Wadsworth V, Davidson P. (2020). Detecting the cause of change using uncertain data: Natural and anthropogenic factors contributing to declining groundwater levels and flows of the Wairau Plain Aquifer, New Zealand. Journal of Hydrology: Regional Studies, 31, 100715, doi: 10.1016/j.ejrh.2020.100715.


How to cite: Wöhling, T. and Crespo Delgadillo, O.: Predicting groundwater heads in alluvial aquifers: Benchmarking different model classes and machine-learning techniques with BMA/S, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-8818, 2024.

EGU24-8872 | Orals | HS3.9

Characterization and modeling of large-scale aquifer systems under uncertainty: methodology and application to the Po River aquifer system 

Monica Riva, Andrea Manzoni, Rafael Leonardo Sandoval, Giovanni Michele Porta, and Alberto Guadagnini

Large-scale groundwater flow models are key to enhancing our understanding of the potential impacts of climate and anthropogenic factors on water systems. Through these, we can identify significant patterns and processes that most affect water security. In this context, we have developed a comprehensive and robust theoretical framework and operational workflow that can effectively manage complex heterogeneous large-scale groundwater systems. We rely on machine learning techniques to map the spatial distribution of geomaterials within three-dimensional subsurface systems. The groundwater modeling approach encompasses (a) estimation of groundwater recharge and abstractions, as well as (b) appraisal of interactions among subsurface and surface water bodies. We ground our analysis on a unique dataset that encompasses lithostratigraphic data as well as piezometric and water extraction data across the largest aquifer system in Italy (the Po River basin). The quality of our results is assessed against pointwise information and hydrogeological cross-sections which are available within the reconstructed domain. These can be considered as soft information based on expert assessment. As uncertainty quantification is critical for subsurface characterization and assessment of future states of the groundwater system, the proposed methodology is designed to provide a quantitative evaluation of prediction uncertainty at any location of the reconstructed domain. Furthermore, we quantify the relative importance of uncertain model parameters on target model outputs through the implementation of a rigorous Global Sensitivity Analysis. By evaluating the spatial distribution of global sensitivity metrics associated with model parameters, we gain valuable insights into areas where the acquisition of future information could enhance the quality of groundwater flow model parameterization and improve hydraulic head estimates.
The comprehensive dataset provided in this study, combined with the reconstruction of the subsurface system properties and piezometric head distribution and with the quantification of the associated uncertainty, can be readily employed in the context of groundwater availability and quality studies associated with the region of interest. The approach and operational workflow are flexible and readily transferable to assist identification of the main dynamics and patterns of large-scale aquifer systems of the kind here analyzed.

How to cite: Riva, M., Manzoni, A., Sandoval, R. L., Porta, G. M., and Guadagnini, A.: Characterization and modeling of large-scale aquifer systems under uncertainty: methodology and application to the Po River aquifer system, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-8872, 2024.

EGU24-10517 | Orals | HS3.9

Lock-ins and path dependency in evaluation metrics used for hydrological models 

Lieke Melsen, Arnald Puy, and Andrea Saltelli

Science, being conducted by humans, is inherently a social activity. This is evident in the development and acceptance of scientific methods. Science is not only socially shaped, but also driven (and in turn influenced) by technological development: technology can open up new research avenues. At the same time, it has been shown that technology can cause lock-ins and path dependency. A scientific activity driven both by social behavior and technological development is modelling. As such, studying modelling as a socio-technical activity can provide insights both into enculturation processes and into lock-ins and path dependencies. Even more, enculturation can lead to lock-ins. We will demonstrate this for the Nash-Sutcliffe Efficiency (NSE), a popular evaluation metric in hydrological research. Through a bibliometric analysis we show that the NSE is part of hydrological research culture and does not appear in adjacent research fields. Through a historical analysis we demonstrate the path dependency that has developed with the popularity of the NSE. Finally, through exploring the fate of alternative measures, we show the lock-in effect of the use of the NSE. As such, we confirm that the evaluation of models needs to take into account cultural embeddedness. This is relevant because peers' acceptance is a powerful legitimization argument to trust the model and/or model results, including for policy-relevant applications. Culturally determined bias needs to be assessed for its potential consequences in the discipline.

How to cite: Melsen, L., Puy, A., and Saltelli, A.: Lock-ins and path dependency in evaluation metrics used for hydrological models, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-10517, 2024.

EGU24-10770 | Orals | HS3.9 | Highlight

Uncertainty and sensitivity analysis: new purposes, new users, new challenges 

Francesca Pianosi, Hannah Bloomfield, Gemma Coxon, Robert Reinecke, Saskia Salwey, Georgios Sarailidis, Thorsten Wagener, and Doris Wendt

Uncertainty and sensitivity analysis are becoming an integral part of mathematical modelling of earth and environmental systems. Uncertainty analysis aims at quantifying uncertainty in model outputs, which helps to avoid spurious precision and increase the trustworthiness of model-informed decisions. Sensitivity analysis aims at identifying the key sources of output uncertainty, which helps to set priorities for uncertainty reduction and model improvement.

In this presentation, we draw on a range of recent studies and projects to discuss the status of uncertainty and sensitivity analysis, focusing in particular on ‘global’ approaches, whereby uncertainties and sensitivities are quantified across the entire space of plausible variability of model inputs.

We highlight some of the challenges and untapped potential of these methodologies, including: (1) innovative ways to use global sensitivity analysis to test the ‘internal consistency’ of models and therefore support their diagnostic evaluation; (2) challenges and opportunities to promote the uptake of these methodologies to increasingly complex models, chains of models, and models used in industry; (3) the limits of uncertainty and sensitivity analysis when dealing with epistemic, poorly bounded or unquantifiable sources of uncertainties.

How to cite: Pianosi, F., Bloomfield, H., Coxon, G., Reinecke, R., Salwey, S., Sarailidis, G., Wagener, T., and Wendt, D.: Uncertainty and sensitivity analysis: new purposes, new users, new challenges, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-10770, 2024.

EGU24-11414 | ECS | Posters on site | HS3.9

Single vs. multi-objective optimization approaches to calibrate an event-based conceptual hydrological model using model output uncertainty framework. 

Muhammad Nabeel Usman, Jorge Leandro, Karl Broich, and Markus Disse

Flash floods have become one of the major natural hazards in central Europe, and climate change projections indicate that their frequency and severity will increase in many areas across the world, including central Europe. The complexity involved in flash flood generation makes it difficult to calibrate a hydrological model for the prediction of such peak hydrological events. This study investigates the best approach to calibrate an event-based conceptual HBV model, comparing different trials of single-objective, single-event multi-objective (SEMO), and multi-event multi-objective (MEMO) model calibrations. Initially, three trials of single-objective calibration are performed with respect to RMSE, NSE, and BIAS separately; then three different trials of multi-objective optimization, i.e., SEMO-3D (single event, three objectives), MEMO-3D (mean of three objectives from two events), and MEMO-6D (two events, six objectives), are formulated. Model performance was validated for several peak events via 90% confidence interval (CI)-based output uncertainty quantification. The uncertainties associated with the model predictions are estimated stochastically using the relative errors (REs) between the simulated (Qsim) and measured (Qobs) discharges as a likelihood measure. Single-objective model calibration demonstrated that significant trade-offs exist between different objective functions, and no unique parameter set can optimize all objectives simultaneously. Compared to the solutions of single-objective calibration, all the multi-objective calibration formulations produced relatively accurate and robust results during both model calibration and validation phases. The uncertainty intervals associated with all the trials of single-objective calibration and the SEMO-3D calibration failed to capture the observed peaks of the validation events. 
The uncertainty bands associated with the ensembles of Pareto solutions from the MEMO-3D and MEMO-6D calibrations displayed better performance in reproducing and capturing the more significant peak validation events. However, to bracket the peaks of large flash flood events within the prediction uncertainty intervals, the MEMO-6D optimization outperformed all the single-objective, SEMO-3D, and MEMO-3D calibration methods. This study suggests that MEMO-6D is the best approach for predicting large flood events with lower model output uncertainties when the calibration is performed with a well-chosen combination of peak events.
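The multi-objective calibrations above retain an ensemble of Pareto (non-dominated) parameter sets rather than a single optimum. A minimal sketch of extracting the Pareto front from a table of objective values, with random objectives standing in for the HBV calibration results:

```python
import numpy as np

def pareto_front(F):
    """Indices of non-dominated rows of F (all objectives to be minimised)."""
    front = []
    for i in range(len(F)):
        # row j dominates row i if it is <= everywhere and < somewhere
        dominated = np.all(F <= F[i], axis=1) & np.any(F < F[i], axis=1)
        if not dominated.any():
            front.append(i)
    return np.array(front)

rng = np.random.default_rng(5)
# 200 candidate parameter sets scored on 3 objectives (illustrative values;
# e.g. RMSE-, NSE- and bias-based criteria, all transformed to "minimise")
F = rng.uniform(size=(200, 3))
front = pareto_front(F)
```

Every member of `front` is a defensible compromise between the objectives; the ensemble of their simulations is what generates the uncertainty bands discussed above.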

How to cite: Usman, M. N., Leandro, J., Broich, K., and Disse, M.: Single vs. multi-objective optimization approaches to calibrate an event-based conceptual hydrological model using model output uncertainty framework, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-11414, 2024.

EGU24-12676 | ECS | Posters on site | HS3.9

Physics-Informed Ensemble Surrogate Modeling of Advective-Dispersive Transport Coupled with Film Intraparticle Pore Diffusion Model for Column Leaching Test 

Amirhossein Ershadi, Michael Finkel, Binlong Liu, Olaf Cirpka, and Peter Grathwohl

Column leaching tests are a common approach for evaluating the leaching behavior of contaminated soil and waste materials, which are often reused for various construction purposes. The observed breakthrough curves of the contaminants are affected by the intricate dynamics of solute transport, inter-phase mass transfer, and dispersion. Disentangling these interactions requires numerical models. However, inverse modeling and parameter sensitivity analysis are often time-consuming, especially when sorption/desorption kinetics are explicitly described by intra-particle diffusion, requiring discretization along the column axis and inside the grains. To replace such computationally expensive models, we developed a machine-learning-based surrogate model employing two disparate ensemble methods (stacking and weighted distance average) within the parameter range defined by the German standard for column leaching tests. To optimize the surrogate model, adaptive sampling methods based on three distinct infill criteria are employed. These criteria include maximizing expected improvement, the Mahalanobis distance (exploitation), and maximizing the standard deviation (exploration).
The stacking surrogate model makes use of extremely randomized trees and random forests as base and meta models. The model shows very good performance in emulating the behavior of the original numerical model (relative root mean squared error = 0.09). 
Our proposed surrogate model has been applied to estimate the complete posterior parameter distribution using Markov chain Monte Carlo simulation. The impact of individual input parameters on the predictions generated by the surrogate model was analyzed using the SHapley Additive exPlanations (SHAP) method.
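The weighted-distance-average idea can be sketched as inverse-distance weighting over sampled parameter sets; the toy response function below stands in for the leaching simulator, and the power and sample sizes are illustrative:

```python
import numpy as np

rng = np.random.default_rng(6)

def leach_model(X):              # toy stand-in for the transport simulator
    return np.sin(X[:, 0]) * np.exp(-X[:, 1])

X_train = rng.uniform(0.0, 2.0, size=(100, 2))   # sampled parameter sets
y_train = leach_model(X_train)                   # "expensive" model runs

def idw_surrogate(X_query, X_train, y_train, power=4.0, eps=1e-12):
    # weights proportional to inverse distance to the training points
    d = np.linalg.norm(X_query[:, None, :] - X_train[None, :, :], axis=2)
    w = 1.0 / (d**power + eps)
    w /= w.sum(axis=1, keepdims=True)
    return w @ y_train

X_test = rng.uniform(0.0, 2.0, size=(50, 2))
rrmse = (np.sqrt(np.mean((idw_surrogate(X_test, X_train, y_train)
                          - leach_model(X_test))**2))
         / np.std(leach_model(X_test)))
```

Because the weights collapse onto coincident points, the surrogate interpolates the training runs exactly, which is a convenient property when the "training" data are deterministic simulator outputs.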

How to cite: Ershadi, A., Finkel, M., Liu, B., Cirpka, O., and Grathwohl, P.: Physics-Informed Ensemble Surrogate Modeling of Advective-Dispersive Transport Coupled with Film Intraparticle Pore Diffusion Model for Column Leaching Test, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-12676, 2024.

EGU24-13393 | ECS | Posters on site | HS3.9

Datasets and tools for local and global meteorological ensemble estimation 

Guoqiang Tang, Andrew Wood, Andrew Newman, Martyn Clark, and Simon Papalexiou

Ensemble gridded meteorological datasets are critical for driving hydrology and land models, enabling uncertainty analysis, and supporting a variety of hydroclimate research and applications. The Gridded Meteorological Ensemble Tool (GMET) has been a significant contributor in this domain, offering an accessible platform for generating ensemble precipitation and temperature datasets. The GMET methodology has continually evolved since its initial development in 2006, primarily in the form of a FORTRAN code base, and has since been utilized to generate historical and real-time ensemble meteorological (model forcing) datasets in the U.S. and part of Canada. A recent adaptation of GMET was used to produce multi-decadal forcing datasets for North America and the globe (EMDNA and EM-Earth, respectively). Those datasets have been used to support diverse hydrometeorological applications such as streamflow forecasting and hydroclimate studies across various scales. GMET has now evolved into a Python package called the Geospatial Probabilistic Estimation Package (GPEP), which offers methodological and technical enhancements relative to GMET. These include greater variable selection flexibility, intrinsic parallelization, and especially a broader suite of estimation methods, including the use of techniques from the scikit-learn machine learning library. GPEP enables a wider variety of strategies for local and global estimation of geophysical variables beyond traditional hydrological forcings.  This presentation summarizes GPEP and introduces major open-access ensemble datasets that have been generated with GMET and GPEP, including a new effort to create high-resolution (2 km) surface meteorological analyses for the US. These resources are useful in advancing hydrometeorological uncertainty analysis and geospatial estimation.

How to cite: Tang, G., Wood, A., Newman, A., Clark, M., and Papalexiou, S.: Datasets and tools for local and global meteorological ensemble estimation, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-13393, 2024.

We consider the optimal inference of spatially heterogeneous hydraulic conductivity and head fields based on three kinds of point measurements that may be available at monitoring wells: of head, permeability, and groundwater speed. We have developed a general, zonation-free technique for Monte Carlo (MC) study of field recovery problems, based on Karhunen-Loève (K-L) expansions of the unknown fields, whose coefficients are recovered by an analytical adjoint-state technique. This allows unbiased sampling from the space of all possible fields with a given correlation structure and efficient, automated gradient-descent calibration. The K-L basis functions have a straightforward notion of period, revealing the relationship between feature scale and reconstruction fidelity, and they have an a priori known spectrum, allowing for a non-subjective regularization term to be defined. We have performed automated MC calibration on over 1100 conductivity-head field pairs, employing a variety of point measurement geometries and quantified the mean-squared field reconstruction accuracy, both globally and as a function of feature scale.

We present heuristics for feature scale identification, examine global reconstruction error, and explore the value added both by groundwater speed measurements and by two different types of regularization. We show that significant feature identification becomes possible as the feature scale exceeds four times the measurement spacing, and that identification reliability subsequently improves in a power-law fashion with increasing feature scale.
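The K-L construction itself is compact: eigendecompose the covariance matrix on the grid, then combine the leading eigenvectors with independent standard-normal coefficients. A 1-D sketch with an exponential covariance (grid size, correlation length, and truncation are illustrative, not the study's configuration):

```python
import numpy as np

rng = np.random.default_rng(7)
n, corr_len = 200, 0.2
xs = np.linspace(0.0, 1.0, n)

# exponential covariance matrix of the (log-)conductivity field on the grid
C = np.exp(-np.abs(xs[:, None] - xs[None, :]) / corr_len)

# discrete K-L basis: eigenpairs of the covariance, largest first
eigval, eigvec = np.linalg.eigh(C)
eigval, eigvec = eigval[::-1], eigvec[:, ::-1]

k = 30                                   # truncation: keep the 30 largest modes
xi = rng.standard_normal(k)              # independent N(0, 1) K-L coefficients
field = eigvec[:, :k] @ (np.sqrt(eigval[:k]) * xi)

# fraction of the total variance captured by the truncated expansion
energy = eigval[:k].sum() / eigval.sum()
```

Because the spectrum `eigval` is known before any data are seen, the truncation level and a regularization term can be chosen non-subjectively, which is the property the abstract exploits.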

How to cite: Hansen, S. K., O'Malley, D., and Hambleton, J.: Feature scale and identifiability: quantifying the information that point hydraulic measurements provide about heterogeneous head and conductivity fields, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-14219, 2024.

EGU24-14805 | Orals | HS3.9

Sensitivity analysis of input variables of a SWAT hydrological model using the machine learning technique of random forest 

Ali Abousaeidi, Seyed Mohammad Mahdi Moezzi, Farkhondeh Khorashadi Zadeh, Seyed Razi Sheikholeslami, Albert Nkwasa, and Ann van Griensven

Using traditional approaches, sensitivity analysis of complex models with a large number of input variables and parameters is time-consuming and inefficient. Given its capability of computing importance indices, the machine learning technique of Random Forest (RF) is introduced as an alternative to conventional methods of sensitivity analysis. One of the advantages of using the RF model is the reduction of the computational cost of sensitivity analysis.

The objective of this research is to analyze the importance of the input variables of a semi-distributed, physically based hydrological model, namely SWAT (Soil and Water Assessment Tool), using the RF model. To this end, an RF-based model is first trained using SWAT input variables (such as precipitation and temperature) and SWAT output variables (such as streamflow and sediment load). Then, using the importance index of the RF model, the input variables are ranked in terms of their impact on the accuracy of the model results. Additionally, the results of the sensitivity analysis are examined graphically. To validate the RF-based approach, the parameter rankings for the Sobol G function obtained with the RF-based approach and with the Sobol’ sensitivity analysis method are compared. The ranking of the model input variables plays a significant role in the development of models and in prioritizing efforts to reduce model errors.
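The validation step can be sketched directly: the Sobol' G function has analytic first-order indices, so any importance ranking can be checked against the truth. Below, a plain Monte Carlo Sobol' estimate stands in for the RF importance index, and the coefficient vector `a` is illustrative:

```python
import numpy as np

rng = np.random.default_rng(8)
a = np.array([0.0, 1.0, 4.5, 9.0])   # G-function coefficients: x1 matters most

def sobol_g(X):
    return np.prod((np.abs(4.0 * X - 2.0) + a) / (1.0 + a), axis=1)

# analytic first-order Sobol' indices, used to validate a ranking
Vi = (1.0 / 3.0) / (1.0 + a)**2
S_true = Vi / (np.prod(1.0 + Vi) - 1.0)

# Monte Carlo (Saltelli-type) estimate of the first-order indices
N, d = 50000, len(a)
A = rng.uniform(size=(N, d))
B = rng.uniform(size=(N, d))
fA, fB = sobol_g(A), sobol_g(B)
var = np.var(np.concatenate([fA, fB]))
S_est = np.empty(d)
for i in range(d):
    ABi = A.copy()
    ABi[:, i] = B[:, i]              # replace column i with the second sample
    S_est[i] = np.mean(fB * (sobol_g(ABi) - fA)) / var
```

A correct importance method should reproduce the analytic ordering of `S_true`; the same check applies to the RF-based ranking described above.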

Keywords: sensitivity analysis, model input variables, machine learning, random forest, SWAT model.

How to cite: Abousaeidi, A., Moezzi, S. M. M., Khorashadi Zadeh, F., Sheikholeslami, S. R., Nkwasa, A., and van Griensven, A.: Sensitivity analysis of input variables of a SWAT hydrological model using the machine learning technique of random forest, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-14805, 2024.

EGU24-16086 | ECS | Posters on site | HS3.9

Disentangling the role of different sources of uncertainty and model structural error on predictions of water and carbon fluxes with CLM5 for European observation sites 

Fernand Baguket Eloundou, Lukas Strebel, Bibi S. Naz, Christian Poppe Terán, Harry Vereecken, and Harrie-Jan Hendricks Franssen

The Community Land Model version 5 (CLM5) integrates processes encompassing the water, energy, carbon, and nitrogen cycles, and ecosystem dynamics, including managed ecosystems like agriculture. Nevertheless, the intricacy of CLM5 introduces predictive uncertainties attributed to factors such as input data, process parameterizations, and parameter values. This study conducts a comparative analysis between CLM5 ensemble simulations and eddy covariance and in-situ measurements, focusing on the effects of uncertain model parameters and atmospheric forcings on the water, carbon, and energy cycles.
Ensemble simulations for 14 European experimental sites were performed with the CLM5-BGC model, integrating the biogeochemistry component. In four perturbation experiments, we explore uncertainties arising from atmospheric forcing data, soil parameters, vegetation parameters, and the combined effects of these factors. The contribution of different uncertainty sources to total simulation uncertainty was analyzed by comparing the 99% confidence intervals from ensemble simulations with measured terrestrial states and fluxes, using a three-way analysis of variance.
The study identifies that soil parameters primarily influence the uncertainty in estimating surface soil moisture, while uncertain vegetation parameters control the uncertainty in estimating evapotranspiration and carbon fluxes. A combination of uncertainty in atmospheric forcings and vegetation parameters mostly explains the uncertainty in sensible heat flux estimation. On average, the 99% confidence intervals envelope >40% of the observed fluxes, but this varies greatly between sites, exceeding 95% in some cases. For some sites, we could identify model structural errors related to model spin-up assumptions or erroneous plant phenology. The study guides the identification of factors causing underestimation or overestimation of the variability of fluxes, such as crop parameterization or spin-up, and of potential structural errors in point-scale simulations with CLM5.
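The envelope statistic used here can be sketched as the fraction of observations falling inside the ensemble's 99% interval at each time step (the synthetic flux series, ensemble size, and error level below are illustrative):

```python
import numpy as np

rng = np.random.default_rng(9)
n_ens, n_t = 50, 365
# hypothetical "true" daily flux signal, ensemble members, and observations
truth = np.sin(np.linspace(0.0, 2.0 * np.pi, n_t))
ens = truth + 0.5 * rng.standard_normal((n_ens, n_t))
obs = truth + 0.5 * rng.standard_normal(n_t)

# 99% confidence interval of the ensemble at each time step
lo, hi = np.percentile(ens, [0.5, 99.5], axis=0)
coverage = np.mean((obs >= lo) & (obs <= hi))   # fraction of enveloped obs
```

A coverage far below the nominal level at a given site is exactly the kind of signal that points to a missing uncertainty source or a structural error.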

How to cite: Eloundou, F. B., Strebel, L., Naz, B. S., Terán, C. P., Vereecken, H., and Hendricks Franssen, H.-J.: Disentangling the role of different sources of uncertainty and model structural error on predictions of water and carbon fluxes with CLM5 for European observation sites, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-16086, 2024.

EGU24-16361 | ECS | Orals | HS3.9

Estimating prior distributions of TCE transformation rate constants from literature data 

Anna Störiko, Albert J. Valocchi, Charles Werth, and Charles E. Schaefer

Stochastic modeling of contaminant reactions requires the definition of prior distributions for the respective rate constants. We use data from several experiments reported in the literature to better understand the distribution of pseudo-first-order rate constants of abiotic TCE reduction in different sediments. These distributions can be used to choose informed priors for these parameters in reactive-transport models.

Groundwater contamination with trichloroethylene (TCE) persists at many hazardous waste sites due to back diffusion from low-permeability zones such as clay lenses. In recent years, the abiotic reduction of TCE by reduced iron minerals has gained attention as a natural attenuation process, but there is uncertainty as to whether the process is fast enough to be effective. Pseudo-first-order rate constants have been determined in laboratory experiments and are reported in the literature for various sediments and rocks, as well as for individual reactive minerals. However, rate constants can vary between sites and aquifer materials. Reported values range over several orders of magnitude.

To assess the uncertainty and variability of pseudo-first-order rate constants, we compiled data reported in several studies. We built a statistical model based on a hierarchical Bayesian approach to predict probability distributions of rate constants at new sites based on this data set. We then investigated whether additional information about the sediment composition at a site could reduce the uncertainty. We tested two sets of predictors: reactive mineral content or the extractable Fe(II) content. Knowing the reactive mineral content reduced the uncertainty only slightly. In contrast, knowing the Fe(II) content greatly reduced the uncertainty because the relationship between Fe(II) content and rate constants is approximately log-log-linear. Using a simple example of diffusion-controlled transport in a contaminated aquitard, we show how the uncertainty in the predicted rate constants affects the predicted remediation times.
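The reported approximately log-log-linear relationship suggests a simple way to form a point prediction of a site's rate constant from its Fe(II) content. The sketch below fits such a relation; the Fe(II) and rate-constant values are invented placeholders, not the compiled literature data, and the full study uses a hierarchical Bayesian model rather than a plain least-squares fit.

```python
import numpy as np

# Hypothetical compiled data: extractable Fe(II) content vs. pseudo-first-order
# rate constants k of abiotic TCE reduction; values are illustrative only.
fe2 = np.array([0.01, 0.05, 0.1, 0.5, 1.0, 2.0])   # Fe(II) content
k = np.array([1e-3, 8e-3, 2e-2, 1e-1, 0.3, 0.5])   # rate constants

# Fit the log-log-linear relationship: log10(k) = a * log10(Fe2) + b
a, b = np.polyfit(np.log10(fe2), np.log10(k), 1)

def predict_k(fe2_new):
    """Point prediction of the rate constant at a new site from its Fe(II) content."""
    return 10 ** (a * np.log10(fe2_new) + b)

print(a > 0, predict_k(0.2) > predict_k(0.02))
```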

How to cite: Störiko, A., Valocchi, A. J., Werth, C., and Schaefer, C. E.: Estimating prior distributions of TCE transformation rate constants from literature data, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-16361, 2024.

Deeper insights into internal model behavior are essential as hydrological models become more and more complex. Our study provides a framework that combines time-varying global sensitivity analysis (GSA) with data-mining techniques to unravel the process-level behavior of high-complexity models and tease out the main information. The extracted information is further used to assist parameter identification. The physically-based Distributed Hydrology-Soil-Vegetation Model (DHSVM), set up in a mountainous watershed, is used as a case study. Specifically, a two-step GSA including time-aggregated and time-variant approaches is conducted to address the problem of high parameter dimensionality and characterize time-varying parameter importance. As we found the long-term dynamics difficult to interpret, a clustering operation is performed to partition the entire period into several clusters and extract the corresponding temporal parameter importance patterns. Finally, the resulting time clusters are utilized in parameterization, where each parameter is identified within its dominant time cluster(s). Results are summarized as follows: (1) the importance of selected soil and vegetation parameters varies greatly throughout the period; (2) typical patterns of parameter importance corresponding to flood, very short dry-to-wet, fast recession, and continuous dry periods are successfully distinguished. We argue that a granularity somewhere between “total period” and “continuous discrete time” can be more useful for understanding and interpretation; (3) parameters dominant for short times are much more identifiable when they are identified within their dominant time cluster(s); (4) the enhanced parameter identifiability overall improves the model performance according to the NSE, LNSE, and RMSE metrics, suggesting that the use of GSA information has the potential to provide a better search for optimal parameter sets.
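The clustering step described above can be sketched minimally, assuming time-varying sensitivity indices are already available as a (time × parameter) array; the synthetic data and the plain k-means routine below are illustrative stand-ins for the study's actual GSA output and clustering method.

```python
import numpy as np

def cluster_periods(sensitivity, k=3, iters=20):
    """Partition time steps into clusters with similar parameter-importance
    patterns using plain k-means; sensitivity has shape (n_times, n_params)."""
    # deterministic init: evenly spaced time steps serve as starting centers
    idx = np.linspace(0, len(sensitivity) - 1, k).astype(int)
    centers = sensitivity[idx].astype(float).copy()
    for _ in range(iters):
        dists = np.linalg.norm(sensitivity[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = sensitivity[labels == j].mean(axis=0)
    return labels

# Synthetic daily sensitivity indices for 3 parameters over 90 days: one
# parameter dominates the first half of each month, another the second half.
t = np.arange(90)
wet = (t % 30 < 15).astype(float)
S = np.stack([0.8 - 0.6 * wet, 0.2 + 0.6 * wet, np.full(90, 0.1)], axis=1)
labels = cluster_periods(S, k=2)
print(len(np.unique(labels)))  # → 2
```

Each parameter can then be calibrated using only the time steps belonging to the cluster(s) where it dominates.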

How to cite: Wang, L., Xu, Y., Gu, H., and Liang, X.: Investigating dynamic parameter importance of a high-complexity hydrological model and implications for parameterization, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-18569, 2024.

EGU24-18804 | ECS | Orals | HS3.9

Accelerating Hydrological Model Inversion: A Multilevel Approach to GLUE 

Max Rudolph, Thomas Wöhling, Thorsten Wagener, and Andreas Hartmann

Inverse problems play a pivotal role in hydrological modelling, particularly for parameter estimation and system understanding, which are essential for managing water resources. The application of statistical inversion methodologies such as Generalized Likelihood Uncertainty Estimation (GLUE) is often obstructed, however, by high model computational cost, given that Monte Carlo sampling strategies often return a very small fraction of behavioural model runs. This aspect must nevertheless be balanced against the demand for broadly sampling the parameter space. Especially relevant for spatially distributed or (partial) differential equation based models, this calls for computationally efficient methods of statistical inference that approximate the “true” posterior parameter distribution well. Our study introduces multilevel GLUE (MLGLUE), which mitigates these computational challenges by exploiting a hierarchy of models with different computational grid resolutions (i.e., spatial or temporal discretisation), inspired by multilevel Monte Carlo strategies. Starting with low-resolution models, MLGLUE passes parameter samples to higher-resolution models for evaluation only if they are associated with a high likelihood, which offers substantial potential for computational savings. We demonstrate the applicability of the approach using a groundwater flow model with a hierarchy of different spatial resolutions. With MLGLUE, the computation time of parameter inference could be reduced by more than 60% compared to GLUE, while the resulting posterior distributions are virtually identical. Correspondingly, the uncertainty estimates of MLGLUE and GLUE are also very similar. Considering the simplicity of the implementation as well as its efficiency, MLGLUE promises to be an alternative for statistical inversion of computationally costly hydrological models.
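The level-wise screening idea behind MLGLUE can be sketched as follows; the toy models, likelihood, and thresholds are invented for illustration and do not reproduce the study's groundwater application.

```python
import numpy as np

def mlglue(sample_prior, models, likelihood, n_samples, thresholds):
    """Sketch of the MLGLUE screening idea: a parameter sample is evaluated
    on the next (finer, costlier) model level only if its likelihood on the
    current level exceeds that level's behavioural threshold."""
    behavioural = []
    for _ in range(n_samples):
        theta = sample_prior()
        for model, thresh in zip(models, thresholds):
            if likelihood(model(theta)) < thresh:
                break  # rejected cheaply; finer levels are never run
        else:
            behavioural.append(theta)
    return behavioural

# Toy hierarchy: a biased coarse model and an accurate fine model of y = theta^2,
# with observations suggesting y ≈ 4, so behavioural samples cluster near |theta| = 2.
rng = np.random.default_rng(1)
coarse = lambda th: th**2 + 0.3           # cheap, slightly biased
fine = lambda th: th**2                   # accurate, "expensive"
like = lambda y: np.exp(-0.5 * (y - 4.0)**2)
kept = mlglue(lambda: rng.uniform(-4, 4), [coarse, fine], like, 2000, [0.2, 0.5])
print(len(kept) > 0, all(abs(abs(th) - 2.0) < 1.0 for th in kept))
```

The savings come from the coarse level rejecting most non-behavioural samples before the fine model is ever invoked.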

How to cite: Rudolph, M., Wöhling, T., Wagener, T., and Hartmann, A.: Accelerating Hydrological Model Inversion: A Multilevel Approach to GLUE, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-18804, 2024.

EGU24-19966 | Orals | HS3.9

Operational Sensitivity Analysis for Flooding in Urban Systems under Uncertainty 

Aronne Dell'Oca, Monica Riva, Alberto Guadagnini, and Leonardo Sandoval

The runoff process in environmental systems is influenced by various variables that are typically affected by uncertainty. These include, for example, climate and hydrogeological quantities (hereafter denoted as environmental variables). Additionally, the runoff process is influenced by quantities that are amenable to intervention/design (hereafter denoted as operational variables) and can therefore be set to desired values on the basis of specific management choices. A key question in this context is: how do we discriminate the impact on system outputs of operational variables, whose values can be decided in the system design or management phase, while also considering the uncertainty associated with environmental variables? We tackle this issue by introducing a novel approach, which we term Operational Sensitivity Analysis (OSA), set within a Global Sensitivity Analysis (GSA) framework. OSA enables us to assess the sensitivity of a given model output specifically to operational factors, while recognizing uncertainty in the environmental variables. This approach is developed as a complement to a traditional GSA, which does not differentiate, at the methodological level, between the types of variability associated with operational and environmental variables.

We showcase our OSA approach through an exemplary scenario associated with an urban catchment where flooding results from sewer system failure. In this context, we distinguish between operational variables, such as sewer system pipe properties and urban area infiltration capacity, and environmental variables, such as urban catchment drainage properties and rain event characteristics. Our results suggest that the diameter of a set of pipes in the sewer network is the most influential operational variable. As such, it provides a rigorous basis upon which one could plan appropriate actions to effectively manage the system response.

How to cite: Dell'Oca, A., Riva, M., Guadagnini, A., and Sandoval, L.: Operational Sensitivity Analysis for Flooding in Urban Systems under Uncertainty, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-19966, 2024.

EGU24-20013 | ECS | Orals | HS3.9

Field-scale soil moisture predictions using in situ sensor measurements in an inverse modelling framework: SWIM² 

Marit Hendrickx, Jan Diels, Jan Vanderborght, and Pieter Janssens

With the rise of affordable, autonomous sensors and IoT (Internet-of-Things) technology, it is possible to monitor soil moisture in a field online and in real time. This offers opportunities for real-time model calibration for irrigation scheduling. A framework is presented where real-time sensor data are coupled with a soil water balance model to predict soil moisture content and irrigation requirements at field scale. SWIM², Sensor Wielded Inverse Modelling of a Soil Water Irrigation Model, is a framework based on the DREAM inverse modelling approach to estimate 12 model parameters (soil and crop growth parameters) and their uncertainty distribution. These parameter distributions result in soil moisture predictions with a prediction uncertainty estimate, which enables a farmer to anticipate droughts and estimate irrigation requirements.

The SWIM² framework was validated on three growing seasons (2021-2023) in about 30 fields of vegetable growers in Flanders. The Kullback–Leibler divergence (KLD) was used as a metric to quantify the information gain of the model parameters starting from non-informative priors. Performance was validated in two steps, i.e. over the calibration period and the prediction period, corresponding to the real-world implementation of the framework. The RMSE, correlation (R, NSE), and Kling-Gupta efficiency (KGE) of soil moisture were analyzed as a function of time, i.e. the amount of sensor data available for calibration.

Soil moisture can be predicted accurately once 10 to 20 days of sensor data are available for calibration. The RMSE during the calibration period is generally around 0.02 m³/m³, while the RMSE during the prediction period decreases from 0.04 to 0.02 m³/m³ as more calibration data become available. The information gain (KLD) of some parameters (e.g. field capacity and curve number) largely depends on the presence of dynamic events (e.g. precipitation events) during the calibration period. After 40 days of sensor data, the KGE and Pearson correlation of the calibration period become stable, with median values of 0.8 and 0.9, respectively. For the validation period, the KGE and Pearson correlation increase over time, with median values rising from 0.3 to 0.7 (KGE) and from 0.7 to 0.95 (R). These results show that, with this framework, we can simulate and predict soil moisture accurately. These predictions can in turn be used to estimate irrigation requirements.
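For reference, the KGE reported above is the standard Kling-Gupta efficiency, combining correlation, variability bias, and mean bias into one score; a minimal implementation with synthetic soil moisture values is:

```python
import numpy as np

def kge(sim, obs):
    """Kling-Gupta efficiency: 1 - sqrt((r-1)^2 + (alpha-1)^2 + (beta-1)^2),
    with r the correlation, alpha the std ratio, beta the mean ratio."""
    r = np.corrcoef(sim, obs)[0, 1]
    alpha = np.std(sim) / np.std(obs)
    beta = np.mean(sim) / np.mean(obs)
    return 1.0 - np.sqrt((r - 1.0)**2 + (alpha - 1.0)**2 + (beta - 1.0)**2)

obs = np.array([0.25, 0.28, 0.31, 0.30, 0.27, 0.24])  # synthetic soil moisture (m³/m³)
print(round(kge(obs, obs), 3))  # perfect simulation scores 1.0
```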

In the main framework, precipitation radar data was treated as an input without uncertainty. As an extension, precipitation forcing error can be treated in DREAM by applying rainfall multipliers as additional parameters that are estimated in the inverse modelling framework. The multiplicative error of the radar data was quantified by comparing radar data to rain gauge measurements. The prior uncertainty of the logarithmic multipliers was described by a Laplace distribution and implemented in DREAM. The extended framework with rainfall multipliers shows better convergence and a higher acceptance rate than the main framework. The calibration period shows better performance, with higher correlations and lower RMSE values, but a decrease in performance was found for the validation period. These results suggest that the implementation of rainfall multipliers leads to overfitting, resulting in lower predictive power.
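A minimal sketch of the rainfall-multiplier idea, assuming a Laplace prior on the logarithmic multipliers as described; the radar values and prior scale are invented placeholders, and in the actual framework the multipliers are estimated by DREAM rather than sampled from the prior.

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical radar precipitation for 5 storm events (mm); values illustrative.
radar_p = np.array([12.0, 3.5, 20.0, 8.0, 15.0])

# Log-multipliers drawn from a Laplace prior (location 0, assumed scale b).
b = 0.2
log_m = rng.laplace(loc=0.0, scale=b, size=radar_p.size)

# Each event's forcing is corrected multiplicatively before the model run,
# which keeps the corrected precipitation strictly positive.
corrected_p = radar_p * np.exp(log_m)
print(corrected_p.shape, bool(np.all(corrected_p > 0)))
```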

How to cite: Hendrickx, M., Diels, J., Vanderborght, J., and Janssens, P.: Field-scale soil moisture predictions using in situ sensor measurements in an inverse modelling framework: SWIM², EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-20013, 2024.

In recent years, Machine Learning (ML) models have led to a substantial improvement in hydrological predictions. It appears these models can distill information from catchment properties that is relevant for the relationship between meteorological drivers and streamflow, which has so far eluded hydrologists.
In the first part of this talk, I shall demonstrate some of our attempts towards understanding these improvements. Utilising Autoencoders and intrinsic dimension estimators, we have shown that the wealth of available catchment properties can effectively be summarised into merely three features, insofar as they are relevant for streamflow prediction. Hybrid models, which combine the flexibility of ML models with mechanistic mass-balance models, are equally adept at predicting as pure ML models but come with only a few interpretable interior states. Combining these findings will, hopefully, bring us closer to understanding what these ML models seem to have 'grasped'.
In the second part of the talk, I will address the issue of uncertainty quantification. I contend that error modelling should not be attempted on the residuals. Rather, we should model the errors where they originate, i.e., on the inputs, model states, and/or parameters. Such stochastic models are more adept at expressing the intricate distributions exhibited by real data. However, they come at the cost of a very large number of unobserved latent variables and thus pose a high-dimensional inference problem. This is particularly pertinent when our models include ML components. Fortunately, advances in inference algorithms and parallel computing infrastructure continue to extend the limits on the number of variables that can be inferred within a reasonable timeframe. I will present a straightforward example of a stochastic hydrological model with input uncertainty, where Hamiltonian Monte Carlo enables a comprehensive Bayesian inference of model parameters and the actual rain time-series simultaneously.

How to cite: Albert, C.: Advances and prospects in hydrological (error) modelling, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-20170, 2024.

EGU24-262 | Orals | HS3.5

Differentiable modeling for global water resources under global change 

Chaopeng Shen, Yalan Song, Farshid Rahmani, Tadd Bindas, Doaa Aboelyazeed, Kamlesh Sawadekar, Martyn Clark, and Wouter Knoben

Process-based modeling offers interpretability and physical consistency in many domains of geosciences but struggles to leverage large datasets efficiently. Machine-learning methods, especially deep networks, have strong predictive skills yet are unable to answer specific scientific questions. A recently proposed genre of physics-informed machine learning, called “differentiable” modeling (DM), trains neural networks (NNs) together with process-based equations (priors) in one stage (so-called “end-to-end”) to benefit from the best of both NNs and process-based paradigms. The NNs do not need target variables for training but can be indirectly supervised by observations matching the outputs of the combined model, and differentiability critically supports learning from big data. We propose that differentiable models are especially suitable as global hydrologic models because they can harvest information from big earth observations to produce state-of-the-art predictions, enable physical interpretation naturally, extrapolate well in space and time (due to physical constraints), enforce known physical laws and sensitivities, and leverage progress in modern AI computing architecture and infrastructure. Differentiable models can also synergize with existing global hydrologic models (GHMs) and learn from the lessons of the community. Differentiable GHMs can help answer pressing societal questions on water resources availability, climate change impact assessment, water management, and disaster risk mitigation, among others. We demonstrate the power of differentiable modeling using computational examples in rainfall-runoff modeling, river routing, and forcing fusion, as well as applications in water-related domains such as ecosystem modeling and water quality modeling. We discuss how to address potential challenges such as implementing gradient tracking for implicit numerical schemes and addressing process tradeoffs.
Furthermore, we show how differentiable modeling can enable us to ask fundamental questions in hydrologic sciences and get robust answers from big global data.
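The end-to-end supervision idea — training a neural component only through the output of a combined process-based model — can be illustrated with a deliberately tiny example. Here a one-weight "network" maps a catchment attribute to a runoff coefficient inside the trivial relation q = c·p, with the gradient written out by hand; real differentiable models use automatic differentiation and far richer physics, and all values below are synthetic.

```python
import numpy as np

rng = np.random.default_rng(0)
attr = 0.6                            # static catchment attribute
p = rng.uniform(1.0, 10.0, 200)       # precipitation forcing
q_obs = 0.4 * p                       # synthetic "observed" streamflow

sigmoid = lambda x: 1.0 / (1.0 + np.exp(-x))
w, lr = 0.0, 0.5
for _ in range(500):
    c = sigmoid(w * attr)             # NN output: runoff coefficient (never directly observed)
    q = c * p                         # process-based relation
    # chain rule through the whole model: dL/dw = dL/dq * dq/dc * dc/dw
    grad = np.mean(2.0 * (q - q_obs) * p) * c * (1.0 - c) * attr
    w -= lr * grad
print(round(float(sigmoid(w * attr)), 2))  # → 0.4
```

The runoff coefficient is recovered even though it has no training target of its own — only streamflow supervises the weight, which is the essence of the indirect supervision described above.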

How to cite: Shen, C., Song, Y., Rahmani, F., Bindas, T., Aboelyazeed, D., Sawadekar, K., Clark, M., and Knoben, W.: Differentiable modeling for global water resources under global change, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-262, 2024.

Streamflow can be affected by numerous factors, such as solar radiation, underlying surface conditions, and atmospheric circulation, which result in nonlinearity, uncertainty, and randomness in streamflow time series. Diverse conventional and Deep Learning (DL) models have been applied to recognize the complex patterns and discover nonlinear relationships in hydrological time series; incorporating multiple variables in deep learning can match or improve streamflow forecasts and offers hope for improving extreme-value predictions. Multivariate approaches surpass univariate ones by including additional time series as explanatory variables. Deep neural networks (DNNs) excel in multi-horizon time series forecasting, outperforming classical models. However, determining the relative contribution of each variable to streamflow remains challenging due to the black-box nature of DL models.


We propose utilizing the advanced Temporal Fusion Transformer (TFT) deep-learning technique to model streamflow values across various temporal scales, incorporating multiple variables. TFT's attention-based architecture enables high-performance multi-horizon forecasting with interpretable insights into temporal dynamics. Additionally, the model identifies the significance of each input variable, recognizes persistent temporal patterns, and highlights extreme events. Despite its application in a few studies across different domains, the full potential of this model remains largely unexplored. The study focused on Sundargarh, an upper catchment of the Mahanadi basin in India, aiming to capture pristine flow conditions. QGIS was employed to delineate the catchment, and daily streamflow data from 1982 to 2020 were obtained from the Central Water Commission. Input variables included precipitation, potential evaporation, temperature, and soil water volume at different depths. Precipitation and temperature datasets were obtained from India Meteorological Department (IMD) datasets, while other variables were sourced from the ECMWF fifth-generation reanalysis (ERA-5). Hyperparameter tuning was conducted using the Optuna optimization framework, known for its efficiency and easy parallelization. The model, trained using a quantile loss function with different combinations of quantiles, demonstrated superior performance with upper quantiles. Evaluations using R² and NSE indicated good performance in monthly streamflow predictions for testing sets, particularly in confidently predicting low and medium flows. While peak flows were well predicted at certain timesteps, there were instances of underperformance. Unlike other ML algorithms, TFT can learn seasonality and lag-analysis patterns directly from raw training data, including the identification of crucial variables.
The model underwent training for different time periods, checking for performance improvement with increased data length. To better understand how distinct sub-processes affect streamflow patterns at various time scales, the model was applied at pentad and daily scales. Evaluation at extreme values prompted an investigation into improving predictions through quantile loss function adjustments. Given the computational expense of daily streamflow forecasting using TFT with multiple variables, parallel computing was employed. Results demonstrated considerable accuracy, but validating TFT's interpretive abilities requires testing alternative ML models.
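The quantile (pinball) loss used to train quantile forecasters such as TFT penalizes under- and over-prediction asymmetrically, which is why fitting upper quantiles pushes predictions toward peak flows; a minimal version with synthetic flows:

```python
import numpy as np

def quantile_loss(y, y_hat, q):
    """Pinball loss: under-prediction is weighted by q, over-prediction by (1 - q)."""
    e = y - y_hat
    return np.mean(np.maximum(q * e, (q - 1.0) * e))

y = np.array([100.0, 250.0, 80.0, 400.0])  # synthetic streamflow
# A forecast that systematically under-predicts is punished more at q = 0.9
# than at q = 0.1, so training at upper quantiles lifts peak-flow estimates.
print(quantile_loss(y, 0.9 * y, 0.1) < quantile_loss(y, 0.9 * y, 0.9))
```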


How to cite: Mohan, M. and Kumar D, N.: Multivariate multi-horizon streamflow forecasting for extremes and their interpretation using an explainable deep learning architecture, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-451, 2024.

EGU24-2211 | ECS | Posters on site | HS3.5

Staged Learning in Physics-Informed Neural Networks to Model Contaminant Transport under Parametric Uncertainty 

Milad Panahi, Giovanni Porta, Monica Riva, and Alberto Guadagnini

Addressing the complexities of groundwater modeling, especially under the veil of uncertain physical parameters and limited observational data, poses significant challenges. This study introduces an approach using the Physics-Informed Neural Network (PINN) framework to unravel these uncertainties. Termed PINN under uncertainty (PINN-UU), the method adeptly integrates uncertain parameters within spatio-temporal domains, focusing on hydrological systems. This approach, built exclusively on the underlying physical equations, leverages a staged training methodology, effectively navigating high-dimensional solution spaces. We demonstrate our approach through application to reactive transport modeling in porous media, a problem setting relevant to contaminant transport in soil and groundwater. PINN-UU shows promising capabilities in enhancing model reliability and efficiency, and in conducting sensitivity analysis. Our approach is designed to be accessible and engaging, offering insightful contributions to environmental engineering and hydrological modeling. It represents a step toward deciphering complex geohydrological systems, with broad implications for resource management and environmental science.

How to cite: Panahi, M., Porta, G., Riva, M., and Guadagnini, A.: Staged Learning in Physics-Informed Neural Networks to Model Contaminant Transport under Parametric Uncertainty, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-2211, 2024.

EGU24-2850 | ECS | Orals | HS3.5

Development of a Distributed Physics-informed Deep Learning Hydrological Model for Data-scarce Regions 

Liangjin Zhong, Huimin Lei, and Jingjing Yang

Climate change has exacerbated water stress and water-related disasters, necessitating more precise runoff simulations. However, in the majority of global regions, a deficiency of runoff data constitutes a significant constraint on modeling endeavors. Traditional distributed hydrological models and regionalization approaches have shown suboptimal performance. While current data-driven models trained on large datasets excel in spatial extrapolation, the direct applicability of these models in certain regions with unique hydrological processes may be challenging due to the limited representativeness within the training dataset. Furthermore, transfer learning deep learning models pre-trained on large datasets still necessitate local data for retraining, thereby constraining their applicability. To address these challenges, we present a physics-informed deep learning model based on a distributed framework. It involves spatial discretization and the establishment of differentiable hydrological models for discrete sub-basins, coupled with a differentiable Muskingum method for channel routing. By introducing upstream-downstream relationships, model errors in sub-basins propagate through the river network to the watershed outlet, enabling the optimization using limited downstream runoff data, thereby achieving spatial simulation of ungauged internal sub-basins. The model, when trained solely on the downstream-most station, outperforms the distributed hydrological model in runoff simulation at both the training station and upstream stations, as well as evapotranspiration spatial patterns. Compared to transfer learning, our model requires less training data, yet achieves higher precision in simulating runoff on spatially hold-out stations and provides more accurate estimates of spatial evapotranspiration. Consequently, this model offers a novel approach to hydrological simulation in data-scarce regions with unique processes.
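Channel routing via the Muskingum method, mentioned above, follows a simple recurrence; a standard (non-differentiable) numpy version with an invented hydrograph is sketched below — in the study the same operations would be implemented so that gradients can propagate through them and through the river network.

```python
import numpy as np

def muskingum_route(inflow, K=2.0, x=0.2, dt=1.0):
    """Classic Muskingum routing: O_t = c0*I_t + c1*I_{t-1} + c2*O_{t-1},
    with storage constant K, weighting factor x, and time step dt."""
    d = K - K * x + 0.5 * dt
    c0 = (-K * x + 0.5 * dt) / d
    c1 = (K * x + 0.5 * dt) / d
    c2 = (K - K * x - 0.5 * dt) / d
    out = np.empty_like(inflow)
    out[0] = inflow[0]
    for t in range(1, len(inflow)):
        out[t] = c0 * inflow[t] + c1 * inflow[t - 1] + c2 * out[t - 1]
    return out

# Synthetic upstream hydrograph: routing attenuates and delays the peak.
inflow = np.array([10., 20., 50., 80., 60., 40., 25., 15., 10., 10.])
outflow = muskingum_route(inflow)
print(bool(outflow.max() < inflow.max()), int(outflow.argmax()) > int(inflow.argmax()))
```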

How to cite: Zhong, L., Lei, H., and Yang, J.: Development of a Distributed Physics-informed Deep Learning Hydrological Model for Data-scarce Regions, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-2850, 2024.

EGU24-3028 | Orals | HS3.5 | Highlight

Spatial sensitivity of river flooding to changes in climate and land cover through explainable AI 

Louise Slater, Gemma Coxon, Manuela Brunner, Hilary McMillan, Le Yu, Yanchen Zheng, Abdou Khouakhi, Simon Moulds, and Wouter Berghuijs

Explaining the spatially variable impacts of flood-generating mechanisms is a longstanding challenge in hydrology, with increasing and decreasing temporal flood trends often found in close regional proximity. Here, we develop a machine learning-informed approach to unravel the drivers of seasonal flood magnitude and explain the spatial variability of their effects in a temperate climate. We employ 11 observed meteorological and land cover time series variables alongside 8 static catchment attributes to model flood magnitude in 1268 catchments across Great Britain over four decades. We then perform a sensitivity analysis to understand how +10% precipitation, +1°C air temperature, or +10 percentage points of urbanisation or afforestation affect flood magnitude in catchments with varying characteristics. Our simulations show that increasing precipitation and urbanisation both tend to amplify flood magnitude significantly more in catchments with high baseflow contribution and low runoff ratio, which tend to have lower values of specific discharge on average. In contrast, rising air temperature (in the absence of changing precipitation) decreases flood magnitudes, with the largest effects in dry catchments with low baseflow index. Afforestation also tends to decrease floods more in catchments with low groundwater contribution, and in dry catchments in the summer. These reported associations are significant at p<0.001. Our approach may be used to further disentangle the joint effects of multiple flood drivers in individual catchments.
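The perturbation experiment described above — re-predicting with one driver shifted and comparing against the baseline — can be sketched with a toy surrogate; the linear model and synthetic data below stand in for the paper's machine-learning model and 1268-catchment dataset.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 500
precip = rng.uniform(500, 2000, n)     # annual precipitation (mm), synthetic
urban = rng.uniform(0, 50, n)          # % urban cover, synthetic
flood = 0.05 * precip + 2.0 * urban + rng.normal(0, 5, n)  # synthetic flood magnitude

# Fit a simple surrogate of flood magnitude from the drivers.
X = np.column_stack([np.ones(n), precip, urban])
coef, *_ = np.linalg.lstsq(X, flood, rcond=None)

# Sensitivity experiment: +10% precipitation scenario, everything else fixed.
base = X @ coef
Xp = X.copy()
Xp[:, 1] *= 1.10
delta = (Xp @ coef - base).mean()
print(bool(delta > 0))  # wetter scenario raises predicted flood magnitude
```

Repeating this per catchment and relating the resulting deltas to catchment attributes (baseflow index, runoff ratio, etc.) gives the kind of spatial sensitivity map discussed above.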

How to cite: Slater, L., Coxon, G., Brunner, M., McMillan, H., Yu, L., Zheng, Y., Khouakhi, A., Moulds, S., and Berghuijs, W.: Spatial sensitivity of river flooding to changes in climate and land cover through explainable AI, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-3028, 2024.

EGU24-4105 | ECS | Orals | HS3.5

Global flood projection and socioeconomic implications under a physics-constrained deep learning framework 

Shengyu Kang, Jiabo Yin, Louise Slater, Pan Liu, and Dedi Liu

As the planet warms, the frequency and severity of weather-related hazards such as floods are intensifying, posing substantial threats to communities around the globe. Rising flood peaks and volumes can claim lives, damage infrastructure, and compromise access to essential services. However, the physical mechanisms behind global flood evolution are still uncertain, and their implications for socioeconomic systems remain unclear. In this study, we leverage a supervised machine learning technique to identify the dominant factors influencing daily streamflow. We then propose a physics-constrained cascade model chain which assimilates water and heat transport processes to project bivariate risk (i.e. flood peak and volume together), along with its socioeconomic consequences. To achieve this, we drive a hybrid deep learning-hydrological model with bias-corrected outputs from twenty global climate models (GCMs) under four shared socioeconomic pathways (SSPs). Our results project considerable increases in flood risk under the medium to high-end emission scenario (SSP3-7.0) over most catchments of the globe. The median future joint return period decreases from 50 years to around 27.6 years, with 186 trillion dollars and 4 billion people exposed. Downwelling shortwave radiation is identified as the dominant factor driving changes in daily streamflow, accelerating both terrestrial evapotranspiration and snowmelt. As future scenarios project enhanced radiation levels along with an increase in precipitation extremes, a heightened risk of widespread flooding is foreseen. This study aims to provide valuable insights for policymakers developing strategies to mitigate the risks associated with river flooding under climate change.

How to cite: Kang, S., Yin, J., Slater, L., Liu, P., and Liu, D.: Global flood projection and socioeconomic implications under a physics-constrained deep learning framework, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-4105, 2024.

EGU24-4238 | ECS | Posters on site | HS3.5

Letting neural networks talk: exploring two probabilistic neural network models for input variable selection 

John Quilty and Mohammad Sina Jahangir

Input variable selection (IVS) is an integral part of building data-driven models for hydrological applications. Carefully chosen input variables enable data-driven models to discern relevant patterns and relationships within data, improving their predictive accuracy. Moreover, the optimal choice of input variables can enhance the computational efficiency of data-driven models, reduce overfitting, and contribute to a more interpretable and parsimonious model. Meanwhile, including irrelevant and/or redundant input variables can introduce noise to the model and hinder its generalization ability.

Three probabilistic IVS methods, namely Edgeworth approximation-based conditional mutual information (EA), double-layer extreme learning machine (DLELM), and gradient mapping (GM), were used for IVS and then coupled with a long short-term memory (LSTM)-based probabilistic deep learning model for daily streamflow prediction. While the EA method is an effective IVS method, DLELM and GM are examples of probabilistic neural network-based IVS methods that have not yet been explored for hydrological prediction. DLELM selects input variables through sparse Bayesian learning, pruning both input and output layer weights of a committee of neural networks. GM is based on saliency mapping, an explainable AI technique commonly used in computer vision that can be coupled with probabilistic neural networks. Both DLELM and GM involve randomization during parameter initialization and/or training thereby introducing stochasticity into the IVS procedure, which has been shown to improve the predictive performance of data-driven models.

The IVS methods were coupled with an LSTM-based probabilistic deep learning model and applied to a streamflow prediction case study using 420 basins spread across the continental United States. The dataset includes 37 candidate input variables derived from daily-averaged ERA5 reanalysis data.

Comparing the input variables most frequently selected by EA, DLELM, and GM across the 420 basins revealed that the three methods select broadly similar sets of variables; for example, nine of the top 15 input variables were common to all three methods.

The input variables selected by EA, DLELM, and GM were then used in the LSTM-based probabilistic deep learning models for streamflow prediction across the 420 basins. The probabilistic deep learning models were developed and optimized using the top 10 variables selected by each IVS method. The results were compared to a benchmark scenario that used all 37 ERA5 variables in the prediction model. Overall, the findings show that the GM method yields higher prediction accuracy (Kling-Gupta efficiency; KGE) than the other two IVS methods: GM achieved a median KGE of 0.63, whereas the EA, DLELM, and all-input-variables scenarios achieved KGE scores of 0.61, 0.60, and 0.62, respectively.
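The KGE scores above decompose prediction skill into correlation, variability, and bias components. A minimal sketch of the standard formulation (Gupta et al., 2009), with illustrative series:

```python
import math
from statistics import mean, pstdev

def kge(sim, obs):
    """Kling-Gupta efficiency (Gupta et al., 2009):
    KGE = 1 - sqrt((r - 1)**2 + (alpha - 1)**2 + (beta - 1)**2)."""
    mu_s, mu_o = mean(sim), mean(obs)
    sd_s, sd_o = pstdev(sim), pstdev(obs)
    cov = mean([(s - mu_s) * (o - mu_o) for s, o in zip(sim, obs)])
    r = cov / (sd_s * sd_o)          # linear correlation
    alpha = sd_s / sd_o              # variability ratio
    beta = mu_s / mu_o               # bias ratio
    return 1 - math.sqrt((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2)

obs = [1.0, 2.0, 3.0, 4.0, 5.0]
print(round(kge(obs, obs), 6))                      # perfect simulation
print(round(kge([x * 1.1 for x in obs], obs), 3))   # uniform 10% positive bias
```

Unlike NSE, a uniform bias penalizes KGE through both the alpha and beta terms, which is why KGE is often preferred for comparing probabilistic streamflow models.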

DLELM and GM are two AI-based techniques that introduce elements of interpretability and stochasticity to the IVS process. The results of the current study are expected to contribute to the evolving landscape of data-driven hydrological modeling by introducing hitherto unexplored neural network-based IVS to pursue more parsimonious, efficient, and interpretable probabilistic deep learning models.

How to cite: Quilty, J. and Jahangir, M. S.: Letting neural networks talk: exploring two probabilistic neural network models for input variable selection, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-4238, 2024.

EGU24-4325 | ECS | Posters on site | HS3.5

Towards learning human influences in a highly regulated basin using a hybrid DL-process based framework 

Liangkun Deng, Xiang Zhang, and Louise Slater

Hybrid models have shown impressive performance for streamflow simulation, offering better accuracy than process-based hydrological models (PBMs) and better interpretability than deep learning models (DLMs). A recent paradigm for streamflow modeling, integrating DLMs and PBMs within a differentiable framework, presents considerable potential to match the performance of DLMs while simultaneously generating untrained variables that describe the entire water cycle. However, the potential of this framework has mostly been verified in small, unregulated headwater basins and has not been explored in large, highly regulated basins. Human activities, such as reservoir operations and water transfer projects, have greatly changed natural hydrological regimes. Given the limited access to operational water management records, PBMs generally fail to achieve satisfactory performance and DLMs are difficult to train directly. This study proposes a coupled hybrid framework to address these problems. The framework is based on a distributed PBM, the Xin'anjiang (XAJ) model, and adopts embedded deep learning neural networks to learn the physical parameters and replace the modules of the XAJ model reflecting human influences through a differentiable structure. Streamflow observations alone are used as training targets, eliminating the need for operational records to supervise the training process. The Hanjiang River basin (HRB), one of the largest subbasins of the Yangtze River basin and one disturbed by large reservoirs and national water transfer projects, is selected to test the effectiveness of the framework. The results show that the hybrid framework can learn the parameter sets of the XAJ model that best depict natural and human influences, improving streamflow simulation. It performs better than a standalone XAJ model and achieves similar performance to a standalone LSTM model. This framework sheds new light on assimilating human influences to improve simulation performance in disturbed river basins with limited operational records.
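The core trick in such differentiable frameworks is to let a network emit unbounded values that are squashed into physically plausible parameter ranges, so that streamflow error alone can supervise the embedded network. A highly simplified sketch, assuming a toy linear reservoir in place of the XAJ modules and finite-difference gradients in place of automatic differentiation:

```python
import math

def bounded(raw, lo, hi):
    """Map an unbounded network output into a physical parameter range."""
    return lo + (hi - lo) / (1.0 + math.exp(-raw))

def simulate(k, precip, s0=0.0):
    """Toy linear reservoir (a stand-in for one conceptual module): Q = k * S."""
    s, q = s0, []
    for p in precip:
        s += p
        out = k * s
        s -= out
        q.append(out)
    return q

def loss(raw_weight, precip, q_obs):
    # Recession coefficient constrained to (0.01, 0.99)
    k = bounded(raw_weight, 0.01, 0.99)
    q_sim = simulate(k, precip)
    return sum((a - b) ** 2 for a, b in zip(q_sim, q_obs))

precip = [5.0, 0.0, 3.0, 0.0, 0.0]
q_obs = simulate(0.3, precip)          # synthetic "observations" with k = 0.3
w = 0.0                                # raw network output to be learned
for _ in range(200):                   # plain gradient descent on streamflow error
    eps = 1e-6
    g = (loss(w + eps, precip, q_obs) - loss(w - eps, precip, q_obs)) / (2 * eps)
    w -= 0.05 * g
k_learned = bounded(w, 0.01, 0.99)
print(round(k_learned, 3))             # recovers approximately 0.3
```

Because every step of the simulation is smooth in the raw weight, the only supervision needed is the mismatch between simulated and observed streamflow, which is exactly what lets such frameworks dispense with operational records.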

How to cite: Deng, L., Zhang, X., and Slater, L.: Towards learning human influences in a highly regulated basin using a hybrid DL-process based framework, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-4325, 2024.

EGU24-4768 | ECS | Orals | HS3.5

HydroPML: Towards Unified Scientific Paradigms for Machine Learning and Process-based Hydrology 

Qingsong Xu, Yilei Shi, Jonathan Bamber, Ye Tuo, Ralf Ludwig, and Xiao Xiang Zhu

Accurate hydrological understanding and water cycle prediction are crucial for addressing scientific and societal challenges associated with the management of water resources, particularly under the dynamic influence of anthropogenic climate change. Existing work predominantly concentrates on the development of machine learning (ML) in this field, yet there is a clear distinction between hydrology and ML as separate paradigms. Here, we introduce physics-aware ML as a transformative approach to overcome the perceived barrier and revolutionize both fields. Specifically, we present a comprehensive review of the physics-aware ML methods, building a structured community (PaML) of existing methodologies that integrate prior physical knowledge or physics-based modeling into ML. We systematically analyze these PaML methodologies with respect to four aspects: physical data-guided ML, physics-informed ML, physics-embedded ML, and physics-aware hybrid learning. PaML facilitates ML-aided hypotheses, accelerating insights from big data and fostering scientific discoveries. We initiate a systematic exploration of hydrology in PaML, including rainfall-runoff and hydrodynamic processes, and highlight the most promising and challenging directions for different objectives and PaML methods. Finally, a new PaML-based hydrology platform, termed HydroPML, is released as a foundation for applications based on hydrological processes [1]. HydroPML presents a range of hydrology applications, including but not limited to rainfall-runoff-inundation modeling, real-time flood forecasting (FloodCast), rainfall-induced landslide forecasting (LandslideCast), and cutting-edge PaML methods, to enhance the explainability and causality of ML and lay the groundwork for the digital water cycle's realization. The HydroPML platform is publicly available at

[1] Xu, Qingsong, et al. "Physics-aware Machine Learning Revolutionizes Scientific Paradigm for Machine Learning and Process-based Hydrology." arXiv preprint arXiv:2310.05227 (2023).

How to cite: Xu, Q., Shi, Y., Bamber, J., Tuo, Y., Ludwig, R., and Zhu, X. X.: HydroPML: Towards Unified Scientific Paradigms for Machine Learning and Process-based Hydrology, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-4768, 2024.

EGU24-6378 | ECS | Posters on site | HS3.5

Seasonal forecasts of hydrological droughts over the Alps: advancing hybrid modelling applications 

Iacopo F. Ferrario, Mariapina Castelli, Alasawedah M. Hussein, Usman M. Liaqat, Albrecht Weerts, and Alexander Jacob

The Alpine region is often called the Water Tower of Europe, alluding to its water richness and its role in supplying water to several important European rivers that flow well beyond its geographical boundaries. Climate change projections show that the region will likely experience rising temperatures and changes in precipitation type, frequency, and intensity, with consequences for the spatiotemporal pattern of water availability. Seasonal forecasts could supply timely information for planning water allocation a few months in advance, reducing potential conflicts under conditions of scarce water resources. The overall goal of this study is to improve the seasonal forecasting of hydrological droughts over the entire Alpine region at a spatial resolution (~1 km) that matches the information needs of local water agencies, e.g., resolving headwaters and small valleys. In this study we present progress on the following key objectives:

  • Improving the estimation of distributed model (Wflow_sbm) parameters by finding the optimal transfer function from geophysical attributes to model parameters and upscaling the information to model resolution.
  • Combining physical-hydrological knowledge with data-driven (ML/DL) techniques to improve accuracy and computational performance without compromising interpretability.
  • Integrating EO-based hydrological variables, such as streamflow, surface soil moisture, actual evapotranspiration, and snow water equivalent, with the aim of regularizing the calibration/training and tackling the problem of model parameter equifinality.

Our work is part of the InterTwin project that aims at developing a multi-domain Digital Twin blueprint architecture and implementation platform. We build on the technological solutions developed in InterTwin (e.g. openEO, CWL and STAC) and fully embrace its inspiring principles of open science, reproducibility, and interoperability of data and methods.

How to cite: Ferrario, I. F., Castelli, M., Hussein, A. M., Liaqat, U. M., Weerts, A., and Jacob, A.: Seasonal forecasts of hydrological droughts over the Alps: advancing hybrid modelling applications, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-6378, 2024.

EGU24-6656 | ECS | Orals | HS3.5

Exploring Catchment Regionalization through the Eyes of HydroLSTM 

Luis De La Fuente, Hoshin Gupta, and Laura Condon

Regionalization is an issue that hydrologists have been working on for decades. It is used, for example, when we transfer parameters from one calibrated model to another, or when we identify similarities between gauged and ungauged catchments. However, there is still no unified method that can successfully transfer parameters and identify similarities between different regions while accounting for differences in meteorological forcing, catchment attributes, and hydrological responses.

Machine learning (ML) has shown promising generalization across temporal and spatial scales in streamflow prediction. This suggests that ML models have learned useful regionalization relationships that we could extract. This study explores how the HydroLSTM representation, a modification of the traditional Long Short-Term Memory architecture, can learn meaningful relationships between meteorological forcing and catchment attributes. One promising feature of the HydroLSTM representation is that the learned patterns can generate different hydrological responses across the US. These findings indicate that we can learn more about regionalization by studying ML models.

How to cite: De La Fuente, L., Gupta, H., and Condon, L.: Exploring Catchment Regionalization through the Eyes of HydroLSTM, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-6656, 2024.

EGU24-6965 | ECS | Posters on site | HS3.5

A Machine Learning Based Snow Cover Parameterization for Common Land Model (CoLM)

Han Zhang, Lu Li, and Yongjiu Dai

Accurate representation of snow cover fraction (SCF) is vital for terrestrial simulation, as it significantly affects surface albedo and land surface radiation. In land models, SCF is parameterized using snow water equivalent and snow depth. This study introduces a novel machine learning-based parameterization, which incorporates the LightGBM regression algorithm and additional input features: surface air temperature, humidity, leaf area index, and the standard deviation of topography. The regression model is trained with input features from the Common Land Model (CoLM) simulations and labels from the Moderate Resolution Imaging Spectroradiometer (MODIS) observations on a daily scale. Offline verification indicates significant improvements for the new scheme over multiple traditional parameterizations.

Moreover, this machine learning-based parameterization has been coupled online with the CoLM using the Message Passing Interface (MPI). In online simulations, it substantially outperforms the widely used Niu and Yang (2007) scheme, improving the root mean square errors and temporal correlations of SCF on 80% of global grids. Additionally, the associated land surface temperature and hydrological processes also benefit from the improved estimation of SCF. The new scheme shows good portability as well, demonstrating similar enhancements when used directly in a global 1° simulation, even though it was trained at a 0.1° resolution.
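For reference, the Niu and Yang (2007) baseline scheme parameterizes SCF from snow depth and snow density via a hyperbolic tangent. The sketch below uses commonly quoted default constants (ground roughness length, fresh-snow density, melting-season exponent), which may differ from the exact CoLM configuration:

```python
import math

def scf_niu_yang(h_sno, rho_sno, z0g=0.01, rho_new=100.0, m=1.6):
    """Snow cover fraction after Niu & Yang (2007):
    fsno = tanh( h_sno / (2.5 * z0g * (rho_sno / rho_new) ** m) )
    h_sno: snow depth [m]; rho_sno: snow density [kg m-3];
    z0g: ground roughness length [m]; rho_new: fresh-snow density [kg m-3]."""
    return math.tanh(h_sno / (2.5 * z0g * (rho_sno / rho_new) ** m))

# Fresh, low-density snow covers a grid cell quickly; aged, dense snow of
# the same depth yields a much lower SCF:
print(round(scf_niu_yang(0.05, 100.0), 3))
print(round(scf_niu_yang(0.05, 300.0), 3))
```

The ML scheme described above replaces this fixed functional form with a learned mapping that also sees temperature, humidity, leaf area index, and subgrid topography.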

How to cite: Zhang, H., Li, L., and Dai, Y.: A Machine Learning Based Snow Cover Parameterization for Common Land Model (CoLM), EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-6965, 2024.

Land-atmosphere coupling (LAC) involves a variety of interactions between the land surface and the atmospheric boundary layer that are critical to understanding hydrological partitioning and cycling. As climate change continues to affect these interactions, identifying the specific drivers of LAC variability has become increasingly important. However, due to the complexity of the coupling mechanism, a quantitative understanding of the potential drivers is still lacking. Recently, deep learning has been considered an effective approach to capture nonlinear relationships within data, which provides a useful window into complex climatic processes. In this study, we will explore LAC variability under climate change and its potential drivers by using Convolutional Long Short-term Memory (ConvLSTM) together with explainable AI techniques for attribution analysis. Specifically, the variability of LAC, defined here as a two-legged index, is used as the modeling target, and variables representing meteorological forcing, land use, irrigation, soil properties, gross primary production, ecosystem respiration, and net ecosystem exchange are the inputs. Our analysis covers global land at a spatial resolution of 0.1° × 0.1° and a daily time step over the period 1979–2019. Overall, the study demonstrates how interpretable machine learning can help us understand the complex dynamics of LAC under changing climatic conditions. We expect the results to facilitate the understanding of terrestrial hydroclimate interactions and hopefully provide multiple lines of evidence to support future water management.

How to cite: Huang, F., Shangguan, W., and Jiang, S.: Identifying potential drivers of land-atmosphere coupling variation under climate change by explainable artificial intelligence, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-7202, 2024.

EGU24-7950 | ECS | Posters on site | HS3.5

Improving streamflow prediction across China by hydrological modelling together with machine learning 

Jiao Wang and Yongqiang Zhang

Predicting streamflow is key for water resource planning, flood and drought risk assessment, and pollution mitigation at regional, national, and global scales. There is a long-standing history of developing physically based or conceptual catchment rainfall-runoff models, which have been continuously refined over time to include more physical processes and enhance their spatial resolution. On the other hand, machine learning methods, particularly neural networks, have demonstrated exceptional accuracy and extrapolation capabilities in time-series prediction. Both approaches exhibit their strengths and limitations. This leads to a research question: how can model complexity and physical interpretability be balanced effectively while maintaining predictive accuracy? This study aims to effectively combine a conceptual hydrological model, HBV, with machine learning (Transformer, Long Short-Term Memory (LSTM)) using a differentiable modeling framework, tailored to predicting streamflow under diverse climatic and geographical conditions across China. By utilizing the Transformer to optimize and replace certain parameterization processes in the HBV model, a deep integration of neural networks and the HBV model is achieved. This integration not only captures the non-linear relationships that traditional hydrological models struggle to express, but also maintains the physical interpretability of the model. Preliminary application results show that the proposed framework outperforms the traditional HBV model and a pure LSTM model in streamflow prediction across 68 catchments in China. Based on the test results from different catchments, we have adjusted and optimized the model structure and parameters to better adapt to the unique hydrological processes of each catchment. The application of self-attention mechanisms and a differentiable programming framework significantly enhances the model's ability to capture spatiotemporal dynamics. It is likely that the proposed framework can be widely used for streamflow prediction elsewhere.

How to cite: Wang, J. and Zhang, Y.: Improving streamflow prediction across China by hydrological modelling together with machine learning, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-7950, 2024.

EGU24-9319 | ECS | Posters on site | HS3.5

Developing hybrid distributed models for hydrological simulation and climate change assessment in large alpine basins 

Bu Li, Ting Sun, Fuqiang Tian, and Guangheng Ni

Large alpine basins on the Tibetan Plateau (TP) provide abundant water resources crucial for hydropower generation, irrigation, and daily life. In recent decades, the TP has been significantly affected by climate change, making an understanding of the runoff response to climate change essential for water resources management. However, limited knowledge of specific alpine hydrological processes has constrained the accuracy of hydrological models and heightened uncertainties in climate change assessments. Recently, hybrid hydrological models have come to the forefront, synergizing the exceptional learning capacity of deep learning (DL) with the rigorous adherence to hydrological knowledge of process-based models. These models exhibit considerable promise for achieving precision in hydrological simulations and conducting climate change assessments. However, a notable limitation of existing hybrid models lies in their failure to incorporate spatial information and describe alpine hydrological processes, which restricts their applicability to hydrological modeling and climate change assessment in large alpine basins. To address this issue, we develop a set of hybrid distributed hydrological models by employing a distributed process-based model as the backbone and utilizing embedded neural networks (ENNs) to parameterize and replace different internal modules. The proposed models are tested on three large alpine basins on the Tibetan Plateau. Results are compared to those obtained from hybrid lumped models, a state-of-the-art distributed hydrological model, and DL models. A climate perturbation method is further used to evaluate the alpine basins' runoff response to climate change. Results indicate that the proposed hybrid hydrological models perform well in predicting runoff in large alpine basins. The optimal hybrid model, with Nash-Sutcliffe efficiency coefficients (NSEs) higher than 0.87, shows comparable performance to state-of-the-art DL models. The hybrid distributed model also exhibits remarkable capability in simulating hydrological processes at ungauged sites within the basin, markedly surpassing traditional distributed models. Besides, runoff exhibits an amplification effect in response to precipitation changes, with a 10% precipitation change resulting in a 15–20% runoff change in large alpine basins. An increase in temperature enhances evaporation capacity, changes the redistribution of rainfall and snowfall and the timing of snowmelt, and leads to a decrease in total runoff and a reduction in the intra-annual variability of runoff. Overall, this study provides a high-performance tool enriched with explicit hydrological knowledge for hydrological prediction and improves our understanding of runoff's response to climate change in large alpine basins on the TP.
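The reported amplification can be summarized as a precipitation elasticity of runoff, i.e., the relative runoff change per unit relative precipitation change. A trivial sketch with illustrative numbers (the values below are hypothetical, chosen to fall inside the reported range):

```python
def precip_elasticity(q_base, q_perturbed, p_change=0.10):
    """Relative runoff change per unit relative precipitation change."""
    return ((q_perturbed - q_base) / q_base) / p_change

# A 10% precipitation increase producing a 17% runoff increase
# (within the reported 15-20% range) implies an elasticity of 1.7:
print(round(precip_elasticity(100.0, 117.0), 2))
```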

How to cite: Li, B., Sun, T., Tian, F., and Ni, G.: Developing hybrid distributed models for hydrological simulation and climate change assessment in large alpine basins, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-9319, 2024.

In facing the challenges of limited observational streamflow data and climate change, accurate streamflow prediction and flood management in large-scale catchments become essential. This study introduces a time-lag informed deep learning framework to enhance streamflow simulation and flood forecasting. Using the Dulong-Irrawaddy River Basin (DIRB), a less-explored transboundary basin shared by Myanmar, China, and India, as a case study, we identified peak flow lag days and relative flow scale. Integrating these with historical flow data, we developed an optimal model. The framework, informed by data from the upstream Hkamti sub-basin, significantly outperformed a standard LSTM, achieving a Kling-Gupta efficiency (KGE) of 0.891 and a Nash-Sutcliffe efficiency coefficient (NSE) of 0.904. Notably, the H_PFL model provides a valuable 15-day lead time for flood forecasting, enhancing emergency response preparations. The transfer learning model, incorporating meteorological inputs and catchment features, achieved an average NSE of 0.872 for streamflow prediction, surpassing the process-based model MIKE SHE's 0.655. We further analyzed the sensitivities of the deep learning model and the process-based model to changes in meteorological inputs using different methods. Deep learning models exhibit complex sensitivities to these inputs, more accurately capturing non-linear relationships among multiple variables than the process-based model. Integrated Gradients (IG) analysis further demonstrates the deep learning model's ability to discern spatial heterogeneity between upstream and downstream sub-basins and its adeptness in characterizing different flow regimes. This study underscores the potential of deep learning for enhancing the understanding of hydrological processes in large-scale catchments and highlights its value for water resource management in transboundary basins under data scarcity.
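Integrated Gradients attributes a model's prediction to its inputs by accumulating gradients along a straight-line path from a baseline to the actual input. A minimal, framework-free sketch of the method, with a hypothetical toy model standing in for the trained network and finite differences standing in for backprop:

```python
def model(x):
    # Hypothetical stand-in for a trained network: nonlinear in x[0],
    # linear in x[1], and it ignores x[2] entirely.
    return x[0] ** 2 + 2.0 * x[1]

def grad(f, x, i, eps=1e-5):
    """Central finite-difference gradient (stands in for backprop)."""
    xp = list(x); xp[i] += eps
    xm = list(x); xm[i] -= eps
    return (f(xp) - f(xm)) / (2 * eps)

def integrated_gradients(f, x, baseline, steps=100):
    """Riemann-sum approximation of IG (Sundararajan et al., 2017)."""
    attrs = []
    for i in range(len(x)):
        total = 0.0
        for k in range(1, steps + 1):
            alpha = k / steps
            point = [b + alpha * (xi - b) for xi, b in zip(x, baseline)]
            total += grad(f, point, i)
        attrs.append((x[i] - baseline[i]) * total / steps)
    return attrs

x, base = [1.0, 1.0, 1.0], [0.0, 0.0, 0.0]
attrs = integrated_gradients(model, x, base)
print([round(a, 3) for a in attrs])    # x0 and x1 receive credit, x2 none
# Completeness holds approximately: sum of attributions ~ f(x) - f(baseline)
print(round(sum(attrs), 2), model(x) - model(base))
```

The completeness property is what makes IG attributions comparable across inputs, e.g. between upstream and downstream sub-basin forcings.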

How to cite: Ma, K. and He, D.: Streamflow Prediction and Flood Forecasting with Time-Lag Informed Deep Learning framework in Large Transboundary Catchments, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-9980, 2024.

EGU24-11159 | ECS | Orals | HS3.5

Uncovering the impact of hydrological connectivity on nitrate transport at the catchment scale using explainable AI 

Felipe Saavedra, Noemi Vergopolan, Andreas Musolff, Ralf Merz, Carolin Winter, and Larisa Tarasova

Nitrate contamination of water bodies is a major concern worldwide, as it poses a risk of eutrophication and biodiversity loss. Nitrate travels from agricultural land to streams through different hydrological pathways, which are abstrusely activated under different hydrological conditions. Certainly, hydrological conditions can alter the connection between different parts of the catchment and streams, in many cases independent of the discharge levels, leading to modifications in transport dynamics, retention, and nitrate removal rates in the catchment. While enhanced nitrate transport can be linked to high levels of hydrological connectivity, little is known about the effects of the spatial patterns of hydrological connectivity on the transport of nutrients at the catchment scale.

In this study, we combined daily stream nitrate concentration and discharge data at the outlet of 15 predominantly agricultural catchments in the United States (191–16,000 km2 area, 3500 km2 median area, and 77% median agriculture coverage) with soil moisture data from SMAP-Hydroblocks (Vergopolan et al., 2021). SMAP-Hydroblocks is a hyperresolution soil moisture dataset for the top 5 cm of the soil column at 30-m spatial resolution and 2–3 day revisit time (2015–2019), derived through a combination of satellite data, land-surface and radiative transfer modeling, machine learning, and in-situ observations.

We configured a deep learning model for each catchment, driven by 2D soil moisture fields and 1D discharge time series, to evaluate the impact of streamflow magnitude and spatial patterns of soil moisture on streamflow nitrate concentration. The model setup comprises two parallel branches. The first branch incorporates a Long Short-term Memory (LSTM) model, the current state-of-the-art for time-series data modeling, utilizing daily discharge as input data. The second branch contains a Convolutional LSTM network (ConvLSTM) that incorporates daily soil moisture series, the fraction of agriculture of each pixel, and the height above the nearest drainage as a measurement of structural hydrological connectivity. Finally, a fully connected neural network combines the outputs of the two branches to predict the time series of nitrate concentration at the catchment outlet.

Preliminary results indicate that the model performs satisfactorily in one-third of the catchments, with Nash-Sutcliffe efficiency (NSE) values above 0.3 for the test period (the final 25% of the time series), achieved without tuning the hyperparameters. The model failed to simulate nitrate concentrations (resulting in negative NSE values) typically in larger catchments. Using these simulations and explainable AI, we will quantify the importance of the different inputs; in particular, we test the relative importance of soil moisture for simulating nitrate concentrations. While the literature shows that most of the predictive power for nitrate comes from streamflow rates, we show how soil moisture fields add value to the prediction and to the understanding of hydrologic connectivity. Finally, we will fine-tune the model for each catchment and include more predictors to enhance the reliability of the model simulations.

How to cite: Saavedra, F., Vergopolan, N., Musolff, A., Merz, R., Winter, C., and Tarasova, L.: Uncovering the impact of hydrological connectivity on nitrate transport at the catchment scale using explainable AI, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-11159, 2024.

EGU24-11778 | ECS | Orals | HS3.5

How much data is needed for hydrological modeling?  

Bjarte Beil-Myhre, Bernt Viggo Matheussen, and Rajeev Shrestha

Hydrological modeling has undergone a transformative decade, primarily catalyzed by the groundbreaking data-driven approach of Kratzert et al. (2018) utilizing LSTM networks (Hochreiter & Schmidhuber, 1997). These networks leverage extensive datasets and intricate model structures, outperforming traditional hydrological models, albeit at the cost of being computationally intensive during training. This prompts a critical inquiry into the volume and complexity of data required to construct a dependable and resilient hydrological model.

In this study, we employ a hybrid model that amalgamates the strengths of classical hydrological models with the data-driven approach. These modified models are derived from the LSTM models developed by F. Kratzert and team, in conjunction with classical hydrological models such as the Statkraft Hydrology Forecasting Toolbox (SHyFT) from Statkraft and the Distributed Regression Hydrological Model (DRM) by Matheussen at Å Energi. The models were applied to sixty-five catchments in southern Norway, each characterized by diverse features and data records. Our analysis assesses the performance of these models under various scenarios of data availability, considering factors such as:

- Varying numbers of catchments selected based on size or location.
- The duration of the data records utilized for model calibration.
- Specific catchment characteristics and outputs from classical models employed as inputs 
(e.g., area, latitude, longitude, or additional variables).

Preliminary findings indicate that model inputs can be significantly stripped down without compromising model performance. With a limited set of catchment characteristics, the performance approaches that of the model with all characteristics, mitigating added uncertainty and model complexity. Additionally, increasing the length of data records enhances model performance, albeit with diminishing returns. Furthermore, our study reveals that augmenting catchments in the model does not necessarily yield a commensurate improvement in overall model performance. These insights contribute to refining our understanding of the interplay between data, model complexity, and performance in hydrological modeling.

The novelty of this research is that the hybrid models can be applied in a relatively small area, with few catchments and a limited number of climate stations and catchment characteristics compared to the CAMELS setup used by Kratzert et al., and still achieve improved results.

How to cite: Beil-Myhre, B., Matheussen, B. V., and Shrestha, R.: How much data is needed for hydrological modeling?, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-11778, 2024.

EGU24-12068 | ECS | Orals | HS3.5

Hybrid Neural Hydrology: Integrating Physical and Machine Learning Models for Enhanced Predictions in Ungauged Basins 

Rajeev Shrestha, Bjarte Beil-Myhre, and Bernt Viggo Matheussen

Accurate prediction of streamflow in ungauged basins is a fundamental challenge in hydrology. The lack of hydrological observations and the inherent complexities of ungauged regions hinder accurate predictions, posing significant hurdles for water resource management and forecasting. Over time, efforts have been made to tackle this predicament, primarily using physical hydrological models. However, these models are limited by their reliance on site-specific data and their difficulty in capturing complex nonlinear relationships. Recent work by Kratzert et al. (2018) suggests that nonlinear regression models such as LSTM neural networks (Hochreiter & Schmidhuber, 1997) may outperform traditional physically based models. The authors demonstrate the application of LSTM models to ungauged prediction problems, noting that information about physical processes might not have been fully utilized in the modeling setup.

In response to these challenges, this research introduces a novel Hybrid Neural Hydrology (HNH) approach that fuses the strengths of physical hydrological models, such as the Statkraft Hydrology Forecasting Toolbox (SHyFT), developed at Statkraft, and the Distributed Regression Hydrological Model (DRM), developed by Matheussen at Å Energi, with a machine learning model, specifically Neural Hydrology, developed by F. Kratzert and team. By combining the information and structural insights of physically based models with the flexibility and adaptability of machine learning models, HNH seeks to leverage the complementary attributes of these methodologies. The combination is achieved by fusing the uncalibrated physical model with an LSTM-based model. This hybridization aims to enhance the model's adaptability and learning capabilities, leveraging available information from various sources to improve predictions in ungauged areas. Furthermore, this research investigates the impact of clustering catchments based on area to improve model performance.

The data used in this research includes dynamic variables such as precipitation, air temperature, wind speed, relative humidity, and observed streamflow obtained from sources such as the internal database at Å Energi, The Norwegian Water Resources and Energy Directorate (NVE), The Norwegian Meteorological Institute (MET), ECMWF (ERA5) and static attributes such as catchment size, mean elevation, forest fraction, lake fraction and reservoir fraction obtained from CORINE Land Cover and Høydedata (

This study presents HNH as a novel approach that synergistically integrates the structural insights of physical models with the adaptability of machine learning. Preliminary findings indicate promising outcomes from testing in 65 catchments in southern Norway. This suggests that information about physical processes and clustering catchments based on their similarities significantly improves the prediction quality in ungauged regions. This discovery underscores the potential of using hybrid models and clustering techniques to enhance the performance of predictive models in ungauged basins.

How to cite: Shrestha, R., Beil-Myhre, B., and Matheussen, B. V.: Hybrid Neural Hydrology: Integrating Physical and Machine Learning Models for Enhanced Predictions in Ungauged Basins, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-12068, 2024.

EGU24-12574 | ECS | Orals | HS3.5 | Highlight

Analyzing the performance and interpretability of hybrid hydrological models 

Eduardo Acuna, Ralf Loritz, Manuel Alvarez, Frederik Kratzert, Daniel Klotz, Martin Gauch, Nicole Bauerle, and Uwe Ehret

Hydrological hybrid models have been proposed as an option to combine the enhanced performance of deep learning methods with the interpretability of process-based models. Among the various hybrid methods available, the dynamic parameterization of conceptual models using LSTM networks has shown high potential. 
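The dynamic parameterization described here can be pictured with a toy example: a conceptual model (a single linear reservoir, for illustration) whose parameter is supplied anew at every timestep by a network. The stub below stands in for the LSTM; none of these names come from the cited work.

```python
# Hedged sketch of dynamic parameterization: a linear reservoir whose
# recession coefficient k_t is provided per timestep by an external network
# (here replaced by a hand-written series standing in for the LSTM).

def bucket_step(storage, precip, k):
    """One timestep of a linear reservoir: discharge = k * storage."""
    storage += precip
    discharge = k * storage
    return storage - discharge, discharge

def run_bucket(precip_series, k_series, storage=0.0):
    """Run the bucket forward with a time-varying parameter series."""
    out = []
    for p, k in zip(precip_series, k_series):
        storage, q = bucket_step(storage, p, k)
        out.append(q)
    return out

# In a hybrid model, k_series would be the LSTM's output at each timestep.
q = run_bucket([10.0, 0.0, 0.0], [0.1, 0.2, 0.3])
```

Because the conceptual layer keeps explicit states such as storage, those states can later be compared against external observations, which is exactly the kind of check the experiments below perform.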

In this contribution, we extend our previous related work (Acuna Espinoza et al., 2023) by asking two questions: How well can hybrid models predict untrained variables, and how well do they generalize? We address the first question by comparing the internal states of the model against external data, specifically soil moisture obtained from ERA5-Land for 60 basins in Great Britain. We show that the process-based layer can reproduce the soil moisture dynamics with a correlation of 0.83, indicating a good ability of this type of model to predict untrained variables. Moreover, we compare this method against existing alternatives for extracting non-target variables from purely data-driven methods (Lees et al., 2022), and discuss the differences in philosophy, performance, and implementation. We then address the second question by evaluating the capacity of such models to predict extreme events. Following the procedure proposed by Frame et al. (2022), we train the hybrid models on low-flow regimes and test them in high-flow situations to evaluate their generalization capacity, comparing the results against purely data-driven methods. Both experiments use large-sample data from the CAMELS-US and CAMELS-GB datasets.

With these new experiments, we contribute to answering the question of whether hybrid models give an actual advantage over purely data-driven techniques or not.


Acuna Espinoza, E., Loritz, R., Alvarez Chaves, M., Bäuerle, N., and Ehret, U.: To bucket or not to bucket? Analyzing the performance and interpretability of hybrid hydrological models with dynamic parameterization, EGUsphere, 1–22, 2023.

Frame, J. M., Kratzert, F., Klotz, D., Gauch, M., Shalev, G., Gilon, O., Qualls, L. M., Gupta, H. V., and Nearing, G. S.: Deep learning rainfall–runoff predictions of extreme events, Hydrology and Earth System Sciences, 26, 3377–3392, 2022.

Lees, T., Reece, S., Kratzert, F., Klotz, D., Gauch, M., De Bruijn, J., Kumar Sahu, R., Greve, P., Slater, L., and Dadson, S. J.: Hydrological concept formation inside long short-term memory (LSTM) networks, Hydrology and Earth System Sciences, 26, 3079–3101, 2022.

How to cite: Acuna, E., Loritz, R., Alvarez, M., Kratzert, F., Klotz, D., Gauch, M., Bauerle, N., and Ehret, U.: Analyzing the performance and interpretability of hybrid hydrological models, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-12574, 2024.

EGU24-12981 | ECS | Orals | HS3.5

Using Temporal Fusion Transformer (TFT) to enhance sub-seasonal drought predictions in the European Alps 

Annie Yuan-Yuan Chang, Konrad Bogner, Maria-Helena Ramos, Shaun Harrigan, Daniela I.V. Domeisen, and Massimiliano Zappa

In recent years, the European Alpine space has witnessed unprecedented low-flow conditions and drought events, affecting various economic sectors reliant on sufficient water availability, including hydropower production, navigation and transportation, agriculture, and tourism. As a result, there is an increasing need for decision-makers to have early warnings tailored to local low-flow conditions.

The EU Copernicus Emergency Management Service (CEMS) European Flood Awareness System (EFAS) has been instrumental in providing flood risk assessments across Europe with up to 15 days of lead time since 2012. Expanding its capabilities, the EFAS also generates long-range hydrological outlooks from sub-seasonal to seasonal horizons. Despite its original flood-centric design, previous investigations have revealed EFAS’s potential for simulating low-flow events. Building upon this finding, this study aims to leverage EFAS's anticipation capability to enhance the predictability of drought events in Alpine catchments, while providing support to trans-national operational services.

In this study, we integrate the 46-day extended-range EFAS forecasts into a hybrid setup for 106 catchments in the European Alps. Many studies have demonstrated the capacity of Long Short-Term Memory (LSTM) networks to produce skillful hydrological forecasts at various time scales. Here we employ the Temporal Fusion Transformer (TFT), a deep learning algorithm that combines aspects of LSTM networks with the Transformer architecture. The Transformer's attention mechanisms can focus on relevant time steps across longer sequences, enabling the TFT to capture both local temporal patterns and global dependencies. The role of the TFT is to improve the accuracy of low-flow predictions and to help us understand their spatio-temporal evolution. In addition to EFAS data, we incorporate features such as European weather regime data, streamflow climatology, and hydropower proxies, and we consider catchment characteristics including glacier coverage and lake proximity. The TFT's various attention mechanisms make it a more explainable algorithm than LSTMs, helping us identify the driving factors behind forecast skill. Our evaluation uses EFAS re-forecast data as the benchmark and measures the reliability of ensemble forecasts using metrics such as the Continuous Ranked Probability Skill Score (CRPSS).
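The CRPSS evaluation mentioned above can be sketched with the standard empirical CRPS estimator for an ensemble forecast; the skill score then compares the model's CRPS against that of the benchmark (here, EFAS re-forecasts). This is a generic illustration, not the authors' evaluation code.

```python
# Hedged sketch of ensemble verification with CRPS and CRPSS.

def crps_ensemble(members, obs):
    """Empirical CRPS for an ensemble: E|X - obs| - 0.5 * E|X - X'|.
    Lower is better; 0 means a perfect deterministic forecast."""
    n = len(members)
    term1 = sum(abs(m - obs) for m in members) / n
    term2 = sum(abs(a - b) for a in members for b in members) / (n * n)
    return term1 - 0.5 * term2

def crpss(crps_model, crps_reference):
    """Skill relative to a benchmark: 1 is perfect, 0 matches the
    benchmark, negative values are worse than the benchmark."""
    return 1.0 - crps_model / crps_reference
```

In practice both scores would be averaged over many forecast dates and lead times before forming the skill score.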

Preliminary results show that a hybrid approach using the TFT algorithm can reduce the flashiness of EFAS during drought periods in some catchments, thereby improving drought predictability. Our findings will contribute to evaluating the potential of these forecasts for providing valuable information for skillful early warnings and assist in informing regional and local water resource management efforts in their decision-making.

How to cite: Chang, A. Y.-Y., Bogner, K., Ramos, M.-H., Harrigan, S., Domeisen, D. I. V., and Zappa, M.: Using Temporal Fusion Transformer (TFT) to enhance sub-seasonal drought predictions in the European Alps, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-12981, 2024.

EGU24-13417 | ECS | Orals | HS3.5

Evaluating physics-based representations of hydrological systems through hybrid models and information theory 

Manuel Álvarez Chaves, Eduardo Acuña Espinoza, Uwe Ehret, and Anneli Guthke

Hydrological models play a crucial role in understanding and predicting streamflow. Recently, hybrid models, which combine physical principles with data-driven approaches, have emerged as promising tools for extracting insights into system functioning and achieving predictive skill beyond that of traditional models.

However, the study by Acuña Espinoza et al. (2023) raised the question of whether the flexible data-driven component in a hybrid model might "overwrite" the interpretability of its physics-based counterpart. Using the example of conceptual hydrological models with dynamic parameters tuned by LSTM networks, they showed that even when the physics-based representation of the hydrological system is deliberately chosen to be nonsensical, the addition of the flexible data-driven component can still yield a well-performing hybrid model. This compensatory behavior highlights the need for a thorough evaluation of physics-based representations in hybrid hydrological models: hybrid models should be inspected carefully to understand why and how they predict (so well).

In this work, we provide a method to support this inspection: we objectively assess and quantify the contribution of the data-driven component to the overall hybrid model performance. Using information theory and the UNITE toolbox, we measure the entropy of the (hidden) state-space in which the data-driven component of the hybrid model moves. High entropy in this setting means that the LSTM is doing a lot of "compensatory work", and hence points to an inadequate representation of the hydrological system in the physics-based component of the hybrid model. By comparing this measure across a set of alternative hybrid models with different physics-based representations, the considered representations can be ranked by their degree of realism. This is very helpful for model evaluation and improvement, as well as for system understanding.
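The intuition behind the entropy measure can be illustrated with a simple histogram (binned) Shannon entropy of one hidden-state dimension. The UNITE toolbox provides its own, more sophisticated estimators; the function below is a generic stand-in, not its API.

```python
# Illustrative sketch only: binned Shannon entropy of a hidden-state series.
# A constant (inactive) hidden state gives 0 bits; a state that ranges
# widely and uniformly gives high entropy ("compensatory work").
import math

def binned_entropy(values, n_bins=10):
    """Shannon entropy (bits) of values discretized into equal-width bins."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins or 1.0  # guard against a constant series
    counts = [0] * n_bins
    for v in values:
        idx = min(int((v - lo) / width), n_bins - 1)
        counts[idx] += 1
    n = len(values)
    return -sum(c / n * math.log2(c / n) for c in counts if c)
```

Comparing this quantity across hybrid variants that differ only in their physics-based component would then indicate which representation leaves the least work to the LSTM.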

To illustrate our findings, we present examples from a synthetic case study where a true model does exist. Subsequently, we validate our approach in the context of regional predictions using CAMELS-GB data. This analysis highlights the importance of using diverse representations within hybrid models to ensure the pursuit of "the right answers for the right reasons". Ultimately, our work seeks to contribute to the advancement of hybrid modeling strategies that yield reliable and physically reasonable insights into hydrological systems.


Acuña Espinoza, E., Loritz, R., Álvarez Chaves, M., Bäuerle, N., and Ehret, U.: To bucket or not to bucket? Analyzing the performance and interpretability of hybrid hydrological models with dynamic parameterization, EGUsphere, 1–22, 2023.

How to cite: Álvarez Chaves, M., Acuña Espinoza, E., Ehret, U., and Guthke, A.: Evaluating physics-based representations of hydrological systems through hybrid models and information theory, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-13417, 2024.