Presentation type:
ESSI – Earth & Space Science Informatics

EGU26-5340 | ECS | Posters virtual | VPS21

A Multi-Objective Cost Minimization Framework for Managed Aquifer Recharge Integrating Pareto Optimization and Least-Cost Path Analysis 

Rahma Fri, Andrea Scozzari, Souad Haida, Malika Kili, Jamal Chao, Abdelaziz Mridekh, and Bouabid El Mansouri

In arid and semi-arid regions, pressure on groundwater resources has reached critical levels. Long-term over-pumping has depleted many aquifers, and climate change is intensifying this process. Rising temperatures increase evaporation from rivers and reservoirs, reducing the amount of surface water available for infiltration and natural recharge. Under these conditions, the use of surface water during periods of availability and its storage underground represents a key mechanism of managed aquifer recharge, effectively avoiding evaporation losses.

In this study, a practical framework is developed and tested to identify feasible ways to transfer accumulated surface water toward stressed aquifers. Rather than relying on complex ranking approaches, the locations of existing water infrastructure, specifically wells and traditional khettara systems, are used as reference points. These features indicate where aquifers are accessible and provide realistic spatial anchors for planning recharge at the regional scale.

The method combines satellite imagery to map surface water, geographic information systems (GIS) to identify cost-effective transfer pathways across the landscape, and multi-objective optimization to evaluate trade-offs between competing objectives. Feasibility is assessed through a cost function that accounts for terrain slope, elevation differences, transfer distance, pumping energy requirements, infrastructure costs, and potential water treatment needs.
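
A least-cost path over such a composite cost surface can be sketched in a few lines. The weights, layer names, and cell coordinates below are illustrative assumptions, not the study's actual cost function:

```python
# Illustrative least-cost path over a synthetic cost raster.
# Weights and endpoints are hypothetical, not the study's parameters.
import numpy as np
from skimage import graph

rng = np.random.default_rng(0)
slope = rng.random((100, 100))          # terrain slope, normalized 0-1
lift = rng.random((100, 100))           # elevation gain along flow, normalized
distance_penalty = np.ones((100, 100))  # uniform per-cell traversal cost

# Weighted cost surface (weights are illustrative assumptions).
w_slope, w_lift, w_dist = 0.4, 0.4, 0.2
cost = w_slope * slope + w_lift * lift + w_dist * distance_penalty + 1e-6

source = (5, 5)    # surface-water cell (hypothetical)
well = (90, 85)    # target well / khettara cell (hypothetical)
path, total_cost = graph.route_through_array(
    cost, source, well, fully_connected=True, geometric=True)
print(f"path length: {len(path)} cells, accumulated cost: {total_cost:.2f}")
```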

The approach is applied to the Draa Oued Noun Basin in southern Morocco, a region strongly affected by water scarcity, high evaporation rates, and declining groundwater levels. Several surface water sources are examined, and feasible conveyance routes toward aquifers supplying key wells and khettara systems are identified.

The results show substantial variations in cost between water sources. Available water volume, transfer distance, and especially elevation lift emerge as the main cost drivers. Trade-off analysis helps identify the most cost-effective projects under limited budgets. The results also highlight opportunities for cost reduction: where gravity-driven transfer is possible, costs are significantly lower, and where pumping is required, solar energy offers a viable option for reducing long-term operational expenses.

Overall, this work provides a spatially explicit and realistic basis for planning artificial groundwater recharge, while respecting economic constraints and supporting sustainable groundwater management in highly water-stressed regions.

How to cite: Fri, R., Scozzari, A., Haida, S., Kili, M., Chao, J., Mridekh, A., and El Mansouri, B.: A Multi-Objective Cost Minimization Framework for Managed Aquifer Recharge Integrating Pareto Optimization and Least-Cost Path Analysis, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-5340, https://doi.org/10.5194/egusphere-egu26-5340, 2026.

EGU26-11154 | ECS | Posters virtual | VPS21

Choosing an I/O approach for Earth system models: lessons learned from a modular I/O server for MESSy 

Aleksandar Mitic, Patrick Jöckel, Astrid Kerkweg, Kerstin Hartung, Bastian Kern, and Moritz Hanke

Modern Earth system models increasingly hit I/O limits—not only in performance, but also in reproducibility, maintainability, and developer productivity. As data volumes and workflows evolve, tightly coupled, file-centric I/O approaches can become hard to scale and hard to extend.

We present the design and lessons learned from introducing an asynchronous, modular I/O server concept in the Modular Earth Submodel System (MESSy). I/O operations were decoupled from the Fortran-based scientific core and implemented as separate Python services, with communication between the two components handled by the Yet Another Coupler (YAC) library. This architecture was chosen to improve flexibility and long-term maintainability, while enabling heterogeneous workflows and evolving storage backends.
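
The decoupling idea can be illustrated schematically. The sketch below uses a plain multiprocessing queue in place of YAC, so it shows only the producer/consumer pattern, not the actual MESSy/YAC interface:

```python
# Schematic illustration of decoupling I/O from the model loop: the model
# process hands fields to a separate writer process. Generic sketch of the
# concept only; MESSy uses YAC for the actual Fortran-Python exchange.
import multiprocessing as mp
import numpy as np

def io_server(queue):
    """Consume (name, step, array) tuples and 'write' them asynchronously."""
    while True:
        item = queue.get()
        if item is None:          # shutdown sentinel
            break
        name, step, field = item
        # A real server would write NetCDF/Zarr here; we just report.
        print(f"wrote {name} step {step}, mean={field.mean():.3f}")

if __name__ == "__main__":
    q = mp.Queue(maxsize=8)       # bounded queue = back-pressure on the model
    server = mp.Process(target=io_server, args=(q,))
    server.start()
    for step in range(3):         # stand-in for the model time loop
        q.put(("temperature", step, np.random.rand(16, 16)))
    q.put(None)
    server.join()
```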

Using MESSy as a case study, we discuss practical decision criteria for selecting an I/O concept in large models (e.g., scaling behavior, accessibility for developers, testing and CI strategies, and reproducibility).  We conclude with lessons learned from bridging Fortran and Python communities and from lowering entry barriers for user-developers in a large modeling system.

How to cite: Mitic, A., Jöckel, P., Kerkweg, A., Hartung, K., Kern, B., and Hanke, M.: Choosing an I/O approach for Earth system models: lessons learned from a modular I/O server for MESSy, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-11154, https://doi.org/10.5194/egusphere-egu26-11154, 2026.

EGU26-13270 | Posters virtual | VPS21

Integrating Participatory Perception-Mapping Data and Stochastic Image Analysis for Urban Landscape Assessment 

Stavroula Kopelia, Nikos Tepetidis, Julia Nerantzia Tzortzi, G.-Fivos Sargentis, and Romanos Ioannidis

Modern digital technologies and geoinformatics have experienced rapid growth, offering powerful tools to bridge the gap between scientific communities and society in landscape assessment and mapping. This research details the application of a crowdsourcing scheme that utilizes a dedicated mobile application to facilitate direct public participation in quantifying perceptions of urban landscapes and architecture. Initially developed as an educational tool, the methodology has been tested by university students across Italy, Greece, and France, providing a foundational phase for assessing landscape quality and urban typologies. Building upon these educational pilot studies, the work explores the evolution of this methodology into a broader, multicultural citizen science initiative designed to improve the quality and quantity of available landscape perception data.

A significant technical advancement in this research involves the integration of automated image analysis to process the novel data generated by participants from any location. The photographic material was examined using stochastic image analysis based on climacograms, in which images are treated as two-dimensional grayscale intensity fields and analyzed across multiple spatial scales. The method enables the comparison of image patterns based on the visual complexity of the uploaded photographs. A primary challenge addressed was the algorithm's performance when processing real-world, non-curated smartphone images. The analysis included an initial assessment of how the methodology handles environmental noise, such as sky, trees, and unconventional capture angles, which are inherent to bottom-up crowdsourcing schemes.
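
A minimal version of the climacogram computation, block-averaging a synthetic grayscale array at several scales and recording the variance of the block means (illustrative only, not the authors' implementation):

```python
# Minimal climacogram sketch: variance of block-averaged grayscale
# intensities across spatial scales (synthetic stand-in for a photo).
import numpy as np

def climacogram(img, scales):
    """Return the variance of block means for each block size in `scales`."""
    out = []
    for k in scales:
        h, w = (img.shape[0] // k) * k, (img.shape[1] // k) * k
        blocks = img[:h, :w].reshape(h // k, k, w // k, k).mean(axis=(1, 3))
        out.append(blocks.var())
    return np.array(out)

img = np.random.rand(256, 256)          # stand-in for a grayscale photograph
scales = [1, 2, 4, 8, 16, 32]
print(dict(zip(scales, climacogram(img, scales).round(5))))
```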

The early results indicate that the method can reveal group-level tendencies associated with differing architectural characteristics, particularly in relation to visual complexity, while not supporting reliable classification at the level of individual images. In detail, the findings indicate a trend towards two categorizations: firstly, modernist-type movements, characterized by minimal elements, and secondly, eclectic or decorative movements, which exhibited higher measured complexity; however, this behaviour was not observed universally across all analyzed movements. The stochastic analysis also indicated theoretical overlaps between certain movements, such as Postmodernism and Eclecticism, based on shared decorative patterns. While the results highlight that environmental factors can influence the analysis of individual photographs, the method utilized presents potential for distinguishing movement trends with logical consistency even from unfiltered data.

Scientifically, this yield of quantitative data sets the groundwork for improved research in the humanities and culture, showing a strong correlation with established landscape quality indices. Socially, the project provides a scalable model for participatory mapping that fosters critical thinking about urban quality, creating new conditions for communication between universities and the broader public. Overall, the presented work reports on the early-stage results of this methodological exploration and aims to evaluate the combined use of participatory mobile data collection and exploratory image-based analysis for landscape and architectural studies, while identifying key challenges related to data quality, interpretation, and future methodological refinement.

How to cite: Kopelia, S., Tepetidis, N., Tzortzi, J. N., Sargentis, G.-F., and Ioannidis, R.: Integrating Participatory Perception-Mapping Data and Stochastic Image Analysis for Urban Landscape Assessment, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-13270, https://doi.org/10.5194/egusphere-egu26-13270, 2026.

EGU26-13783 | Posters virtual | VPS21

Monitoring Land Cover Dynamics in Bahr Qarun District, Egypt, via Remote Sensing Data  

Abdelrahman Elsehsah, Abdelazim Negm, Eid Ashour, and Mohammed Elsahabi

Accurate monitoring of land cover is essential for sustainable environmental management and urban planning in arid regions. However, rapid changes in land use often make it difficult to distinguish between different surface types, such as urban areas and bare soil, using standard satellite data alone. This research examines land-use changes in the Bahr Qarun district of Fayoum, Egypt, during 2019, 2021, and 2023. The study used Sentinel-2 and Landsat OLI 8 satellite images taken each April to ensure data consistency. We applied the Maximum Likelihood (ML) method to classify Sentinel-2 images, using 30 training samples for each land category to guide the process. The results achieved a Kappa coefficient above 75%, indicating a reliable level of accuracy. We measured vegetation using the Normalized Difference Vegetation Index (NDVI) and urban areas using the Normalized Difference Built-up Index (NDBI). A comparative analysis revealed that NDVI results were closely aligned with those obtained from supervised classification, reflecting its strong capability in accurately identifying vegetated areas. In contrast, NDBI exhibited a tendency to overestimate urban extent, primarily due to spectral confusion between built-up surfaces and bare soil within individual pixels. The study concludes that NDVI is an effective tool for mapping the green cover in this area.
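
For reference, both indices are simple band ratios; the sketch below assumes the common Sentinel-2 band assignments (B8/B4 for NDVI, B11/B8 for NDBI), which may differ from the study's exact band handling:

```python
# Standard NDVI and NDBI band ratios on synthetic reflectance values.
import numpy as np

def ndvi(nir, red):
    """(NIR - Red) / (NIR + Red); B8 and B4 for Sentinel-2."""
    return (nir - red) / (nir + red + 1e-9)

def ndbi(swir, nir):
    """(SWIR - NIR) / (SWIR + NIR); B11 and B8 for Sentinel-2."""
    return (swir - nir) / (swir + nir + 1e-9)

nir = np.array([0.40, 0.50])
red = np.array([0.10, 0.20])
swir = np.array([0.30, 0.45])
print(ndvi(nir, red), ndbi(swir, nir))
```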

Keywords: Land Cover Change, Sentinel-2, Landsat OLI 8, Supervised Classification, Spectral Indices (NDVI & NDBI), Bahr Qarun, Egypt.

How to cite: Elsehsah, A., Negm, A., Ashour, E., and Elsahabi, M.: Monitoring Land Cover Dynamics in Bahr Qarun District, Egypt, via Remote Sensing Data , EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-13783, https://doi.org/10.5194/egusphere-egu26-13783, 2026.

EGU26-13852 | ECS | Posters virtual | VPS21

Monitoring Shallow Water Depths: A Review of Satellite-Derived Bathymetry Methods 

Mohamed H. Abdalla, Hassan Elhalawany, Saad M. Abdelrahman, Abdelazim Negm, and Andrea Scozzari

Satellite-Derived Bathymetry (SDB) offers a cost-effective alternative to traditional shipborne surveys for mapping large coastal areas. This technique utilizes optical remote sensing data from multispectral sensors to estimate water depth. The fundamental principle relies on the behavior of light as it travels through the water column; as depth increases, light intensity decreases due to absorption and scattering. Different wavelengths penetrate to varying degrees, with blue light reaching the greatest depths while red light is absorbed quickly. By analyzing these spectral features, researchers can calculate underwater topography. Currently, SDB techniques are categorized into two primary groups: physically based (analytical) models, which simulate light propagation without needing local in-situ depth calibration, and statistical (empirical) models, which correlate satellite data with known depth measurements from nautical charts, ship-based acoustic surveys or airborne LiDAR.

While both approaches provide extensive spatial coverage at a lower cost, they are generally limited to clear, shallow waters, typically reaching depths of less than 20 meters. Analytical models are highly accurate but complex and data-intensive, whereas empirical models are more accessible but rely heavily on the quality of reference data. Recent advancements in machine learning have significantly improved the automation and performance of these empirical methods. This study evaluates the core concepts, advantages, and limitations of various SDB approaches, with a focus on Landsat-8 and Sentinel-2 data. Furthermore, the research details essential processes for empirical model calibration, validation, and detecting model bias. The findings emphasize that rigorous evaluation and bias correction are critical for ensuring the reliability of depth data in diverse coastal environments.
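
The Stumpf log-ratio algorithm named in the keywords can be sketched with synthetic numbers; the reflectances and reference depths below are invented for illustration, and the calibration is an ordinary linear fit:

```python
# Sketch of the Stumpf log-ratio method: the blue/green reflectance log
# ratio is regressed against reference depths (all values are synthetic).
import numpy as np

n = 1000.0                                    # fixed scaling constant
blue = np.array([0.010, 0.009, 0.008, 0.007])
green = np.array([0.012, 0.009, 0.007, 0.005])
depth_ref = np.array([2.0, 5.0, 9.0, 14.0])   # e.g. from charts or LiDAR

ratio = np.log(n * blue) / np.log(n * green)
m1, m0 = np.polyfit(ratio, depth_ref, 1)      # linear calibration
depth_sdb = m1 * ratio + m0
rmse = np.sqrt(np.mean((depth_sdb - depth_ref) ** 2))
print(f"m1={m1:.2f}, m0={m0:.2f}, rmse={rmse:.2f} m")
```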

Keywords: Satellite-Derived Bathymetry, Remote Sensing, Empirical Models, Stumpf Algorithm, Coastal Waters, Model Bias Detection and Correction.

How to cite: Abdalla, M. H., Elhalawany, H., Abdelrahman, S. M., Negm, A., and Scozzari, A.: Monitoring Shallow Water Depths: A Review of Satellite-Derived Bathymetry Methods, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-13852, https://doi.org/10.5194/egusphere-egu26-13852, 2026.

EGU26-19784 | Posters virtual | VPS21

Operationalising Semantic Interoperability for Cross-domain Discovery with LUMIS 

Julien Homo, Christelle Pierkot, Kévin Darty, and Hakim Allem

Significant heterogeneity in metadata schemas, vocabularies, and ontologies hinders the discovery, reuse, and integration of European environmental data infrastructures across national and disciplinary boundaries. Recent initiatives have identified semantic interoperability as a vital enabler of FAIR data flows between infrastructures, paving the way for sophisticated, AI-driven, large-scale analyses.

Powered by OntoPortal technology, EarthPortal is a specialised catalogue of semantic resources (ontologies, thesauri and controlled vocabularies) for Earth and environmental sciences. It provides navigation, multi-ontology searching, mapping management, text annotation and recommendation services via web interfaces and REST APIs. These support data catalogues and repositories in an interoperable way.
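
A hypothetical annotation request modeled on the OntoPortal REST convention; the service root, parameter names, and API-key mechanism below are assumptions to be checked against the EarthPortal documentation:

```python
# Hypothetical text-annotation call in the OntoPortal REST style; endpoint
# path, parameters, and response fields are assumptions, not a documented
# EarthPortal contract.
import requests

BASE = "https://data.earthportal.eu"          # assumed service root
params = {
    "text": "Groundwater recharge in semi-arid karst aquifers",
    "apikey": "YOUR_API_KEY",                 # placeholder credential
}
resp = requests.get(f"{BASE}/annotator", params=params, timeout=30)
resp.raise_for_status()
for ann in resp.json():
    print(ann["annotatedClass"]["@id"])       # matched concept URIs
```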

EOSC LUMEN builds an interoperable discovery ecosystem across multiple domains (including Earth System Science, Social Sciences and Humanities, and Mathematics) to enable cross-platform search and meaningful reuse across communities. Rather than focusing only on metadata aggregation, LUMEN targets the practical enablers of interoperability that make resources discoverable and machine-actionable across infrastructures.

LUMIS (LUMEN Infrastructure for Semantics) is the shared semantic layer of LUMEN. It supports the end-to-end lifecycle of semantic artefacts (ontologies and controlled vocabularies, including SKOS resources) from scoping and requirements to implementation, publication and long-term maintenance. LUMIS focuses on governance, provenance, versioning and quality checks, while adopting an integration-first strategy: it connects and orchestrates established community tools (deployed services and/or API-based components) into coherent workflows, so that semantic resources can be created, aligned, validated and delivered in reusable forms for discovery platforms.

Integrating EarthPortal into LUMIS links a domain-specific semantic catalogue to a cross-domain discovery ecosystem. This enables repositories to annotate metadata using EarthPortal resources, while making use of LUMIS’s lifecycle-driven workflows and FAIR-aligned governance and quality checks.

In this presentation, we will demonstrate how integrating EarthPortal into the LUMIS platform supports more consistent semantic interoperability and FAIR-aligned practices across European Earth System Science infrastructures. We will showcase practical data workflows to enhance interdisciplinary research.

How to cite: Homo, J., Pierkot, C., Darty, K., and Allem, H.: Operationalising Semantic Interoperability for Cross-domain Discovery with LUMIS, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-19784, https://doi.org/10.5194/egusphere-egu26-19784, 2026.

EGU26-20391 | Posters virtual | VPS21

A Scalable, FAIR‑Aligned Data Lake Architecture for Earth System Modelling: From Heterogeneous Raw Archives to Curated, Metadata‑Rich, Analysis‑Ready Climate Data 

Bushra Amin, Jakob Zscheischler, Luis Samaniego, Jian Peng, Almudena García-García, and Toni Harzendorf

Modern Earth system research relies on integrating heterogeneous datasets such as reanalysis, satellite observations, in situ measurements, climate model ensembles, and reforecasts, yet these data are often stored in fragmented, inconsistent, and difficult-to-reuse forms. This limits reproducibility, slows modelling workflows, and constrains the development of operational digital twins for water and climate risk management.

This contribution presents a scalable, FAIR-aligned data lake architecture implemented on the EVE high-performance computing environment. The system transforms a large, unstructured source pool of more than two million files into a curated, duplication-free, metadata-rich repository designed for hydrological modelling, machine learning, and climate analytics. The architecture follows a four-stage lifecycle: raw, curated, database-ready, and ancillary GIS layers, reflecting data governance practices used by major climate centres.

A reproducible ingestion workflow classifies, deduplicates, and standardizes datasets from ERA5, ERA5-Land, MERRA-2, PRISM, E-OBS, GPM IMERG, CMIP6, ISIMIP3, ECMWF reforecasts, MODIS, CHIRPS, GFED, GRDC, GSIM, and other sources. A Python-based metadata extractor, built on CF convention standards, automatically captures variables, units, dimensions, spatial resolution, temporal coverage, coordinate reference systems, and checksums. Metadata are stored both as dataset-level JSON and as a global inventory, enabling transparent provenance tracking and rapid dataset discovery.
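
A minimal sketch of such an extractor, assuming xarray-readable NetCDF input; the JSON layout, attribute fallbacks, and the example file name are illustrative, not the project's actual schema:

```python
# CF-oriented metadata extraction sketch: capture dimensions, variables,
# units, CRS hints, and a checksum into a JSON record (illustrative schema).
import hashlib
import json
import xarray as xr

def extract_metadata(path):
    md5 = hashlib.md5(open(path, "rb").read()).hexdigest()
    ds = xr.open_dataset(path)
    meta = {
        "file": path,
        "checksum_md5": md5,
        "dimensions": {k: int(v) for k, v in ds.sizes.items()},
        "variables": {
            name: {
                "units": da.attrs.get("units", "unknown"),
                "standard_name": da.attrs.get("standard_name", ""),
                "shape": list(da.shape),
            }
            for name, da in ds.data_vars.items()
        },
        "crs": ds.attrs.get("crs", ds.attrs.get("spatial_ref", "unspecified")),
    }
    ds.close()
    return meta

print(json.dumps(extract_metadata("era5_t2m_2020.nc"), indent=2))  # hypothetical file
```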

The curated data hub is implemented under /data/db/earth_system and organized by scientific domain, temporal resolution, spatial extent, and processing stage. The system supports SLURM-based workflows, HPC-native processing, and cloud-optimized formats such as Zarr.

This work demonstrates how a single researcher can design and operationalize a modern, HPC-native data infrastructure that accelerates hydro-climate research and forms the backbone of an emerging Digital Hydro Twin. The approach is transferable to institutions seeking to modernize their data ecosystems and improve reproducibility in environmental modelling.

How to cite: Amin, B., Zscheischler, J., Samaniego, L., Peng, J., García-García, A., and Harzendorf, T.: A Scalable, FAIR‑Aligned Data Lake Architecture for Earth System Modelling: From Heterogeneous Raw Archives to Curated, Metadata‑Rich, Analysis‑Ready Climate Data, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-20391, https://doi.org/10.5194/egusphere-egu26-20391, 2026.

EGU26-21793 | Posters virtual | VPS21

Hydrological Modelling of the Upper Senegal River Basin Using SWAT: Assessing the Impact of Multi-Source Precipitation Data on Model Performance 

Sidi Mohamed Boussabou, Soufiane Taia, Bouabid El Mansouri, Aminetou Kebd, Abdallahi Mohamedou Idriss, Hamza Legsabi, and Lamia Erraioui

The Upper Senegal River Basin is a strategic water resource system supporting agriculture, hydropower generation, and essential ecosystem services in West Africa. However, a comprehensive understanding of its hydrological dynamics remains constrained by the limited availability of in situ hydroclimatic observations. This study applies the Soil and Water Assessment Tool (SWAT) to simulate hydrological processes in the basin, with a particular emphasis on the influence of precipitation data sources on model performance and uncertainty. Hydrological simulations were conducted at six representative gauging stations (Bakel, Kayes, Gourbassy, Oualia, Bafing Makana, and Daka Saidou) over the period 1983–2021, using a combination of ground-based observations, satellite precipitation products, and reanalysis datasets (ERA5, MERRA-2, PERSIANN, and CHIRPS). Model calibration demonstrated satisfactory performance, with Nash–Sutcliffe Efficiency (NSE) values reaching up to 0.74 at upstream stations, while reduced performance was observed downstream. Validation results showed a moderate decline in model efficiency, highlighting the sensitivity of SWAT outputs to precipitation inputs and data uncertainty. The comparative analysis of precipitation datasets reveals substantial variability in simulated streamflow and water balance components, underscoring the importance of precipitation data selection in data-scarce regions. These findings highlight the need for robust, multi-source hydroclimatic data integration to improve hydrological modelling reliability and support informed water resource management decisions.
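
For reference, the NSE metric reported above is straightforward to compute; the discharge series below are synthetic:

```python
# Nash-Sutcliffe Efficiency: 1.0 is a perfect fit, values below 0 mean the
# simulation performs worse than the observed mean (synthetic series).
import numpy as np

def nse(obs, sim):
    obs, sim = np.asarray(obs, float), np.asarray(sim, float)
    return 1.0 - np.sum((obs - sim) ** 2) / np.sum((obs - obs.mean()) ** 2)

obs = np.array([120.0, 340.0, 510.0, 220.0, 90.0])   # observed discharge, m3/s
sim = np.array([100.0, 360.0, 480.0, 250.0, 110.0])  # simulated discharge
print(f"NSE = {nse(obs, sim):.2f}")
```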

Keywords: Upper Senegal River, SWAT, Hydrological modelling, Precipitation uncertainty, Satellite rainfall, Reanalysis data.

How to cite: Boussabou, S. M., Taia, S., El Mansouri, B., Kebd, A., Mohamedou Idriss, A., Legsabi, H., and Erraioui, L.: Hydrological Modelling of the Upper Senegal River Basin Using SWAT: Assessing the Impact of Multi-Source Precipitation Data on Model Performance, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-21793, https://doi.org/10.5194/egusphere-egu26-21793, 2026.

EGU26-21965 | Posters virtual | VPS21

An EOSC Node Ireland Pilot Study: Bridging European and National e-Infrastructures for Reproducible Sentinel-2 Data Ingestion in Quarry Applications 

Flaithri Neff, Roberto Sabatino, Alfredo Arreba, and Jerry Sweeney

The establishment of the European Open Science Cloud (EOSC) places renewed emphasis on the role of national e-infrastructures in enabling standards-based, interoperable, and reusable research workflows in the EU. Within the context of Ireland’s EOSC Node, there is particular interest in demonstrating how European-scale open-data services can be ingested by national research clouds, transformed into analysis-ready assets, and made available for both open research and applied industry use-cases. Earth Observation (EO) provides a strong test case, given the volume and complexity of the data involved, and its growing role in scalable environments that support operational decision-making.

This pilot project, QuarryLink, presents a Phase-1 study focused on building a reproducible EO data ingestion workflow that connects the Copernicus Data Space Ecosystem with the HEAnet Research Cloud, operating on the SURF Research Cloud platform. Through a real-world quarry case-study in the Dublin region (Ireland), the work demonstrates how EOSC-aligned principles, including auditable machine-readable workflows, can be applied from the outset of the EO research process. We will demonstrate how precise spatial boundaries can be defined and validated; how modern OAuth-based authentication mechanisms can be integrated into research cloud workflows; and how Sentinel-2 Level-2A products can be programmatically discovered, retrieved, and prepared for downstream analysis using current Copernicus services.
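
The token-then-search pattern can be sketched as follows; the endpoints follow current Copernicus Data Space Ecosystem conventions but should be verified against the official documentation, and the credentials and area of interest are placeholders:

```python
# Sketch of the two ingestion steps: obtain an OAuth token, then discover
# Sentinel-2 items via STAC. Endpoints reflect current CDSE conventions
# and should be checked against the docs; credentials are placeholders.
import requests

TOKEN_URL = ("https://identity.dataspace.copernicus.eu/auth/realms/CDSE"
             "/protocol/openid-connect/token")
tok = requests.post(TOKEN_URL, data={
    "grant_type": "password", "client_id": "cdse-public",
    "username": "user@example.org", "password": "***",
}).json().get("access_token")
# The token is needed later for product downloads; STAC discovery is open.

search = requests.post(
    "https://catalogue.dataspace.copernicus.eu/stac/search",
    json={
        "collections": ["SENTINEL-2"],
        "bbox": [-6.4, 53.2, -6.1, 53.4],     # Dublin-area AOI (illustrative)
        "datetime": "2024-06-01T00:00:00Z/2024-06-30T23:59:59Z",
        "limit": 5,
    }).json()
for item in search.get("features", []):
    print(item["id"])
```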

By executing the ingestion workflow on the HEAnet Research Cloud using open-source geospatial tooling, the pilot aims to establish an analytics-ready foundation for working with Sentinel-2 data in a reproducible research cloud environment. The resulting data products are structured to support downstream analysis, with compute resources accessed dynamically through the HEAnet Research Cloud workspace as required. Building on this foundation, Phase 2 will focus on developing time-series analyses, EO data cubes, and derived environmental indicators to support both research-driven investigation and applied monitoring scenarios in European quarry environments.

More broadly, the pilot seeks to illustrate how EOSC-aligned integration across data ingestion and compute layers can support open research practices while enabling scalable, real-world EO-enabled industrial applications, providing a practical pathway for national EOSC Nodes to translate open data into shareable analytics and societal impact.

How to cite: Neff, F., Sabatino, R., Arreba, A., and Sweeney, J.: An EOSC Node Ireland Pilot Study: Bridging European and National e-Infrastructures for Reproducible Sentinel-2 Data Ingestion in Quarry Applications, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-21965, https://doi.org/10.5194/egusphere-egu26-21965, 2026.

EGU26-22084 | ECS | Posters virtual | VPS21

Monitoring Groundwater Quality and Improvement in the Kima Area, Aswan 

Marwa Khairy, Ahmed S. Nour-Eldeen, Hickmat Hossen, Ismail Abd-Elaty, and Abdelazim Negm

Groundwater in arid regions is highly sensitive to human activity, especially when untreated wastewater interacts with shallow aquifers. This study evaluates the hydrogeochemical response of the Kima aquifer in Aswan, Egypt, following the Kima Drain Covering Project. The research uses an integrated framework of field measurements, geospatial analysis, and multi-criteria decision-making. Groundwater samples from 2020 and 2025 were analyzed for eleven physicochemical parameters and six irrigation indices. Spatial interpolation through Inverse Distance Weighting (IDW) helped map temporal variations and identify contamination hotspots. To classify water suitability, the study standardized values according to WHO and Egyptian guidelines. The Analytical Hierarchy Process (AHP) was used to determine the importance of various drinking and irrigation indicators. Finally, a Weighted Linear Combination (WLC) generated composite Groundwater Quality Index (GWQI) maps. The results show a significant improvement in groundwater quality after the drain was covered. Levels of TDS, chloride, sulfate, sodium, and magnesium decreased substantially across the area. The ionic balance shifted toward a more favorable calcium-magnesium-bicarbonate facies. Irrigation indices also improved, with most parameters falling into safe or excellent ranges. The 2025 GWQI map reveals a transition from "good–permissible" to "excellent–safe" zones. This confirms that eliminating direct seepage from the drain had a positive environmental impact. This integrated AHP–GIS–IDW approach is an effective tool for monitoring groundwater changes. It provides a robust decision-support system for managing water resources in arid urban environments.
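
A compact sketch of the AHP step, deriving weights from a pairwise comparison matrix and checking the consistency ratio; the 3x3 matrix is illustrative, not the study's actual expert judgments:

```python
# AHP weight derivation: principal eigenvector of a pairwise comparison
# matrix plus the consistency-ratio check (matrix values are illustrative).
import numpy as np

A = np.array([[1.0, 3.0, 5.0],
              [1/3, 1.0, 2.0],
              [1/5, 1/2, 1.0]])          # 3 criteria on the Saaty 1-9 scale

eigvals, eigvecs = np.linalg.eig(A)
k = np.argmax(eigvals.real)
w = np.abs(eigvecs[:, k].real)
w /= w.sum()                              # priority weights

n = A.shape[0]
ci = (eigvals.real[k] - n) / (n - 1)      # consistency index
ri = {3: 0.58, 4: 0.90, 5: 1.12}[n]       # Saaty's random index
print(f"weights={w.round(3)}, CR={ci / ri:.3f} (acceptable if < 0.1)")
```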

How to cite: Khairy, M., S. Nour-Eldeen, A., Hossen, H., Abd-Elaty, I., and Negm, A.: Monitoring Groundwater Quality and Improvement in the Kima Area, Aswan, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-22084, https://doi.org/10.5194/egusphere-egu26-22084, 2026.

EGU26-3080 | ECS | Posters virtual | VPS22

Multibranch Adaptive Feature Fusion for Hyperspectral Image Classification 

Chen Li and Baoyu Du

Hyperspectral image (HSI) classification often struggles with feature interference across different scales and the inherent challenges of data imbalance and sample scarcity. While deep learning models have significantly advanced the field, traditional single-branch architectures often suffer from scale-related noise, where features from different receptive fields interfere with one another. To address this, we propose the Multibranch Adaptive Feature Fusion Network (MBAFFN). Our approach utilizes three parallel branches to independently extract scale-specific features, effectively decoupling the multiscale information to prevent interference. This architecture is enhanced by two specialized modules: Global Detail Attention (GDA) for capturing broad contextual dependencies and Distance Suppression Attention (DSA) for refining local pixel-level discrimination. Furthermore, a pixel-wise adaptive fusion mechanism is introduced to dynamically weigh and integrate these features, prioritizing the most relevant scales for final classification. The performance of MBAFFN was validated on four benchmark datasets: Indian Pines (IP), Pavia University (PU), Longkou (LK), and Hanchuan (HC). Compared to current state-of-the-art methods, our model improved Overall Accuracy (OA) by 0.91%, 1.71%, 0.86%, and 3.16% on the IP, PU, LK, and HC datasets, respectively. The significant improvement on the HC and PU datasets underscores the model’s robustness in scenarios with limited training samples and complex class distributions. These results, supported by detailed ablation studies, demonstrate that adaptive fusion and scale-specific branching are effective strategies for mitigating feature interference in hyperspectral analysis.
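
The pixel-wise adaptive fusion idea can be sketched in a few lines of PyTorch; this illustrates only the gating mechanism described above, with arbitrary layer sizes, and is not the authors' MBAFFN code:

```python
# Illustrative pixel-wise adaptive fusion of three branch feature maps:
# a 1x1 conv predicts per-pixel branch weights, normalized by softmax.
import torch
import torch.nn as nn

class AdaptiveFusion(nn.Module):
    def __init__(self, channels, branches=3):
        super().__init__()
        self.gate = nn.Conv2d(channels * branches, branches, kernel_size=1)

    def forward(self, feats):                          # list of (B, C, H, W)
        stacked = torch.cat(feats, dim=1)
        w = torch.softmax(self.gate(stacked), dim=1)   # (B, branches, H, W)
        return sum(w[:, i:i + 1] * f for i, f in enumerate(feats))

f1, f2, f3 = (torch.randn(2, 32, 16, 16) for _ in range(3))
fused = AdaptiveFusion(32)([f1, f2, f3])
print(fused.shape)   # torch.Size([2, 32, 16, 16])
```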

How to cite: Li, C. and Du, B.: Multibranch Adaptive Feature Fusion for Hyperspectral Image Classification, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-3080, https://doi.org/10.5194/egusphere-egu26-3080, 2026.

EGU26-3363 | ECS | Posters virtual | VPS22

In-situ Thermal Infrared Monitoring in an Urban Area: A Case Study of Micro-scale Thermal Transitions during Hot Weather Conditions in Athens, Greece. 

Odysseas Gkountaras, Chryssoula Georgakis, Thiseas Velissaridis, and Margarita Niki Assimakopoulos

Characterizing the thermal state of urban surfaces is fundamental for mitigating the impacts of the Surface Urban Heat Island (SUHI) effect. This study presents an intensive in-situ thermal infrared monitoring campaign in the high-density urban core of Athens, Greece. Utilizing a calibrated handheld TIR sensor (7.5–14 μm), surface temperatures were recorded across strategic locations in the center of Athens during hot weather conditions. The methodology emphasizes the critical role of material-specific parameterization, where thermographic data were post-processed to account for emissivity (ε) variations, ensuring high-fidelity surface temperature measurements.
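
A common single-band emissivity correction for close-range handheld TIR readings, based on the Stefan-Boltzmann law and neglecting the atmospheric path; whether this matches the campaign's exact post-processing is an assumption, and the inputs are illustrative:

```python
# Emissivity correction sketch: invert the measured radiance balance
# M_tot = eps*sigma*Ts^4 + (1-eps)*sigma*Trefl^4 for the surface
# temperature Ts (atmospheric path neglected; inputs illustrative).
SIGMA = 5.670374419e-8          # Stefan-Boltzmann constant, W m-2 K-4

def surface_temp(t_brightness_k, emissivity, t_reflected_k):
    m_tot = SIGMA * t_brightness_k ** 4
    m_surf = (m_tot - (1 - emissivity) * SIGMA * t_reflected_k ** 4) / emissivity
    return (m_surf / SIGMA) ** 0.25

# Asphalt-like surface: eps ~ 0.95, reflected ambient ~ 308 K (35 C)
print(f"{surface_temp(326.0, 0.95, 308.0) - 273.15:.1f} degC")
```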

Experimental results reveal extreme thermal stress, with maximum surface temperatures reaching 56.0°C on conventional paving materials, while the mean ambient air temperature was close to 35.0°C during peak solar hours (13:00–18:00 LT). Spatial analysis and visualization of the results were performed using QGIS, correlating thermal signatures with urban geometry, shading conditions, and vegetation density. The aim of this study was to highlight the significant cooling potential of specific urban materials and nature-based solutions.

How to cite: Gkountaras, O., Georgakis, C., Velissaridis, T., and Assimakopoulos, M. N.: In-situ Thermal Infrared Monitoring in an Urban Area: A Case Study of Micro-scale Thermal Transitions during Hot Weather Conditions in Athens, Greece., EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-3363, https://doi.org/10.5194/egusphere-egu26-3363, 2026.

EGU26-3619 | ECS | Posters virtual | VPS22

Democratizing landslide detection for vulnerable regions beyond resource-intensive foundation models 

Rodrigo Uribe-Ventura, Willem Viveen, Ferdinand Pineda-Ancco, and César Beltrán-Castañon

Landslides claim thousands of lives and cause billions in economic losses annually, with impacts disproportionately concentrated in developing regions across Asia, Africa, and Latin America. Paradoxically, the current trajectory of artificial intelligence in geohazard detection—characterized by billion-parameter foundation models requiring substantial computational infrastructure—risks widening, rather than closing, the gap between technological capability and operational deployment where it is needed most. We argue that this paradigm requires fundamental reconsideration, proposing domain adaptation on strategically curated geological datasets as a more equitable and effective path toward globally accessible landslide detection systems.

Foundation models like the Segment Anything Model (SAM), pre-trained on over one billion masks, demand computational resources—312 million parameters, 1,376 GFLOPs per inference, specialized GPU infrastructure—that remain inaccessible to disaster management agencies in resource-constrained regions. Beyond these practical constraints, we contend that the apparent generalization capabilities of such models reflect pattern coverage in training data rather than emergent understanding transferable to geological contexts. The SA-1B dataset, despite its scale, was not curated to systematically represent landslide morphological diversity, creating coverage gaps for rare failure types, unusual triggering mechanisms, and underrepresented terrain configurations precisely where robust detection is operationally critical.

Given these limitations, we propose that effective generalization for geological applications emerges not from architectural scale but from strategic coverage of domain-relevant pattern space. We developed and tested GeoNeXt, a lightweight architecture that exploits the hierarchical transferability of geological features through targeted domain adaptation. Low-level representations (edges, spectral gradients) transfer universally across sensors and terrain; mid-level patterns (drainage networks, slope morphology) require adaptation to local expressions; and high-level configurations (failure geometries, trigger signatures) demand targeted training. Our results showed that this approach outperformed SAM-based methods across three independent benchmarks while requiring 10× fewer parameters (32.2M versus 312.5M) and a 62% reduction in computational cost. Zero-shot transferability to geographically distinct test sites (74–78% F1 score) emerged from the training dataset's systematic morphological diversity rather than parameter count. Inference at 10.6 frames per second on standard hardware, versus 3.0 frames per second for foundation model alternatives, transforms theoretical capability into deployable technology for resource-constrained environments. These findings suggest that strategic domain adaptation, rather than architectural scale, offers the most viable path toward operational landslide detection in vulnerable regions.

How to cite: Uribe-Ventura, R., Viveen, W., Pineda-Ancco, F., and Beltrán-Castañon, C.: Democratizing landslide detection for vulnerable regions beyond resource-intensive foundation models, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-3619, https://doi.org/10.5194/egusphere-egu26-3619, 2026.

EGU26-6022 | ECS | Posters virtual | VPS22

Geo2Gmsh: A Scalable Workflow for Automated Mesh Generation of Geological Models Using Gmsh 

Harold Buitrago, Juan Contreras, and Florian Neumann

Numerical modeling is a fundamental tool for understanding physically driven processes in geosciences. In multiparametric settings, the Finite Element Method is widely used because it can accommodate irregular geometries and complex boundary conditions. However, this advantage critically depends on the quality of the computational mesh, which must faithfully represent geological features such as faults, stratigraphic interfaces, and wells. In practice, mesh generation remains a major bottleneck, requiring specialized expertise and significant manual effort. We present Geo2Gmsh, an automated, lightweight workflow built on Gmsh (Geuzaine & Remacle, 2009) that generates geological meshes directly from simple text‐based descriptions of topological elements, including surfaces, lines, and points. These elements correspond to geologically meaningful features, allowing users to define faults, horizons, wells, and domain boundaries in a transparent, reproducible, and solver‐independent way. The workflow is demonstrated using two contrasting case studies: (1) Ringvent, an active sill‐driven hydrothermal system in the Guaymas Basin, and (2) the Eastern Llanos Basin, a foreland basin in eastern Colombia. To evaluate solver compatibility, we solved the heat equation in SfePy (https://sfepy.org/doc-devel/index.html) using the Eastern Llanos Basin model as the computational domain. Although the simulation is illustrative and not calibrated to observations, it confirms that meshes produced by Geo2Gmsh can be readily incorporated into numerical solvers. By explicitly embedding wells, faults, and geological interfaces in the mesh, Geo2Gmsh enables boundary conditions to be applied directly to physically meaningful features and allows model outputs to be extracted along them, simplifying both model setup and post‐processing. Meshes can be exported in standard formats (e.g., VTK, MSH, and Exodus via meshio), ensuring broad interoperability. Overall, Geo2Gmsh provides a lightweight, scalable, and reproducible workflow that dramatically lowers the technical barrier to geological mesh generation. This contribution establishes a practical foundation for reproducible, open-source numerical modeling in geosciences, facilitating the integration of geological knowledge into high-fidelity computational simulations.
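
The embedding idea can be illustrated directly with the Gmsh Python API; the geometry below is a toy domain with one fault segment and one well point, not a Geo2Gmsh input file:

```python
# Toy example of embedding geological features with the Gmsh Python API:
# a fault trace and a well point are forced into a 2D domain mesh so that
# boundary conditions can target them directly (geometry is illustrative).
import gmsh

gmsh.initialize()
gmsh.model.add("demo_domain")
lc = 0.1
pts = [gmsh.model.geo.addPoint(x, y, 0, lc) for x, y in
       [(0, 0), (1, 0), (1, 1), (0, 1)]]
lines = [gmsh.model.geo.addLine(pts[i], pts[(i + 1) % 4]) for i in range(4)]
loop = gmsh.model.geo.addCurveLoop(lines)
surf = gmsh.model.geo.addPlaneSurface([loop])

# Internal features: a fault segment and a well location, locally refined.
fp1 = gmsh.model.geo.addPoint(0.2, 0.3, 0, lc / 4)
fp2 = gmsh.model.geo.addPoint(0.8, 0.7, 0, lc / 4)
fault = gmsh.model.geo.addLine(fp1, fp2)
well = gmsh.model.geo.addPoint(0.5, 0.2, 0, lc / 8)

gmsh.model.geo.synchronize()
gmsh.model.mesh.embed(1, [fault], 2, surf)   # fault edges become mesh edges
gmsh.model.mesh.embed(0, [well], 2, surf)    # well becomes a mesh node
gmsh.model.mesh.generate(2)
gmsh.write("demo_domain.msh")
gmsh.finalize()
```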

How to cite: Buitrago, H., Contreras, J., and Neumann, F.: Geo2Gmsh: A Scalable Workflow for Automated Mesh Generation of Geological Models Using Gmsh, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-6022, https://doi.org/10.5194/egusphere-egu26-6022, 2026.

EGU26-6232 | Posters virtual | VPS22

Application of advanced lossy compression in the NetCDF ecosystem for CONUS404 data 

Shaomeng Li, Allison Baker, and Lulin Xue

Many geoscientific datasets, such as those produced by climate and weather models, are stored in the NetCDF file format. These datasets are typically very large and often strain institutional data storage resources. While lossy compression methods for scientific data have become more studied and adopted in recent years, most advanced lossy approaches do not work easily and/or transparently with NetCDF files. For example, they may require a file format conversion or they may not work correctly with “missing values” or “fill values” that are often present in model outputs. While lossy quantization approaches such as BitRound and Granular BitRound have built-in support in NetCDF and are quite easy to use, such approaches are generally not able to reduce the data size as much as more advanced compressors (for a fixed error metric), like SPERR, ZFP, or SZ3.
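
As a point of comparison for those built-in approaches, quantization can be enabled directly from netCDF4-python; this minimal sketch assumes a recent netCDF4/netcdf-c build with quantization support, and the variable and data are synthetic:

```python
# Built-in NetCDF quantization via netCDF4-python (requires netCDF4 >= 1.6
# with netcdf-c >= 4.9); the variable name and values are synthetic.
import netCDF4
import numpy as np

with netCDF4.Dataset("quantized.nc", "w") as nc:
    nc.createDimension("x", 1000)
    v = nc.createVariable("t2m", "f4", ("x",),
                          zlib=True, complevel=4,
                          significant_digits=3,
                          quantize_mode="GranularBitRound")
    v[:] = 280.0 + np.random.rand(1000).astype("f4")
```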

We are particularly interested in reducing the data size of the CONUS404 dataset. CONUS404 is a unique, publicly available high-resolution hydro-climate dataset produced by Weather Research and Forecasting (WRF) Model simulations that cover the CONtiguous United States (CONUS) for 40 years at 4-km resolution (a collaboration between the NSF National Center for Atmospheric Research and the U.S. Geological Survey Water Mission Area).

Here, we investigate one advanced lossy compressor, SPERR [1], together with its plugin for NetCDF files, H5Z-SPERR [2], in a Python-based workflow to compress and analyze CONUS404 data. SPERR is attractive due to its support for quality control in terms of both maximum point-wise error (PWE) and peak signal-to-noise ratio (PSNR), enabling easy experimentation with storage-quality tradeoffs. Further, given a target quality metric, previous work has shown that SPERR likely produces the smallest compressed file size compared to other advanced compressors. It leverages the HDF5 dynamic plugin mechanism to enable users to stay in the NetCDF ecosystem with minimal to no change to existing analysis workflows, wherever a typical NetCDF file can be read. And, importantly for our work, the SPERR plugin supports efficient masking of “missing values,” which are common to climate and weather model output. The support for missing values enables compression of many variables that are not naturally handled by other advanced compressors that rely on HDF5 plugins. Further, because H5Z-SPERR directly handles missing values, they can be stored in a much more compact format (and are restored during decompression), further improving compression efficiency. (Note that built-in NetCDF quantization approaches can work with missing values.)

Our experimentation demonstrates the benefit of enabling advanced lossy (de)compression in the NetCDF ecosystem: adoption friction is kept at the minimum with little change to workflows, while storage requirements are greatly reduced.

[1] https://github.com/NCAR/SPERR

[2] https://github.com/NCAR/H5Z-SPERR

How to cite: Li, S., Baker, A., and Xue, L.: Application of advanced lossy compression in the NetCDF ecosystem for CONUS404 data, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-6232, https://doi.org/10.5194/egusphere-egu26-6232, 2026.

EGU26-7344 | Posters virtual | VPS22

Investigations into the Reaction of the Pangu ML Weather Model to Different Initial Conditions

H. Buttery

Investigations have been carried out into the initialization of the Pangu weather model, initializing it both with ERA5 data (on which it was trained) and with the Met Office’s Global UM model data. There are many consistent local biases at ground level between these two sets of initial conditions. The geographically local biases are not dissipated by the Pangu model with timestep but instead remain geographically fixed and gradually decrease with lead time. Whilst the Pangu model initiated with UM initial conditions remains further from the ERA5 truth than the ERA5-initiated Pangu model at all timesteps, it initially moves towards the ERA5 truth with timestep, as the geographically static differences in initiation decrease, before moving further away from the ERA5 truth as differences in large-scale systems begin to dominate.

Also investigated was the difference between the Pangu model 24-hour timesteps and 6-hour timesteps; it was found that the 6-hour timesteps were better able to reduce the geographically static initial differences than the 24-hour timesteps.

If time permits, a similar analysis will be made of the FastNet and GraphCast models.

How to cite: Buttery, H.: Investigations into the Reaction of the Pangu ML Weather Model to Different Initial Conditions, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-7344, https://doi.org/10.5194/egusphere-egu26-7344, 2026.

EGU26-11945 | Posters virtual | VPS22

SEPNET: a multi-task deep learning framework for SEP forecasting 

Yang Chen, Yian Yu, Lulu Zhao, Kathryn Whitman, Ward Manchester, and Tamas Gombosi

Solar phenomena such as flares, coronal mass ejections (CMEs), and solar energetic particles (SEPs) are actively monitored and assessed for space weather hazards. In recent years, machine learning has demonstrated considerable success in solar flare forecasting. Accurate SEP forecasting remains challenging in space weather monitoring due to the complexity of SEP event origins and propagation. We introduce SEPNET, an innovative multi-task neural network that integrates forecasting of solar flares and CME summary statistics into the SEP prediction model, leveraging their shared dependence on space-weather HMI active region patches (SHARP) magnetic field parameters. SEPNET incorporates long short-term memory and transformer architectures to capture contextual dependencies. The performance of SEPNET is evaluated on the SEPVAL SEP dataset and compared with classical machine learning methods and current state-of-the-art pre-eruptive SEP prediction models. The results show that SEPNET achieves higher detection rates and skill scores while being suitable for real-time space weather alert operations.
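
A schematic multi-task model in the spirit described above, combining an LSTM encoder, a transformer layer, and separate flare/CME/SEP heads; the layer sizes, input dimensions, and head definitions are arbitrary stand-ins, not the actual SEPNET configuration:

```python
# Multi-task sequence-model sketch over SHARP-like time series; all sizes
# are illustrative, not SEPNET's real architecture.
import torch
import torch.nn as nn

class MultiTaskSketch(nn.Module):
    def __init__(self, n_sharp=20, hidden=64):
        super().__init__()
        self.lstm = nn.LSTM(n_sharp, hidden, batch_first=True)
        self.attn = nn.TransformerEncoderLayer(d_model=hidden, nhead=4,
                                               batch_first=True)
        self.flare_head = nn.Linear(hidden, 1)   # flare probability
        self.cme_head = nn.Linear(hidden, 2)     # CME summary statistics
        self.sep_head = nn.Linear(hidden, 1)     # SEP event probability

    def forward(self, x):                        # x: (batch, time, n_sharp)
        h, _ = self.lstm(x)
        h = self.attn(h)[:, -1]                  # last time step
        return (torch.sigmoid(self.flare_head(h)), self.cme_head(h),
                torch.sigmoid(self.sep_head(h)))

out = MultiTaskSketch()(torch.randn(4, 24, 20))
print([o.shape for o in out])
```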

How to cite: Chen, Y., Yu, Y., Zhao, L., Whitman, K., Manchester, W., and Gombosi, T.: SEPNET: a multi-task deep learning framework for SEP forecasting, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-11945, https://doi.org/10.5194/egusphere-egu26-11945, 2026.

EGU26-13611 | ECS | Posters virtual | VPS22

Evaluating the combined potential of VSWIR and Thermal Infrared data for soil characterisation. 

Francesco Rossi, Raffaele Casa, Luca Marrone, Saham Mirzaei, Simone Pascucci, and Stefano Pignatti

Quantifying soil properties such as Soil Organic Carbon (SOC), texture, and Calcium Carbonate (CaCO3) is essential for assessing soil health and ensuring food security. While Visible, Near Infrared, and Short Wave Infrared (VSWIR) remote sensing is a standard operational tool, the Longwave Infrared (LWIR, 8-14 μm) offers complementary information on mineralogy and moisture that is not yet fully explored for this specific application. This study investigates the synergy between VSWIR and LWIR data that will be available from future hyperspectral satellite missions, among them the European Space Agency's Copernicus Expansion missions, which will add the Hyperspectral Imaging Mission for the Environment (CHIME) and the Land Surface Temperature Monitoring (LSTM) mission to the EO capacity, alongside NASA's Surface Biology and Geology (SBG and SBG-TIR) missions.

The research focuses on Jolanda di Savoia (Italy), an agricultural landscape resulting from land reclamation projects in the late 19th century. Ground truth data were collected during a field campaign on June 22, 2023, providing 59 topsoil samples further analysed for SOC, texture, and CaCO3. The field campaign was coincident with an airborne survey carried out with the LWIR Hyperspectral Thermal Emission Spectrometer (HyTES) sensor. HyTES captured data across 256 spectral bands from 7.5 to 11.5 μm, providing a pixel size of approximately 2.3 meters.

To evaluate the multi-frequency potential, we developed a workflow combining a soil composite from PRISMA (VSWIR) satellite time-series with simulated SBG-TIR (LWIR) data. The SBG-TIR simulation chain included as input a surface emissivity map derived from the airborne HyTES survey. To cover the wide LWIR spectral range (up to 12 µm), the emissivity spectrum was extended using an autoencoder neural network procedure trained on the ECOSTRESS Soil Spectral Library. Top-Of-Atmosphere (TOA) radiance was then simulated using the Radiative Transfer for the TIROS Operational Vertical Sounder (RTTOV-14) model, incorporating the optical depth and cloud/aerosol optical properties coefficients specific to SBG-TIR. Furthermore, these simulated data were atmospherically corrected to produce the target satellite emissivity products according to the TES algorithm.

Soil property prediction models were developed using supervised machine learning algorithms. We benchmarked two scenarios: 1) the proposed combined approach using PRISMA and the simulated SBG-TIR L2 emissivity product; and 2) a VSWIR-only approach using PRISMA. A quantitative assessment by 10-fold cross-validation using common literature metrics (R², RMSE, RPD) highlighted the benefits of the multi-sensor approach. For SOC retrieval, the standalone VSWIR (PRISMA) model yielded an R² of 0.55 (RPD = 1.5), while the synergistic integration of PRISMA with simulated SBG-TIR data improved the retrieval accuracy, reaching an R² of 0.65 and increasing the RPD to 1.69. This work indicates that, on the agricultural test site of Jolanda di Savoia, the combined use of the VSWIR and LWIR spectral ranges slightly improves the SOC retrieval. Further validation across diverse agricultural scenarios will be essential to test the real advantage of combining next-generation imaging spectroscopy missions.
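
The three reported metrics are easy to reproduce on synthetic numbers; RPD here is taken in its standard form, the standard deviation of the observations divided by the RMSE:

```python
# R^2, RMSE, and RPD on synthetic SOC values (illustrative only).
import numpy as np

obs = np.array([1.2, 1.8, 2.4, 3.1, 3.9])    # measured SOC, %
pred = np.array([1.4, 1.7, 2.6, 2.9, 3.6])   # model estimates

rmse = np.sqrt(np.mean((obs - pred) ** 2))
r2 = 1 - np.sum((obs - pred) ** 2) / np.sum((obs - obs.mean()) ** 2)
rpd = obs.std(ddof=1) / rmse                  # ratio of performance to deviation
print(f"R2={r2:.2f}, RMSE={rmse:.2f}, RPD={rpd:.2f}")
```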

How to cite: Rossi, F., Casa, R., Marrone, L., Mirzaei, S., Pascucci, S., and Pignatti, S.: Evaluating the combined potential of VSWIR and Thermal Infrared data for soil characterisation., EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-13611, https://doi.org/10.5194/egusphere-egu26-13611, 2026.

EGU26-16232 | Posters virtual | VPS22

Bias Correction of Numerical Weather Prediction Wind Fields in Southern Tamil Nadu Region Using Machine Learning Techniques

V. Pm and B. Chakravarthy

Accurate high-resolution wind field prediction is essential for wind resource assessment, renewable energy planning, and regional weather analysis. Although Numerical Weather Prediction (NWP) models such as the Weather Research and Forecasting (WRF) model provide physically consistent wind forecasts, their outputs often suffer from systematic biases arising from uncertainties in surface characteristics, simplified physical parameterizations, and resolution limitations. Furthermore, increasing model resolution to the kilometer scale significantly raises computational cost. To address these challenges, this study presents a machine learning–based framework for bias correction of WRF-simulated wind fields over the Southern Tamil Nadu region, with particular focus on the Muppandal wind farm area.

An extensive validation of WRF configurations was first performed using multiple physics scheme combinations and domain setups, evaluated against ERA5 reanalysis data. The optimal configuration was identified and used to generate three years (2023–2025) of wind simulations at 3 km × 3 km resolution. Significant biases were observed in the raw WRF outputs, motivating the application of an Artificial Neural Network (ANN) based bias correction approach. A Random Forest algorithm was employed for feature selection, followed by Principal Component Analysis (PCA) to reduce dimensionality while retaining 95% of the variance. A feedforward neural network with multiple hidden layers was trained to correct the U10 and V10 wind components, with the hyperbolic tangent activation function yielding the best performance. The bias-corrected wind fields exhibited substantial improvement in mean and extremes, achieving low error metrics and strong correlation with ERA5 data.

The results demonstrate that combining physically based NWP simulations with machine learning-driven bias correction provides an accurate and computationally efficient approach for generating high-resolution wind fields. This hybrid framework offers significant potential for wind energy assessment and localized meteorological applications in data-sparse regions.
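
The described chain can be sketched end to end with scikit-learn; the synthetic arrays stand in for the WRF predictors and ERA5 targets, and all sizes are illustrative:

```python
# Pipeline sketch matching the described steps: Random Forest feature
# ranking, PCA keeping 95% of the variance, and a tanh MLP correcting
# U10/V10 (synthetic data; sizes are illustrative).
import numpy as np
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestRegressor
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(1)
X = rng.normal(size=(5000, 30))                      # WRF-derived predictors
y = X[:, :2] * 0.8 + rng.normal(0, 0.1, (5000, 2))   # "ERA5" U10, V10

# 1) feature selection via Random Forest importances (keep top 15 of 30)
rf = RandomForestRegressor(n_estimators=50, random_state=0).fit(X, y)
top = np.argsort(rf.feature_importances_)[-15:]

# 2) PCA retaining 95% of the variance
Z = PCA(n_components=0.95).fit_transform(X[:, top])

# 3) feedforward net with tanh activation, two outputs (U10, V10)
mlp = MLPRegressor(hidden_layer_sizes=(64, 32), activation="tanh",
                   max_iter=300, random_state=0).fit(Z, y)
print(f"R^2 on training data: {mlp.score(Z, y):.3f}")
```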

How to cite: Pm, V. and Chakravarthy, B.: Bias Correction of Numerical Weather Prediction Wind Fields in Southern Tamil Nadu Region Using Machine Learning Techniques, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-16232, https://doi.org/10.5194/egusphere-egu26-16232, 2026.

EGU26-4129 | ECS | Posters virtual | VPS23

Rapid Turbulence Evolution Resulting from Stable Shear layer and Atmospheric Gravity Wave Interactions 

Abhiram Doddi, David Fritts, and Thomas Lund

Early laboratory experiments of shear flow by Thorpe (Thorpe, 2002) provided evidence of Kelvin-Helmholtz Instability (KHI) billow interactions either due to misaligned adjacent billow cores or varying phases along the adjacent billow axes. Similar evidence has been found in the observations of tropospheric clouds, airglow, and Polar Mesospheric Clouds (PMC) imagery data in the mesosphere. Initial high-resolution Direct Numerical Simulation (DNS) studies performed at a Reynolds number of 5000 (Fritts et al., 2021a, Fritts et al., 2021b) have demonstrated that misaligned KH billow cores exhibit strong and complex vortex interactions inducing ‘Tubes and Knots’ (T&K) structures (Thorpe, 2002). These T&K structures were observed to accelerate transition to small-scale turbulence in contrast to previously known notable transitional mechanisms such as secondary KHI and convective instabilities emerging in individual KH billows. Also, the KHI T&K dynamics evidently yield intense turbulence dissipation rates contrasting with those of secondary KHI and convective instabilities in billow cores.

More recent high-resolution imaging of OH airglow (Hecht et al., 2021) provides concrete evidence of KHI billows with wavelengths ranging between 7 and 10 km modulated by atmospheric Gravity Waves (GWs) with dominant horizontal wavelengths of ∼30 km, oriented orthogonal to the KH billow axes and propagating along the billow cores, which results in apparent T&K dynamics rapidly driving KH billow breakdown. Similar evidence has been found in recent PMC imaging. This is the central theme of the idealized DNS discussed in this talk.

We conducted DNS studies to demonstrate the turbulence energetics of KHI billow interactions when subject to modulation by monochromatic atmospheric gravity waves with small perturbation amplitudes and an intrinsic frequency of N/5 (where N is the background Brunt-Väisälä frequency). Preliminary analyses of our DNS results indicate that GW modes with modest amplitudes promote KHI billow misalignments, resulting in complex multi-scale T&K dynamics fixed at specific GW phases. An increase in the GW amplitude resulted in a noticeable reduction of KHI billow wavelengths, further promoting KH billow misalignments. The resulting turbulence is expected to consist of broader scale ranges of intense turbulence dissipation rate and diffusivity.

References
[Fritts et al., 2021a] Fritts, D. C., Wang, L., Lund, T. S., and Thorpe, S. A. (2021a). Multi-Scale Dynamics of Kelvin-Helmholtz Instabilities. Part 1: Secondary Instabilities and the Dynamics of Tubes and Knots, pages 1–27.

[Fritts et al., 2021b] Fritts, D. C., Wang, L., Thorpe, S. A., and Lund, T. S. (2021b). Multi-Scale Dynamics of Kelvin-Helmholtz Instabilities. Part 2: Energy Dissipation Rates, Evolutions, and Statistics, pages 1–39.

[Hecht et al., 2021] Hecht, J. H., Fritts, D. C., Gelinas, L. J., Rudy, R. J., Walterscheid, R. L., and Liu, A. Z. (2021). Kelvin-Helmholtz Billow Interactions and Instabilities in the Mesosphere Over the Andes Lidar Observatory: 1. Observations. Journal of Geophysical Research: Atmospheres, 126(1):e2020JD033414.

[Thorpe, 2002] Thorpe, S. A. (2002). The axial coherence of Kelvin–Helmholtz billows. Quarterly Journal of the Royal Meteorological Society, 128(583):1529–1542.

How to cite: Doddi, A., Fritts, D., and Lund, T.: Rapid Turbulence Evolution Resulting from Stable Shear layer and Atmospheric Gravity Wave Interactions, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-4129, https://doi.org/10.5194/egusphere-egu26-4129, 2026.

EGU26-4184 | ECS | Posters virtual | VPS23

A Multi-Criteria GIS Framework for Socio-Economic Drought Risk Assessment across India 

Arun kumar Beerkur and Hussain Palagiri

Socio-economic drought represents the stage at which water stress translates into tangible disruptions to livelihoods, infrastructure, and economic systems, often preceding severe physical water shortages. In India, pronounced climatic variability combined with extreme physiographic heterogeneity leads to strong spatial contrasts in socio-economic vulnerability to drought. Despite this, most drought assessments in the country remain dominated by hydro-meteorological indicators, with limited integration of socio-economic exposure, sensitivity, and adaptive capacity.

This study develops a spatially explicit socio-economic drought risk assessment framework for India by integrating multi-dimensional climatic, environmental, and socio-economic indicators within a Geographic Information System (GIS). Thirteen indicators capturing water availability, agricultural productivity, infrastructure, population pressure, economic activity, and social deprivation are compiled from multi-source datasets and harmonized to a common spatial resolution. The indicators include available soil water, agricultural yield, livestock density, road density, population density, biomass, electricity consumption, Gross Domestic Product (GDP), global surface water availability, digital elevation model, groundwater availability, land use/land cover, and relative deprivation. Indicator weights are objectively derived using the Analytic Hierarchy Process (AHP), with consistency of expert judgments ensured through the consistency ratio criterion (CR < 0.1). A GIS-based weighted overlay approach is then employed to generate a composite socio-economic drought risk index, which is classified into four risk categories to identify spatial patterns and hotspots.

The resulting risk map reveals pronounced regional disparities, highlighting drought-prone agrarian and socio-economically marginalized regions as areas of elevated risk. The proposed framework offers a transferable and scalable decision-support tool for integrating socio-economic dimensions into drought monitoring and preparedness. By explicitly linking water stress to livelihood and infrastructure vulnerability, the study provides actionable insights for risk-informed planning, targeted mitigation, and long-term drought resilience in India.
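
The overlay step can be sketched with a few arrays; three synthetic layers stand in for the thirteen indicators, and the weights are illustrative rather than the study's AHP results:

```python
# Weighted-overlay sketch: min-max normalize indicator rasters, combine
# with AHP-style weights, and slice the composite into four risk classes.
import numpy as np

rng = np.random.default_rng(42)
layers = [rng.random((50, 50)) for _ in range(3)]   # stand-ins for indicators
weights = np.array([0.5, 0.3, 0.2])                 # illustrative AHP weights

norm = [(L - L.min()) / (L.max() - L.min()) for L in layers]
risk = sum(w * L for w, L in zip(weights, norm))

# Quartile breaks give four risk categories (0 = lowest, 3 = highest).
classes = np.digitize(risk, np.quantile(risk, [0.25, 0.5, 0.75]))
print(np.bincount(classes.ravel()))   # pixels per risk class
```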

How to cite: Beerkur, A. K. and Palagiri, H.: A Multi-Criteria GIS Framework for Socio-Economic Drought Risk Assessment across India, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-4184, https://doi.org/10.5194/egusphere-egu26-4184, 2026.

EGU26-4200 | ECS | Posters virtual | VPS23

Performance of Tapered Submerged Vanes in Mitigating Local Scour Around Bridge Piers 

Karmishtha Karmishtha, Rajesh Kumar Behera, and Gopal Das Singhal

Scour, defined as the erosion or removal of sediment from around bridge piers due to flowing water, remains one of the primary causes of hydraulic structure failures worldwide. Local scour around bridge piers poses a serious threat to bridge stability, particularly during high-flow events, as the development of downflow, horseshoe vortices, and wake vortices at the pier base leads to intense sediment removal and foundation instability. To address this challenge, the present study investigates the hydrodynamic behaviour and scour reduction performance of tapered submerged vanes installed upstream of a cylindrical bridge pier as an effective countermeasure against local scour. A combined numerical and experimental approach was adopted to evaluate the influence of tapered submerged vanes on flow structure and scour characteristics. Numerical simulations were carried out using FLOW-3D Hydro to analyse the three-dimensional flow field around the pier–vane system under steady clear-water conditions. The simulations focused on assessing velocity distribution, near-bed shear stress, vortex dynamics, and secondary flow patterns generated by the tapered vanes. Particular attention was given to the formation of leading-edge vortices (LEVs) and their role in modifying erosive flow structures near the pier foundation. Based on the numerical insights, a series of physical model experiments were conducted in a laboratory flume to quantify the scour reduction achieved by the tapered vanes. The experiments aimed to optimize the longitudinal and transverse placement of the vanes relative to the pier. The vanes were installed at a fixed longitudinal distance upstream of the pier, while transverse spacing was systematically varied to examine its effect on sediment transport and scour depth. Bed elevation profiles and maximum scour depths were measured after equilibrium scour conditions were attained. The results demonstrate that tapered submerged vanes significantly alter the near-bed flow field by generating localized leading-edge vortices that effectively deflect high-energy flow away from the pier base. This flow redirection weakens the horseshoe vortex and reduces near-bed shear stress in the vicinity of the pier. Among the tested configurations, the vane arrangement with a longitudinal spacing of 1.5D and transverse spacing of 2D exhibited the best performance, resulting in a 56% reduction in maximum scour depth compared to the no-vane case. Additionally, localized sediment deposition was observed upstream and downstream of the pier, indicating favourable redistribution of sediment induced by the vane-generated secondary currents. By integrating numerical modelling with experimental validation, this study provides valuable insights into the flow mechanisms and optimal placement strategies of tapered submerged vanes. The findings highlight their potential as a practical, efficient, and sustainable solution for mitigating local scour around bridge piers in alluvial channels.

Keywords: Scour, Submerged Vane, Horseshoe Vortices, Wake Vortices, Leading-Edge Vortex (LEV)

How to cite: Karmishtha, K., Behera, R. K., and Singhal, G. D.: Performance of Tapered Submerged Vanes in Mitigating Local Scour Around Bridge Piers, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-4200, https://doi.org/10.5194/egusphere-egu26-4200, 2026.

EGU26-4951 | ECS | Posters virtual | VPS23

CFD-Based Comparative Analysis of Conventional and Modified Piano Key Weirs for Improved Discharge Efficiency 

Anil Kumar, Ellora Padhi, and Surendra Kumar Mishra

The Piano Key Weir (PKW) is recognized for its ability to pass large discharges over weirs of varying heights within a small footprint, making it a potential substitute for linear weirs where space is a constraint (Ouamane and Lempérière, 2006). Despite these advantages, the conventional geometries, the rectangular PKW (RPKW) and the trapezoidal PKW (TPKW), exhibit well-documented construction and operational shortfalls: flow separation at the inlet key, non-uniform discharge and velocity distributions along the crest, vortex shedding and formation at the key intersections, dead zones in the inlet and outlet keys, zones of intensified energy dissipation, and reduced versatility at high flows. Together, these shortfalls diminish discharge capacity. In response, the present study introduces the Modified Piano Key Weir (MPKW) and assesses its performance using 3D computational hydraulic modeling. The Volume of Fluid (VOF) method is used for free-surface tracking and the Reynolds-Averaged Navier–Stokes (RANS) equations for turbulence closure, characterizing pressure gradients, three-dimensional flow accelerations, and eddies. A systematic numerical investigation compared the discharge efficiency of the RPKW, TPKW, and MPKW across a range of steady inflow discharges: 0.030, 0.060, 0.090, 0.120, and 0.160 m³·s⁻¹. The MPKW demonstrated consistently superior discharge efficiency over both the RPKW and TPKW for all tested cases, without requiring an increase in structural footprint or crest length. The highest relative improvement was observed at 0.060 m³·s⁻¹, which was therefore selected as a representative discharge for in-depth flow diagnostics of vorticity structures, turbulent kinetic energy (TKE), and energy dissipation, to better understand the flow mechanisms underlying weir efficiency. The MPKW design, with refined geometry, improved inlet–outlet design, rounded key transitions, and adjustable wall skew, mitigated flow separation at the key inlets and reduced large-scale vortex formation at the key junctions. The modified sidewall skewed the internal recirculation; as a consequence, TKE in the stagnation zones decreased and recirculation shifted along the weir crests, suppressing turbulent structures. While the breakdown of turbulence produced localized energy dissipation, stabilization of the approach flow improved because the rotational energy of large eddies was converted, with little energy loss, into rapidly decaying eddies that do not sustain or recycle energy. Consequently, less energy was concentrated in the vortex cells at the key junctions, losses due to flow contraction were reduced, and nappe cohesion over the crests improved. Relative to the other configurations, the MPKW exhibited lower turbulence and vorticity at the junctions, more effective utilization of the crest, and improved pressure recovery. The results confirm the MPKW as a hydraulically efficient and economically feasible solution for both new installations and retrofit applications under head or footprint constraints.
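As a companion to the efficiency comparison, discharge efficiency for such weirs is often reported through a discharge coefficient. The sketch below uses one common rectangular-weir convention, Q = Cd·L·√(2g)·H^1.5 (PKW studies use slightly differing forms); the crest length and upstream heads are entirely hypothetical, since the abstract does not report them.

```python
import numpy as np

g = 9.81  # gravitational acceleration, m/s^2

def discharge_coefficient(Q, L, H):
    """Cd from Q = Cd * L * sqrt(2g) * H^1.5 (one common weir convention)."""
    return Q / (L * np.sqrt(2 * g) * H**1.5)

# Tested discharges from the study; crest length and heads are assumed
Q = np.array([0.030, 0.060, 0.090, 0.120, 0.160])   # m^3/s
L = 1.0                                             # m (assumed crest length)
H = np.array([0.035, 0.055, 0.072, 0.088, 0.105])   # m (assumed heads)

for q, h in zip(Q, H):
    print(f"Q={q:.3f} m3/s -> Cd={discharge_coefficient(q, L, h):.3f}")
```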

How to cite: Kumar, A., Padhi, E., and Mishra, S. K.: CFD-Based Comparative Analysis of Conventional and Modified Piano Key Weirs for Improved Discharge Efficiency, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-4951, https://doi.org/10.5194/egusphere-egu26-4951, 2026.

EGU26-5098 | ECS | Posters virtual | VPS23

A Geospatial and AHP-Based Approach for Delineating Groundwater Potential Zones in Vulnerable Groundwater Systems 

Pavithra Belluti Nanjundagowda and Vamsi Krishna Vema

Groundwater is the second largest reserve of fresh water and an important resource supporting agricultural, industrial, and domestic water supplies. It has faced unsustainable pressure from human activities in various forms over the years, a situation aggravated by climate change, which intensifies groundwater stress through variable precipitation and reduced recharge. This highlights the importance of assessing aquifer potential for sustainable groundwater management. The analysis was carried out in the Manjra and Maner sub-basins of the Godavari river basin, India, where data-driven assessments remain limited. The present research employs a Multi-Criteria Decision Analysis (MCDA) framework that integrates Geographic Information Systems (GIS) and the Analytical Hierarchy Process (AHP) to delineate groundwater potential zones (GWPZ) in the Manjra and Maner sub-basins. In a GIS environment, eight thematic layers, including geology, land use/land cover, lineament density, drainage density, rainfall, soil, and slope, were examined. These factors were weighted using AHP and combined through weighted overlay analysis. The final GWPZ map was validated using Receiver Operating Characteristic (ROC) analysis with the Area Under the Curve (AUC), together with groundwater inventory data. Five groundwater potential classes were identified for the study area: very low, low, moderate, high, and very high. The predominance of moderate (45%) and high potential (28%) zones suggests that groundwater availability is generally fair to good. The high and very high potential zones indicate priority locations for sustainable groundwater development and management.
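The ROC-AUC validation step can be reproduced with standard tooling. The snippet below is a sketch with synthetic data: the `score` values stand in for the composite GWPZ index sampled at groundwater inventory locations, with label 1 for productive sites.

```python
import numpy as np
from sklearn.metrics import roc_auc_score, roc_curve

# Hypothetical validation sample: GWPZ index at inventory points
rng = np.random.default_rng(42)
score = np.concatenate([rng.normal(0.65, 0.15, 80),   # productive sites
                        rng.normal(0.45, 0.15, 80)])  # unproductive sites
y = np.concatenate([np.ones(80), np.zeros(80)])

auc = roc_auc_score(y, score)
fpr, tpr, _ = roc_curve(y, score)
print(f"AUC = {auc:.3f}")  # values above ~0.7 are usually read as acceptable
```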

How to cite: Belluti Nanjundagowda, P. and Vema, V. K.: A Geospatial and AHP-Based Approach for Delineating Groundwater Potential Zones in Vulnerable Groundwater Systems, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-5098, https://doi.org/10.5194/egusphere-egu26-5098, 2026.

EGU26-5765 | ECS | Posters virtual | VPS23

Research on the mechanical behaviors of multi-fractured blocky rock masses 

Kuan Jiang, Chengzhi Qi, and Xiaoyue Hu

Deep rock masses have complex internal structures, resulting in significant discreteness and blocky structures. As the depth of engineering construction and energy extraction increases, unique pendulum-type waves emerge in deep blocky rock masses under dynamic loads from mining and blasting. These waves are characterized by low frequency, low velocity, large displacement amplitude, and high kinetic energy, distinguishing them fundamentally from conventional seismic waves. Pendulum-type waves can induce alternating stress states of relative compression and separation within blocky rock masses and may lead to rockburst disasters and even engineering-induced seismicity, posing great challenges to the safety of underground engineering such as tunnel construction and mining. In this paper, experimental research is conducted on the mechanical behaviors of multi-fractured blocky rock masses and the typical characteristics of pendulum-type waves under static and dynamic loads. Firstly, the strength, deformation, and failure mode were analysed based on uniaxial compression tests. Weak structural layers significantly reduce the uniaxial compressive strength and enhance the ultimate deformation capacity of rock masses. Fractured rock masses exhibit significant nonlinear deformation and may develop macroscopic fractures (vertical splitting failure, with the failure mode transitioning from brittle to ductile) at stress levels significantly lower than their uniaxial compressive strengths. Subsequently, based on dynamic impact tests, the dynamic response, overall displacement, wave velocity, and the mechanism of anomalously low friction were investigated, and the typical characteristics of pendulum-type waves were quantitatively described: low frequency (177 Hz and 153 Hz in this experiment, much lower than the natural frequency of the rock itself), low velocity (about 900 m/s, significantly lower than P-wave and S-wave velocities), large displacement amplitude (more than two orders of magnitude larger than the deformation of intact rock under an identical load), and high kinetic energy (accounting for 40% and 28% of the total energy in this experiment, a particularity that cannot be ignored). This study is significant for understanding nonlinear waves in deep fractured rock masses and their dynamic behaviors, as well as for preventing and controlling engineering disasters in deep rock masses.

How to cite: Jiang, K., Qi, C., and Hu, X.: Research on the mechanical behaviors of multi-fractured blocky rock masses, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-5765, https://doi.org/10.5194/egusphere-egu26-5765, 2026.

Predicting river bank erosion in sinuous alluvial channels requires a combined study of planform processes, hydraulic processes, sediment transport, and the geotechnical properties of riverbanks. This paper provides a detailed analysis of channel evolution within the Nabadwip-Kalna stretch of the Bhagirathi-Hooghly River (1990-2020), combining remote sensing, field surveys, laboratory experiments, and numerical modelling into a multidimensional analysis. Bank-line changes were quantified in GIS using the Digital Shoreline Analysis System (DSAS) on historical satellite images of the same period. A two-dimensional migration coefficient (MC) model was used to simulate spatio-temporal changes in channel centrelines, and RVR Meander was used to model depth-averaged flow velocity and reach-averaged hydraulic parameters. Cross-sectional bathymetry and near-bank hydraulics were characterised with ADCP measurements. Geotechnical analysis showed that stratified streambanks exhibited critical shear stresses of 7.1-7.7 kPa and internal friction angles of 30°–34°, and were predominantly affected by either cantilever collapse or piping, with maximum bank heights varying between 5.7 and 6.8 metres. Bank stability was assessed with both BSTEM and BEHI, while sediment forecasting combined SWAT, to predict overbank flow, with a Genetic Algorithm (GA) to estimate the total load. DSAS analysis of bank-line displacement revealed distinct erosion patterns across 170 transects, with RMSE values of 0.090 to 0.162 in predicting zone boundaries. The MC method reproduced the 24-year centreline migration patterns, recording changes in the centreline-geometry parameters. Analysis of five instrumented cross-sections found instability, with factor-of-safety values of 0.81-0.95, lateral retreat of 4.07-5.85 m/yr, and eroded areas of 4.35-7.15 km²/yr. Mean collapse rates were 0.125 to 0.198 m/yr, and failure angles were 81°–87°. The maximum bank-failure mass was 41.24 kg (seasonal maximum), and the calibrated toe-scour mass was 0.28 kg. The GA model was tested with ten parameterisations and showed the best predictive ability with the coefficient set at ten (R² = 0.96 and mean relative error, MRE = 42%), outperforming traditional regression analysis (R² = 0.87 and MRE = 40%). Sandbar dynamics also changed considerably: Nandai-Hatsimla grew from 11.87 ha in 1990 to 19.05 ha in 2020, Media from 39.7 ha to 57.68 ha, while Char Krishnabati declined from 82.52 ha to 81.07 ha. Land-use/land-cover (LULC) predictions for 2040 indicated settlement expansion from 13.61% (2020) to 20.19%, with validation accuracy (RMSE = 0.253) confirming model reliability. This integrated framework shows that combining remotely sensed, field, laboratory, and model data provides quantitatively sound estimates of fluvial risks and forms the basis of evidence-based management of high-suspended-load riverine areas. The modular design can be applied to monsoon-dominated alluvial basins worldwide, supporting adaptive land-use planning and long-term infrastructure development in vulnerable riparian societies.

How to cite: Ghosh, A.: Unveiling integrated geo-hydraulic assessment of river meandering, bank erosion and sandbar dynamics in Alluvial channels, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-7917, https://doi.org/10.5194/egusphere-egu26-7917, 2026.

EGU26-8596 | ECS | Posters virtual | VPS23

Waveform signatures of acoustic emission from thermally and mechanically induced microfracture in centrally apertured basalt 

Arthur De Alwis, Mehdi Serati, Arcady Dyskin, Elena Pasternak, Derek Martin, and David Williams

Acoustic emission (AE) monitoring is widely applied to track damage development in brittle rock, although relating recorded signals to specific fracture mechanisms can remain uncertain, particularly when comparing thermal and mechanical loadings. This contribution presents a preliminary assessment of AE waveform characteristics measured during two heating-only experiments and two uniaxial compressive strength (UCS) experiments performed on 100 mm diameter basalt specimens containing a central axial circular hole. This geometry provides a consistent configuration that promotes stress redistribution and damage localisation around an opening, allowing fracture processes to be compared within a common specimen form.

Full AE waveforms were acquired throughout each test using broadband piezoelectric sensors coupled to the specimen surface, with pre-amplification and digital acquisition. Event features were extracted in the time and frequency domains, including rise angle, duration, hit counts, average frequency, peak frequency, peak amplitude, and amplitude distributions. Feature-space comparisons were then used to evaluate whether thermally and mechanically induced microfracturing exhibit separable signal characteristics.
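A minimal sketch of how such per-hit features might be computed from a digitized waveform is shown below. The 10% amplitude threshold, sampling rate, and synthetic burst are assumptions for illustration, not the study's acquisition settings.

```python
import numpy as np

def ae_features(wave, fs):
    """Basic time- and frequency-domain features for one AE hit (sketch)."""
    t = np.arange(wave.size) / fs
    peak_amp = np.max(np.abs(wave))
    thr = 0.1 * peak_amp                      # hypothetical 10% threshold
    above = np.nonzero(np.abs(wave) > thr)[0]
    duration = t[above[-1]] - t[above[0]]     # first-to-last crossing
    rise_time = t[np.argmax(np.abs(wave))] - t[above[0]]
    counts = np.sum((wave[1:] > thr) & (wave[:-1] <= thr))  # up-crossings
    spec = np.abs(np.fft.rfft(wave))
    freqs = np.fft.rfftfreq(wave.size, d=1 / fs)
    peak_freq = freqs[np.argmax(spec)]
    avg_freq = counts / duration if duration > 0 else np.nan  # counts/duration
    return dict(peak_amp=peak_amp, duration=duration, rise_time=rise_time,
                counts=counts, peak_freq=peak_freq, avg_freq=avg_freq)

# Synthetic 300 kHz decaying burst sampled at 5 MHz, to exercise the function
fs = 5e6
t = np.arange(0, 2e-4, 1 / fs)
wave = np.exp(-t / 5e-5) * np.sin(2 * np.pi * 300e3 * t)
print(ae_features(wave, fs))
```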

The thermal experiments were associated with a single dominant fracture initiating along the shortest ligament between the aperture boundary and the nearest specimen edge. In contrast, UCS loading produced a more complex fracture network consistent with mixed tensile and shear microfracturing. Rise angle versus hits-per-duration plots indicated that thermal events occupied a more restricted region, whereas UCS events displayed a broader spread, which may reflect greater variability in source processes during complex damage evolution. Frequency-based comparisons further highlighted the differences: thermally induced events clustered mainly within a lower-frequency band (approximately 100-300 kHz), while the UCS tests exhibited an additional higher-frequency population (approximately 400-600 kHz) alongside the lower-frequency component. Amplitude distributions also differed, with thermal events tending toward a narrower amplitude range relative to the wider distribution observed under UCS loading. Collectively, these observations suggest that combined time-domain, frequency-domain, and amplitude-based AE features support mechanism-informed discrimination between thermally driven tensile fracture and mechanically driven complex fracture networks, providing a basis for subsequent statistical or learning-based classification in coupled thermomechanical experiments.

How to cite: De Alwis, A., Serati, M., Dyskin, A., Pasternak, E., Martin, D., and Williams, D.: Waveform signatures of acoustic emission from thermally and mechanically induced microfracture in centrally apertured basalt, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-8596, https://doi.org/10.5194/egusphere-egu26-8596, 2026.

EGU26-8813 | ECS | Posters virtual | VPS23

Assessment of Partial Blockage in Urban Drains for Flood Risk Reduction  

Aayusha Kumari Mishra, Hemant Kumar, and Rajendran Vinnarasi

Partial blockage in open channels and urban drainage systems is a common issue arising from debris accumulation, sediment deposition, and inadequate maintenance, often resulting in reduced flow capacity and increased flood risk. Despite its practical relevance, the hydraulic effects of partial blockage on flow behaviour are not well quantified through controlled experimental studies. This work investigates the influence of partial blockage on flow characteristics in open channels and explores its implications for urban stormwater drainage systems. Laboratory experiments are carried out in a rectangular open-channel flume under steady flow conditions. Velocity measurements are obtained at multiple depths for unblocked conditions and for different partial blockage configurations. Blockages of varying size and location are introduced manually to represent realistic obstructions commonly observed in urban drains. The changes in velocity distribution, water depth, and flow-carrying capacity due to partial blockage are analysed to understand the hydraulic response of the system.

Based on these observations, relationships between blockage extent and hydraulic performance are developed to identify critical blockage conditions. The study framework is applied to urban stormwater drainage networks using SWMM modelling to extend the experimental findings to real-world applications. Blockage scenarios are simulated in selected channels to assess their impact on system performance and flooding behaviour.
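As a first-order illustration of why blockage extent matters for flow-carrying capacity, the sketch below estimates capacity loss for a rectangular section via Manning's equation with the flow area reduced by the blockage ratio. This is a crude geometric argument under assumed flume-scale values, not the study's experimentally derived relationship.

```python
import numpy as np

def manning_capacity(b, y, n, S):
    """Full-flow capacity of a rectangular channel from Manning's equation."""
    A = b * y                  # flow area, m^2
    P = b + 2 * y              # wetted perimeter, m
    R = A / P                  # hydraulic radius, m
    return (1.0 / n) * A * R**(2 / 3) * np.sqrt(S)   # discharge, m^3/s

b, y, n, S = 0.6, 0.3, 0.013, 0.001   # hypothetical width, depth, n, slope

Q0 = manning_capacity(b, y, n, S)
for blockage in [0.0, 0.1, 0.25, 0.5]:           # fraction of section blocked
    # crude first cut: shrink the effective width, keep depth fixed
    Qb = manning_capacity(b * (1 - blockage), y, n, S)
    print(f"blockage={blockage:.0%}: Q/Q0 = {Qb / Q0:.2f}")
```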

The outcomes of this study provide experimental insight into blockage-induced hydraulic effects and highlight the importance of considering partial blockage in urban drainage analysis. The combined experimental and modelling approach offers a practical basis for improving flood risk assessment and maintenance planning in urban stormwater systems.

How to cite: Mishra, A. K., Kumar, H., and Vinnarasi, R.: Assessment of Partial Blockage in Urban Drains for Flood Risk Reduction , EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-8813, https://doi.org/10.5194/egusphere-egu26-8813, 2026.

Rising flooding, exacerbated by both climate change and human activity, demands proper identification of vulnerable zones. Conventional hydrological analysis can neglect geographical variability. In this study, a combined geospatial and decision-making process is used to determine the levels of flood vulnerability and risk in the Koshi River Basin in the state of Bihar. The research develops susceptibility, vulnerability, and risk maps by integrating GIS, Remote Sensing, and AHP. Eleven physical and hydrological factors and five socio-economic indicators were weighted systematically using a multi-criteria decision-making framework that allowed appropriate consideration of their relative contributions to flooding. Flood susceptibility, vulnerability, and risk maps were created using the Weighted Overlay technique in a GIS environment. According to the analysis, population density (41.6%) and literacy rate (24%) are the controlling factors for flood vulnerability in the basin, whereas rainfall (23.9%), elevation (14.7%), and drainage density are the main elements influencing flood susceptibility. The flood susceptibility maps show that the Koshi basin is largely covered by the low and moderate susceptibility classes, with only a very minor share (0.03%) falling in the high susceptibility class. A significant section (42.87%) of the basin has moderate flood vulnerability due to a combination of exposure and socio-economic characteristics. The flood risk results indicate that a large portion of the basin (84.18%) faces moderate flood risk, while a small portion faces high flood risk in the low-lying, heavily inhabited areas close to the riverbanks. ROC-AUC validation yielded an accuracy of 66.3%, indicating that the proposed GIS-AHP model is reliable. The conclusions underscore the value of integrating physical and socio-economic considerations, with prospects for enhancement through climate scenarios, in flood mitigation, planning, and early-warning mapping.
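The Weighted Overlay step can be expressed compactly as a cell-wise weighted sum of normalized factor rasters. The sketch below uses the reported weights for rainfall and elevation, an assumed placeholder weight for drainage density (the abstract does not state it), and random rasters standing in for the real layers.

```python
import numpy as np

rng = np.random.default_rng(0)
shape = (100, 100)

# Reported AHP weights for two susceptibility drivers; the drainage-density
# weight is an assumed placeholder (not given in the abstract).
weights = {"rainfall": 0.239, "elevation": 0.147, "drainage_density": 0.10}

# Random rasters stand in for the real normalized (0-1) factor layers
rasters = {name: rng.random(shape) for name in weights}

# Cell-wise weighted sum, re-normalized by the total weight, then classified
total_w = sum(weights.values())
index = sum(w * rasters[name] for name, w in weights.items()) / total_w
classes = np.digitize(index, bins=[0.25, 0.50, 0.75])  # 0..3 = low..very high
print(np.bincount(classes.ravel(), minlength=4) / classes.size)
```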

How to cite: Chaudhary, P. and Padhi, E.: Flood Hazard Analysis and Risk Assessment of Koshi River, Bihar (India) using Remote Sensing, GIS and AHP Techniques , EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-8975, https://doi.org/10.5194/egusphere-egu26-8975, 2026.

EGU26-8985 | ECS | Posters virtual | VPS23

Non-linear rotational waves and complex rotation patterns in a chain of blocks with elbowing 

Maoqian Zhang, Arcady Dyskin, and Elena Pasternak

Block elbowing, the process in which rotating blocks push neighbouring blocks apart, influences both geological deformation and the stability of mining excavations in blocky rock masses. A clearer understanding of elbowing is essential for improving rock mass modelling and maintaining the safety of engineering structures. To this end, we analyse a chain of stiff blocks connected by springs, with one or two active (driving) end blocks whose rotation is externally induced; all other, passive, blocks have translational and rotational degrees of freedom. The results show that block rotation is sequential (starting from the driving blocks), producing a rotational wave with strongly configuration-dependent rotational patterns.

In contrast to a single-driving-block system, a double-driving-block system exhibits more complex behaviour, as the active blocks may rotate in the same direction (Case I) or in opposite directions (Case II). In Case I, passive blocks can exhibit anticlockwise rotation opposite to the clockwise-rotating driving blocks, while in Case II the passive blocks do not rotate at all.

Further deformation patterns arise from block geometry, introduced by varying block corner rounding to represent spheroidal weathering. The results reveal a transition from reversible to irreversible passive-block kinematics. Reversible responses include either clockwise rotation followed by full recovery or no rotation. The boundary between these types of block behaviour is defined by a linear relationship between the active-passive and passive-passive contact friction coefficients, with the intercept related to block corner rounding. In contrast, irreversible kinematics characterised by residual rotation emerge only for highly rounded blocks. This irreversible behaviour is restricted to short block chains and disappears in chains of five blocks, suggesting a critical size of the Cosserat-like zone with independent rotational degrees of freedom. This study provides new insights for modelling the stability and long-term evolution of blocky rock masses.

How to cite: Zhang, M., Dyskin, A., and Pasternak, E.: Non-linear rotational waves and complex rotation patterns in a chain of blocks with elbowing, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-8985, https://doi.org/10.5194/egusphere-egu26-8985, 2026.

The subject of this study is the hydraulic stimulation of a tectonic fault leading to induced seismicity. We consider a scenario in which fluid is injected near an existing fault, causing a localized change in pore pressure and a reduction in effective stresses. This, in turn, initiates slippage of fault segments and the formation of a slip zone, whose size and slip velocity determine the magnitude of the resulting seismic events. The goal of this study was to develop a relatively simple model for estimating the potential magnitude of induced seismic events based on a limited set of governing parameters. The primary objectives were to identify the key factors with the greatest impact on the characteristics of the slip zone and to determine how fluid injection parameters (rate and injected fluid volume) affect earthquake magnitude by changing slip dynamics. The resulting model is based on a series of numerical experiments analyzing the hydromechanical behavior of the fault under various injection conditions. The modeling was performed using a two-parameter rate-and-state friction law, which, unlike a single-parameter model, allows a wider range of slip regimes to be simulated and accurately describes the transition from stable slip to dynamic failure.

Functional relationships were established between the initial system parameters and the key slip characteristics obtained. It was shown that the final slip-zone length is almost linearly related to the length of the initial unstable zone, and that the maximum slip velocity increases exponentially with increasing pore pressure rate. At high loading rates, however, the slip velocity saturates at a characteristic level, which limits the possible magnitudes of earthquakes induced by fluid injection.
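For readers unfamiliar with the friction law used here, the sketch below integrates the common two-parameter (a, b) rate-and-state formulation with the aging law for a simple imposed velocity step. Parameter values are illustrative, and the study's full hydromechanical coupling is not reproduced.

```python
import numpy as np

# Two-parameter (a, b) rate-and-state friction with the Dieterich aging law:
# mu = mu0 + a*ln(V/V0) + b*ln(V0*theta/Dc), d(theta)/dt = 1 - V*theta/Dc.
# Values are illustrative, not taken from the study.
mu0, a, b = 0.6, 0.010, 0.015     # b > a: velocity-weakening (unstable) regime
V0, Dc = 1e-6, 1e-4               # reference velocity (m/s), slip distance (m)

def mu(V, theta):
    return mu0 + a * np.log(V / V0) + b * np.log(V0 * theta / Dc)

# Impose a velocity step V0 -> 10*V0 and integrate the state evolution
dt, nsteps = 1e-3, 200_000
theta = Dc / V0                   # steady state at V0
V = V0
mus = []
for i in range(nsteps):
    if i == nsteps // 2:
        V = 10 * V0               # velocity step
    theta += dt * (1.0 - V * theta / Dc)
    mus.append(mu(V, theta))

# Steady-state friction change after the step: d(mu_ss) = (a - b)*ln(V2/V1) < 0
print(f"mu before step: {mus[nsteps // 2 - 1]:.4f}, final: {mus[-1]:.4f}")
print(f"predicted steady-state drop: {(a - b) * np.log(10):.4f}")
```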

How to cite: Turuntaev, S., Baryshnikov, N., and Riga, V.: Estimation of potential magnitudes of induced seismic events based on direct numerical simulation of fluid injection near an active tectonic fault., EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-11339, https://doi.org/10.5194/egusphere-egu26-11339, 2026.

EGU26-13831 | ECS | Posters virtual | VPS23

From Empirical Assumptions to Data-Informed Decisions: A Reliable Water Storage Soil Depth Estimation Method 

Damodar Sharma, Surendra Kumar Mishra, and Rajendra Prasad Pandey

Efficient water use in agriculture is crucial for sustainable water resource management, especially in areas experiencing increasing water scarcity. A critical yet often oversimplified component of irrigation planning is the estimation of the water-storage soil profile depth, commonly assumed to be 1-1.5 m as the root-zone depth based on practitioner experience rather than validated soil-water dynamics. Such assumptions introduce uncertainty and limit the reliability of irrigation scheduling decisions. This study presents a novel framework for estimating the soil profile depth that stores maximum water by integrating Richards’ equation, geotechnical soil column concepts, and the Soil Conservation Service Curve Number (SCS-CN) technique, deriving an optimal soil profile depth that maximizes storage capacity based on measurable hydraulic and retention soil properties. The water-storage soil column depth is linked with the SCS-CN parameter for practical field applications such as irrigation scheduling and planning. The proposed framework improves model reliability and interpretability by replacing fixed-depth assumptions with soil-specific storage behaviour, thereby reducing uncertainty in irrigation water estimation. It enables consistent evaluation of field capacity, average soil moisture content, and maximum storage potential across soil types, leading to improved irrigation efficiency. By emphasizing physically constrained model selection, data-informed parameterization, and transparent decision-making metrics, this work enhances the reliability of hydrologic modeling and supports robust irrigation management under water-scarce conditions.
Keywords:  Water storage soil profile depth, Richards’ equation, Irrigation water management, Data-informed parameterization, SCS-Curve Number method.
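For context, the SCS-CN rainfall-runoff relation that the framework links to the storage depth is reproduced below as a minimal sketch; the rainfall depth and curve number are hypothetical.

```python
def scs_cn_runoff(P_mm, CN, lam=0.2):
    """SCS-CN direct runoff Q (mm) for an event rainfall P (mm).

    S = 25400/CN - 254 (mm); Ia = lam * S; Q = (P - Ia)^2 / (P - Ia + S).
    lam = 0.2 is the classic initial-abstraction ratio.
    """
    S = 25400.0 / CN - 254.0          # potential maximum retention, mm
    Ia = lam * S                      # initial abstraction, mm
    if P_mm <= Ia:
        return 0.0
    return (P_mm - Ia) ** 2 / (P_mm - Ia + S)

# Hypothetical event: 60 mm rainfall on a soil with CN = 75
# (S = 25400/75 - 254 = 84.7 mm, so low-CN soils retain more water)
print(f"Q = {scs_cn_runoff(60.0, 75):.1f} mm")
```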

How to cite: Sharma, D., Mishra, S. K., and Pandey, R. P.: From Empirical Assumptions to Data-Informed Decisions: A Reliable Water Storage Soil Depth Estimation Method, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-13831, https://doi.org/10.5194/egusphere-egu26-13831, 2026.

EGU26-15102 | Posters virtual | VPS23

Anisotropic energy transfer rate quantified by LPDE and directional averaging methods in MHD turbulence 

Zhuoran Gao, Yan Yang, Bin Jiang, and Francesco Pecora

The energy cascade rate (ε) characterizes the energy transfer in a turbulent system. In incompressible magnetohydrodynamic (MHD) turbulence, ε is linked to the third-order structure function (Yaglom vector) via the Yaglom/Politano–Pouquet law in the inertial range. In this study, we compare three estimators of ε in anisotropic MHD turbulence: (1) the lag polyhedral derivative ensemble (LPDE) technique, which reconstructs the divergence of the Yaglom vector via tetrahedral linear gradients; (2) a directional-averaged third-order estimator that evaluates the Yaglom vector along a finite number of lag directions and averages over solid angle; and (3) the Yaglom vector evaluated at 60° with respect to the mean magnetic field direction. To ensure a fair comparison in more realistic MHD turbulence, we emulate a multipoint virtual mission within anisotropic three-dimensional MHD simulations with a guide field B₀ along the z-axis. This work illuminates the reliable regime for the LPDE and directional-averaging methods, and also tests whether the 60° Yaglom vector provides an accurate estimate of ε, offering practical guidance for both simulation and observational turbulence analysis.
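For reference, a commonly used form of the Politano–Pouquet law in Elsässer variables (z± = u ± b) is sketched below; prefactor conventions vary slightly across the literature.

```latex
% Yaglom-type third-order law for incompressible MHD (Politano & Pouquet),
% with Elsässer increments \delta z^\pm(\ell) = z^\pm(x+\ell) - z^\pm(x):
\begin{align}
  \mathbf{Y}^{\pm}(\boldsymbol{\ell})
      &= \big\langle \delta\mathbf{z}^{\mp}\,
         |\delta\mathbf{z}^{\pm}|^{2} \big\rangle ,
  &
  \nabla_{\boldsymbol{\ell}}\cdot\mathbf{Y}^{\pm}
      &= -4\,\varepsilon^{\pm}
  \quad \text{(inertial range)} .
\end{align}
% Under isotropy, the longitudinal component reduces to
% Y_L^{\pm}(\ell) = -(4/3)\,\varepsilon^{\pm}\,\ell .
```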

How to cite: Gao, Z., Yang, Y., Jiang, B., and Pecora, F.: Anisotropic energy transfer rate quantified by LPDE and directional averaging methods in MHD turbulence, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-15102, https://doi.org/10.5194/egusphere-egu26-15102, 2026.

EGU26-16403 | ECS | Posters virtual | VPS23

Assessing the Impact of Digital Elevation Model Selection on Hydrological Predictions 

Prashant Prashant, Surendra Kumar Mishra, and Anil Kumar Lohani

Digital elevation models (DEMs) play a fundamental role in hydrological modeling by controlling watershed delineation, stream networks, and runoff generation processes. This study assesses the impact of the global DEM product from the Shuttle Radar Topography Mission (SRTM) and the Indian national CartoDEM developed by ISRO-Bhuvan (Indian Space Research Organisation-Bhuvan) on streamflow simulation using the Soil and Water Assessment Tool (SWAT) in the Ong River watershed (4650 sq. km), India. The study area is characterized by forest and cropland. Both DEMs, resampled to 30 m resolution, were used as inputs to SWAT, along with meteorological data (IMD), land use/land cover data (Sentinel-2), and soil data (FAO). Streamflow data were sourced from Global Flood Awareness System (GloFAS) discharge data. Model calibration (2011-2017) and validation (2018-2020) were performed using SWAT-CUP with the SUFI2 algorithm. Model performance was evaluated using Willmott's index of agreement, Nash-Sutcliffe Efficiency (NSE), R², PBIAS, and RSR. Results showed that both DEMs performed satisfactorily, with CartoDEM exhibiting slightly better performance (higher NSE and R², lower PBIAS and RSR) during both calibration and validation periods. Sensitivity analysis revealed that the runoff curve number was the most sensitive parameter, highlighting the impact of DEM selection on surface runoff simulation. The study concludes that CartoDEM is the preferable choice for hydrological modeling in similar catchments, though further research on stream accuracy and catchment delineation in diverse topographies is warranted.
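The performance metrics named above are straightforward to compute. A minimal sketch with synthetic flows standing in for the GloFAS and SWAT series follows; note that PBIAS sign conventions differ between references.

```python
import numpy as np

def nse(obs, sim):
    """Nash-Sutcliffe Efficiency."""
    return 1 - np.sum((obs - sim)**2) / np.sum((obs - obs.mean())**2)

def pbias(obs, sim):
    """Percent bias (sign convention varies across references)."""
    return 100.0 * np.sum(sim - obs) / np.sum(obs)

def rsr(obs, sim):
    """RMSE normalized by the standard deviation of observations."""
    return np.sqrt(np.sum((obs - sim)**2)) / np.sqrt(np.sum((obs - obs.mean())**2))

def willmott_d(obs, sim):
    """Willmott's index of agreement."""
    om = obs.mean()
    return 1 - np.sum((obs - sim)**2) / np.sum((np.abs(sim - om) + np.abs(obs - om))**2)

# Synthetic monthly flows standing in for observed vs. simulated series
rng = np.random.default_rng(1)
obs = rng.gamma(2.0, 50.0, 120)
sim = obs * 0.95 + rng.normal(0, 10, 120)
print(nse(obs, sim), pbias(obs, sim), rsr(obs, sim), willmott_d(obs, sim))
```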

How to cite: Prashant, P., Kumar Mishra, S., and Kumar Lohani, A.: Assessing the Impact of Digital Elevation Model Selection on Hydrological Predictions, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-16403, https://doi.org/10.5194/egusphere-egu26-16403, 2026.

EGU26-18007 | ECS | Posters virtual | VPS23

Effects of Flow Depth and Sediment Size on Near Bed Hydraulics and Sediment Mobility in Open Channel Flow 

Jyothi Banothu and Kamalini Devi

Accurate prediction of sediment mobility in open channel flows is essential for effective river engineering and sediment management. This study examines the combined influence of flow depth and sediment grain size on near-bed hydraulics and sediment mobility using high-resolution Acoustic Doppler Velocimeter (ADV) measurements in a controlled laboratory flume. Experiments were conducted over uniform sand beds with median grain sizes of d₅₀ = 0.321 mm and d₅₀ = 0.81 mm under four flow depths (12 cm, 15 cm, 18 cm, 21 cm) and a range of flow velocities. Three-dimensional velocity components were measured at multiple vertical locations throughout the flow depth, while water surface elevations were continuously monitored. Depth-resolved ADV data were used to compute mean streamwise velocity, Reynolds shear stress, friction velocity, and turbulent kinetic energy for each sediment size and flow depth. Sediment mobility was assessed using the Shields parameter, estimated from ADV-derived bed shear stress, and compared with the critical Shields parameter at multiple velocity points for each depth. The results indicate that coarser sediment beds exhibit increased near-bed turbulence intensity and higher friction velocity across all flow depths, while yielding lower Shields parameter values relative to finer sediment beds. Comparisons across the four flow depths reveal that sediment mobility transitions from stable to mobile conditions depending on the combined effects of flow depth, sediment size, and local velocity magnitude. At lower velocities, Shields parameter values remain below the critical threshold, indicating stable bed conditions, whereas higher velocities at the same depth result in Shields values exceeding the critical limit, signifying active sediment motion. Depth-wise velocity and turbulence profiles demonstrate that both flow depth and sediment roughness significantly modify the near-bed hydraulic structure and bed shear stress distribution. The findings highlight the importance of accounting for depth-dependent flow structure and sediment characteristics when evaluating sediment mobility. This study provides a robust experimental framework for identifying stable and mobile sediment regimes and estimating sediment transport potential using high-resolution ADV measurements without direct sediment transport observations.
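A sketch of the Shields-parameter workflow described above is given below. The near-bed Reynolds stress value is hypothetical, and the critical Shields curve uses the Soulsby-Whitehouse (1997) fit, which may differ from the criterion adopted in the study.

```python
import numpy as np

rho, rho_s, g, nu = 1000.0, 2650.0, 9.81, 1e-6   # water, quartz sand, SI units

def shields_from_adv(uw_cov, d50):
    """Shields parameter from the near-bed Reynolds stress tau_b = -rho*<u'w'>."""
    tau_b = -rho * uw_cov                          # bed shear stress, Pa
    u_star = np.sqrt(tau_b / rho)                  # friction velocity, m/s
    theta = tau_b / ((rho_s - rho) * g * d50)      # Shields parameter
    return tau_b, u_star, theta

def critical_shields(d50):
    """Critical Shields number via the Soulsby-Whitehouse (1997) fit."""
    d_star = d50 * ((rho_s / rho - 1) * g / nu**2) ** (1 / 3)
    return 0.30 / (1 + 1.2 * d_star) + 0.055 * (1 - np.exp(-0.02 * d_star))

# Hypothetical near-bed covariance <u'w'> = -2.5e-4 m^2/s^2 for both beds
for d50 in (0.321e-3, 0.81e-3):
    tau_b, u_star, theta = shields_from_adv(-2.5e-4, d50)
    print(f"d50={d50 * 1e3:.3f} mm: theta={theta:.4f} "
          f"vs theta_cr={critical_shields(d50):.4f}")
```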

How to cite: Banothu, J. and Devi, K.: Effects of Flow Depth and Sediment Size on Near Bed Hydraulics and Sediment Mobility in Open Channel Flow, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-18007, https://doi.org/10.5194/egusphere-egu26-18007, 2026.

EGU26-18481 | ECS | Posters virtual | VPS23

Assessing urban surface flood resilience using hydrodynamic modelling under extreme rainfall conditions in urban catchment of Nepal 

Pushparaj Singh, Rahul Deopa, and Mohit Prakash Mohanty

Urban flooding poses a growing challenge for rapidly urbanizing cities, where climate change–driven increases in extreme rainfall, expanding impervious surfaces, and limited drainage capacity collectively exacerbate the frequency and severity of surface water inundation. In this context, understanding urban surface flood resilience, defined as the capacity of stormwater drainage systems to withstand, convey, and recover from intense rainfall events, remains essential for effective flood risk management and climate adaptation planning. The present study investigates urban surface flood resilience in Janakpur Sub-Metropolitan City, Nepal, a fast-growing urban center increasingly exposed to pluvial flooding. The study develops an integrated modelling framework using a 3-way coupled MIKE+ hydrodynamic model, combined with detailed spatial analysis in GIS, to evaluate the performance of the existing stormwater drainage system under extreme rainfall conditions. The model represents the urban drainage network and surface flow processes using drainage infrastructure data obtained from field surveys, terrain information derived from a high-resolution digital elevation model, and delineated urban catchments. To characterize rainfall extremes, the analysis employs long-term observed hourly rainfall records spanning 25 years to generate design storm events corresponding to multiple return periods. The modelling framework simulates the system response for a representative extreme rainfall event and quantifies inundation dynamics across the urban landscape. The results show that the coupled approach effectively captures critical flood hazard characteristics, including inundation depth, flow velocity, and the depth–velocity product, allowing for the spatial identification of highly vulnerable catchments and drainage bottlenecks. The findings provide actionable insights into the limitations of existing stormwater infrastructure and support the development of targeted adaptation strategies aimed at enhancing urban surface flood and drainage resilience. Overall, the study underscores the value of integrated hydrodynamic modelling for resolving location-specific flood behaviour and strengthening urban flood resilience assessments under evolving climatic and urbanization pressures.
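The depth-velocity product mentioned above is commonly mapped to hazard classes. The thresholds in the sketch below are illustrative, loosely following published d·v stability guidance (roughly 0.6 m²/s for instability of adults), and are not necessarily those used in this study.

```python
import numpy as np

def hazard_class(depth, velocity):
    """Classify pluvial flood hazard from the depth-velocity product (sketch).

    Thresholds are illustrative, not the study's classification scheme.
    """
    dv = depth * velocity                      # m^2/s
    return np.select(
        [dv < 0.2, dv < 0.6, dv < 1.2],
        ["low", "moderate", "high"],
        default="extreme",
    )

# Hypothetical cell values from a 2D surface-flow model
depth = np.array([0.1, 0.4, 0.8, 1.5])         # m
vel = np.array([0.3, 0.8, 1.2, 1.5])           # m/s
print(hazard_class(depth, vel))
```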

How to cite: Singh, P., Deopa, R., and Mohanty, M. P.: Assessing urban surface flood resilience using hydrodynamic modelling under extreme rainfall conditions in urban catchment of Nepal, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-18481, https://doi.org/10.5194/egusphere-egu26-18481, 2026.

EGU26-18820 | ECS | Posters virtual | VPS23

Global climate dynamics in a highly parameterized radiative-convective-macroturbulent energy balance model 

Adrian van Kan, Jeffrey Weiss, and Edgar Knobloch

We present a one-layer global energy balance climate model with highly parameterized radiation, convection, and large-scale atmosphere/ocean macroturbulence. Planetary heat content is represented by a 2D latitude-longitude layer characterized by a temperature field and a uniform constant heat capacity. Radiation is forced by the mean-annual zonal-average top-of-atmosphere solar irradiance. Radiative heating and cooling are parameterized by a uniform constant albedo and Stefan-Boltzmann emission with uniform constant emissivity. Convection is parameterized by a temperature threshold that restricts the layer from warming beyond the threshold, effectively cooling the layer. Macroturbulence is parameterized by 2D barotropic turbulence forced at small scales and damped by Rayleigh friction. Energy conservation is maintained by balancing the convective cooling of the layer with the turbulent kinetic energy forcing, resulting in tropical forcing, while the frictional loss of kinetic energy is balanced by frictional heating of the layer. The parameterized energy-transforming processes are characterized by timescales which, for Earth-like planets, are ordered as t_radiation > t_macroturbulence > t_convection.

We investigate the model’s equilibrium climate state in terms of the meridional heat transport (MHT), the resulting zonally averaged temperature profile, and their fluctuations by simulating the system over many radiation times. For Earth-like parameters, despite the model’s extremely simplified dynamics, our simulations reveal an MHT profile comparable to the observed, annually averaged MHT on Earth, featuring a mid-latitude maximum of approximately 5 PW, a form of Bjerknes compensation.
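To make the radiative-convective balance concrete, the sketch below time-steps a zonal version of such a layer with threshold convection, omitting the barotropic macroturbulence and frictional heating described above. All parameter values, including the second-Legendre-polynomial insolation fit and the convective threshold, are illustrative.

```python
import numpy as np

# Minimal zonal sketch of the radiative-convective part of the model
# (macroturbulent transport and frictional heating omitted).
sigma = 5.67e-8            # Stefan-Boltzmann constant, W m^-2 K^-4
albedo, emiss = 0.3, 0.9   # uniform constant albedo and emissivity (assumed)
C = 4e8                    # uniform heat capacity, J m^-2 K^-1 (assumed)
T_conv = 270.0             # convective threshold, K (chosen so capping acts)

lat = np.deg2rad(np.linspace(-89, 89, 90))
# Annual-mean insolation via a P2(sin(lat)) fit with coefficient ~ -0.48
S = 1361.0 / 4 * (1 + (-0.48) * 0.5 * (3 * np.sin(lat)**2 - 1))

T = np.full_like(lat, 250.0)
dt = 86400.0
for _ in range(20_000):                        # march toward equilibrium
    dTdt = ((1 - albedo) * S - emiss * sigma * T**4) / C
    T += dt * dTdt
    T = np.minimum(T, T_conv)                  # threshold convection caps warming
print(f"polar T = {T.min():.1f} K, tropical (capped) T = {T.max():.1f} K")
```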

How to cite: van Kan, A., Weiss, J., and Knobloch, E.: Global climate dynamics in a highly parameterized radiative-convective-macroturbulent energy balance model, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-18820, https://doi.org/10.5194/egusphere-egu26-18820, 2026.

EGU26-19471 | ECS | Posters virtual | VPS23

From bilinear interpolation to machine learning: a comparative assessment of statistical downscaling methods for CMIP6 projections over Brazil 

Diego Jatobá Santos, Gilberto Goracci, Minella Alves Martins, and Rochelle Schneider

High-resolution climate projections are essential for climate impact, vulnerability, and adaptation studies, particularly over regions with strong spatial heterogeneity such as Brazil. Although CMIP6 global climate models (GCMs) provide valuable information on future climate change, their coarse spatial resolutions, typically ranging from 100 to 200 km, limit their direct application at regional and local scales. Statistical downscaling techniques offer computationally efficient alternatives to dynamical downscaling, but their relative performance and added value remain insufficiently assessed over Brazil.

In this study, we compare two statistical downscaling approaches applied to a subset of CMIP6 models previously evaluated by Bazanella et al. (2024; doi:10.1007/s00382-023-06979-1) and identified as skillful in representing the Brazilian climate: (i) a bilinear interpolation method followed by percentile-to-percentile bias correction, and (ii) machine learning (ML)–based downscaling approaches. The original GCM outputs are interpolated to a common high-resolution grid of 10 km × 10 km using bilinear weights, providing a physically consistent reference framework. In parallel, ML-based models are trained using historical GCM predictors and high-resolution reference climate datasets to learn nonlinear relationships and generate high-resolution climate fields.
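The percentile-to-percentile correction in approach (i) can be sketched as empirical quantile mapping. The gamma-distributed samples below are synthetic stand-ins; operational schemes typically add tail extrapolation and frequency adaptation.

```python
import numpy as np

def percentile_bias_correction(gcm_hist, obs_hist, gcm_fut, n_q=100):
    """Empirical percentile-to-percentile (quantile mapping) correction.

    Maps each future GCM value through the historical GCM CDF onto the
    observed CDF. A minimal sketch, not an operational implementation.
    """
    q = np.linspace(0, 100, n_q)
    gcm_q = np.percentile(gcm_hist, q)
    obs_q = np.percentile(obs_hist, q)
    # percentile rank of each future value within the historical GCM CDF
    ranks = np.interp(gcm_fut, gcm_q, q)
    return np.interp(ranks, q, obs_q)

rng = np.random.default_rng(7)
obs_hist = rng.gamma(2.0, 4.0, 10_000)    # "observed" daily precipitation
gcm_hist = rng.gamma(2.0, 3.0, 10_000)    # biased (too-dry) historical GCM
gcm_fut = rng.gamma(2.0, 3.3, 10_000)     # future GCM simulation
corrected = percentile_bias_correction(gcm_hist, obs_hist, gcm_fut)
print(obs_hist.mean(), gcm_fut.mean(), corrected.mean())
```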

The performance of both approaches is evaluated for the historical period in terms of mean climatology, spatial patterns, and variability. Future projections under the SSP2-4.5 and SSP5-8.5 scenarios are then analyzed to assess regional climate change signals and associated uncertainties. Results assess the extent to which ML-based downscaling provides added value relative to bilinear interpolation, particularly for variables with strong spatial heterogeneity, such as precipitation and temperature extremes, while also evaluating the ability of the approach to preserve the large-scale climate signals projected by the driving CMIP6 models. This comparative analysis provides insights into the applicability, robustness, and limitations of statistical and ML-based downscaling methods for regional climate assessments over Brazil.

How to cite: Jatobá Santos, D., Goracci, G., Alves Martins, M., and Schneider, R.: From bilinear interpolation to machine learning: a comparative assessment of statistical downscaling methods for CMIP6 projections over Brazil, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-19471, https://doi.org/10.5194/egusphere-egu26-19471, 2026.

Accurate estimation of evapotranspiration (ET) is critical for various applications in hydrology and agricultural water management. However, direct observation of ET, especially its spatial variation, is time-consuming and cumbersome, necessitating indirect methods for its estimation. In this study, stomatal conductance data are used in conjunction with bio-physical parameters of wheat crops to derive spatially varied estimates of ET (ETSC) for different irrigation treatments using the Penman-Monteith equation. Five treatments, including drip-irrigated (DI) and flood-irrigated (FI) treatments, were used in the study: fully irrigated (DI), 50% MAD (maximum allowable deficit) (DI), 50% MAD (FI), farmer-field replication (FI), and a rain-fed treatment.

The ETSC estimates are also compared to ET estimates derived using a field water balance method (ETWB). The ETSC estimates compared well with the ETWB values, particularly for the irrigated treatments. The average root mean square errors (RMSE) of the ETSC estimates relative to the ETWB values are 0.11, 0.20, 0.23, and 0.26 mm/day for the fully irrigated, 50% MAD (FI), 50% MAD (DI), and farmer-field replication treatments, respectively. The corresponding RMSE for the rain-fed treatment (0.47 mm/day) is significantly higher than for the irrigated treatments, indicating the limitation of the approach under high water stress. The differences between ETSC and ETWB values also increase significantly during the end-season stage, when the wheat crop is close to maturity. Overall, the results demonstrate the robustness of the proposed approach in estimating the spatial variation of ET using the Penman-Monteith method in conjunction with in-field stomatal conductance observations.
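A minimal big-leaf Penman-Monteith sketch is given below to show how conductance enters the ET estimate. All input values are hypothetical, the study's upscaling from leaf-level stomatal conductance to canopy conductance is not reproduced, and extrapolating midday fluxes to mm/day overestimates daily ET.

```python
import numpy as np

def penman_monteith_ET(Rn, G, Ta, VPD, ga, gs):
    """Big-leaf Penman-Monteith ET (mm/day equivalent).

    Rn, G: net radiation and soil heat flux (W m^-2); Ta: air temperature
    (deg C); VPD: vapour pressure deficit (Pa); ga, gs: aerodynamic and
    canopy conductance (m s^-1). gs would be upscaled from measured
    stomatal conductance and leaf area (protocol-dependent).
    """
    rho_a, cp = 1.2, 1013.0                 # kg m^-3, J kg^-1 K^-1
    gamma = 66.0                            # psychrometric constant, Pa K^-1
    lam = 2.45e6                            # latent heat of vaporization, J kg^-1
    es = 610.8 * np.exp(17.27 * Ta / (Ta + 237.3))      # saturation vp, Pa
    delta = 4098.0 * es / (Ta + 237.3) ** 2             # slope, Pa K^-1
    LE = (delta * (Rn - G) + rho_a * cp * VPD * ga) / (
        delta + gamma * (1.0 + ga / gs))                # latent heat, W m^-2
    return LE / lam * 86400.0               # kg m^-2 day^-1, i.e. mm/day

# Hypothetical midday-average values for an irrigated wheat canopy
print(penman_monteith_ET(Rn=400.0, G=40.0, Ta=25.0, VPD=1500.0,
                         ga=0.02, gs=0.01))
```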

How to cite: Upreti, H. and Yadav, M.: Evaluation of Penman-Monteith estimates of evapotranspiration derived using field-collected stomatal conductance observations, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-20031, https://doi.org/10.5194/egusphere-egu26-20031, 2026.

EGU26-21173 | Posters virtual | VPS23

Global Hot Spots of Climate Extremes from Composite Hazard Indices 

Natalia Zazulie, Francesca Raffaele, and Erika Coppola

Understanding the spatial distribution and intensity of climate-related hazards is essential for effective risk assessment and adaptation planning. This study presents a comprehensive analysis of climate hazard indices applied across all IPCC reference regions, using all available CMIP5-driven regional climate model (RCM) simulations at 25 km resolution over the CORDEX domains, together with Euro-CORDEX simulations at 12 km resolution. The objective is to identify climate hazard hot spots through the formulation of a composite hazard index.

A subset of hazard indicators representing key climate extremes is selected. Temperature- and heat-stress–related hazards are characterized using TX90p (extreme maximum temperature), TN90p (extreme minimum temperature), and the NOAA Extended Heat Index (HI). Heavy precipitation and drought-related hazards are represented by RX1DAY (maximum 1-day precipitation), P99 (99th percentile of precipitation), and CDD (consecutive dry days).

The composite index integrates both the frequency and intensity of extremes and is computed at both regional and grid-point levels. A normalization approach is used to ensure comparability across regions with diverse climatic characteristics. Results reveal pronounced spatial heterogeneity in hazard intensity, highlighting regions where multiple hazards converge and amplify overall risk. This framework enables systematic identification of global and regional climate hot spots, offering insights into areas that may face heightened climate stress under current and projected conditions. By providing a consistent, region-wide assessment of hazard exposure, this study aims to support comparative climate risk analyses and inform policy-relevant decision-making for climate adaptation and resilience strategies at multiple scales.
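One simple way to realize the normalization-plus-aggregation step is min-max scaling of each indicator followed by a mean across indicators. The regional values below are random stand-ins, and the equal weighting is an assumption, not the study's scheme.

```python
import numpy as np

# Hypothetical regional means of the six hazard indicators;
# rows = regions, columns = [TX90p, TN90p, HI, RX1DAY, P99, CDD]
rng = np.random.default_rng(3)
X = rng.random((5, 6)) * [20, 20, 10, 80, 40, 60]

# Min-max normalization per indicator so regions are comparable,
# then an equal-weight mean as one simple composite choice
Xn = (X - X.min(axis=0)) / (X.max(axis=0) - X.min(axis=0))
composite = Xn.mean(axis=1)
hot_spot = np.argmax(composite)
print(composite.round(2), "hot spot region:", hot_spot)
```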

How to cite: Zazulie, N., Raffaele, F., and Coppola, E.: Global Hot Spots of Climate Extremes from Composite Hazard Indices, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-21173, https://doi.org/10.5194/egusphere-egu26-21173, 2026.

EGU26-21830 | ECS | Posters virtual | VPS23

Soil Moisture Based Calibration of a Hybrid Hydrological-Neural Network Model in Data Scarce Basins 

Khaoula Ait Naceur, El Mahdi El Khalki, Luca Brocca, Abdessamad Hadri, Oumar Jaffar, Mariame Rachdane, Vincent Simonneaux, Mohamed El Mehdi Saidi, and Abdelghani Chehbouni

Reliable river discharge simulation generally relies on observed streamflow data for model calibration; however, such observations are often uncertain or unavailable in data-scarce regions, limiting the applicability of conventional hydrological models. This study presents a hybrid modeling framework that uses soil moisture as an alternative calibration variable to improve discharge simulations in the absence of reliable streamflow observations. The framework couples a two-layer version of the daily lumped MISDc (Modello Idrologico Semi-Distribuito in continuo) hydrological model with a Feedforward Neural Network (FFNN), which is employed to enhance parameter calibration by exploiting soil moisture dynamics. The proposed approach is evaluated across three contrasting basins: Tahanaout in semi-arid Morocco, and Colorso (Italy) and Bibeschbach (Luxembourg) in temperate climates. Both in situ and ERA5-Land soil moisture datasets are used as calibration inputs. Model performance is assessed using multiple hydrological metrics, including Mean Absolute Error (MAE), Kling-Gupta Efficiency (KGE), and the correlation coefficient (R). Results show that the hybrid MISDc-FFNN framework substantially improves river discharge simulations compared to the traditional model. Across all basins, MAE is reduced by up to 61%, KGE increases by more than 200%, and R improves by up to 87%, with consistent performance gains observed for both observed and reanalysis-based soil moisture. These findings demonstrate the potential of soil moisture driven calibration strategies to enhance hydrological modeling in data-scarce environments, offering a viable pathway for improved water resources assessment and flood risk management where discharge observations are limited or unreliable.

 

Keywords: Soil moisture; river discharge simulation; hydrological modeling; machine learning; ERA5-Land; data-scarce regions; feedforward neural network
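For reference, the KGE and MAE metrics used in the evaluation can be computed as below; the discharge series are synthetic.

```python
import numpy as np

def kge(obs, sim):
    """Kling-Gupta Efficiency: 1 - sqrt((r-1)^2 + (alpha-1)^2 + (beta-1)^2)."""
    r = np.corrcoef(obs, sim)[0, 1]          # linear correlation
    alpha = sim.std() / obs.std()            # variability ratio
    beta = sim.mean() / obs.mean()           # bias ratio
    return 1.0 - np.sqrt((r - 1)**2 + (alpha - 1)**2 + (beta - 1)**2)

rng = np.random.default_rng(5)
obs = rng.gamma(2.0, 1.5, 365)               # synthetic daily discharge
sim = 0.9 * obs + rng.normal(0, 0.4, 365)    # imperfect simulation
print(f"KGE = {kge(obs, sim):.2f}, MAE = {np.mean(np.abs(obs - sim)):.2f}")
```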

How to cite: Ait Naceur, K., El Khalki, E. M., Brocca, L., Hadri, A., Jaffar, O., Rachdane, M., Simonneaux, V., Saidi, M. E. M., and Chehbouni, A.: Soil Moisture Based Calibration of a Hybrid Hydrological-Neural Network Model in Data Scarce Basins, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-21830, https://doi.org/10.5194/egusphere-egu26-21830, 2026.

Present-day bridge pier design often employs group arrangements of piers in various configurations to modify flow dynamics and mitigate scour formation around the piers. These group configurations may vary in spacing ratio, number of piers, and orientation, altering the flow-structure interaction and hence the scour development. Investigating the turbulent flow behaviour around common group arrangements has been a topic of research interest in recent years. This study presents an experimental investigation comparing the equilibrium scour depth caused by various four-pier group arrangements. To assess the impact of spacing, the face-to-face distance between piers (G) was set to D, 2D, and 3D, where D is the diameter of the circular pier. The scour patterns reveal that the maximum scour depth occurred when the spacing G was equal to D. The equilibrium scour depth decreased as the pier spacing increased to 2D and 3D, at an approximate flow intensity of 0.9. The scour contours exhibit the influence of neighbouring piers and how it changes with increasing pier spacing. Instantaneous velocity data were collected to derive the flow characteristics in the flow field, and velocity vectors depict the influence of the different configurations on the flow pattern. The study provides insight into spacing effects on equilibrium scour, which can be useful in the design of pier group arrangements.

How to cite: Sahu, C.: Spacing Effect on the Equilibrium Scour and Flow Pattern around Four-Pier group in Different Configurations, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-22153, https://doi.org/10.5194/egusphere-egu26-22153, 2026.

EGU26-23043 | Posters virtual | VPS23

Analysis of Vector-Field Multifractal Cascades 

João Felippe Thurler Rondon da Fonseca, Daniel Schertzer, Igor da Silva Rocha Paz, and Ioulia Tchiguirinskaia
Multifractals provide a powerful framework to describe systems that exhibit variability over a wide range of scales together with strong intermittency. By encoding scale-dependent fluctuations through multiplicative cascades, multifractal models capture non-Gaussian statistics, heavy tails, and scale invariance in a compact and predictive manner. These properties have made multifractals particularly successful in the analysis of a wide variety of geophysical phenomena.
 
From the outset, multifractal fields have been formulated on domains of arbitrary dimension, making it possible to represent space, space–time, or higher-dimensional parameter spaces. In contrast, the codomain of multifractal constructions has most often been restricted to scalar-valued fields. Although simpler for modeling and inference, the scalar setting omits directional information, anisotropy, and cross-component couplings that are essential in vector observations. Recent works, such as Schertzer and Tchiguirinskaia (2020), have explored the use of Clifford algebras for constructing cascade generators, offering a natural algebraic framework to represent vector-valued multifractals while preserving their multiscale and symmetry properties.
 
In this work, we consider and simulate Clifford multifractal cascades as an extension of scalar models, capable of capturing directional variability and the internal geometry of multiscale fields. Rather than relying on a scalar stability exponent, we work in a framework where the stability can be encoded by algebra-valued or operator-like parameters, enabling anisotropic scaling and nontrivial coupling between different components of the Clifford field across scales.
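As a point of reference for the scalar baseline mentioned above, universal multifractals characterize the cascade through a moment-scaling function K(q) governed by the codimension of the mean C₁ and the Lévy stability index α; the Clifford construction can be read as promoting this scalar stability parameter to an algebra-valued one. The standard scalar relations are sketched below (not the paper's generalized form).

```latex
% Scalar universal multifractal moment scaling (Schertzer & Lovejoy):
\begin{equation}
  \big\langle \varepsilon_{\lambda}^{\,q} \big\rangle \sim \lambda^{K(q)},
  \qquad
  K(q) = \frac{C_{1}}{\alpha - 1}\left(q^{\alpha} - q\right),
  \qquad 0 \le \alpha \le 2,\ \alpha \neq 1,
\end{equation}
% \lambda: scale ratio; C_1: codimension of the mean; \alpha: Lévy
% stability index, generalized in the vector-valued (Clifford) setting.
```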
 
To characterize the resulting operator-scaling structure, we extended the scalar analysis methods and developed inference methods that enable the direct estimation of multifractal parameters. Numerical experiments on synthetic cascades demonstrate that the proposed approach reliably recovers these parameters. The results demonstrate that extending multifractal analysis to vector-valued fields is both feasible and essential for the characterization of complex multiscale phenomena.

How to cite: Thurler Rondon da Fonseca, J. F., Schertzer, D., da Silva Rocha Paz, I., and Tchiguirinskaia, I.: Analysis of Vector-Field Multifractal Cascades, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-23043, https://doi.org/10.5194/egusphere-egu26-23043, 2026.

ESSI1 – Next-Generation Analytics for Scientific Discovery: Data Science, Machine Learning, AI

Data-driven global weather models, such as GraphCast, have revolutionized medium-range forecasting but often exhibit systematic limitations in quantitative precipitation forecasting (QPF). Specifically, these models tend to produce over-smoothed, blurry rainfall fields and underestimate localized extremes, primarily due to the inherent uncertainties in their reanalysis training data (e.g., ERA5) and the use of mean-squared-error-based loss functions.

To bridge the gap between coarse-resolution global AI forecasts and the need for precise, high-impact weather prediction, we introduce SynQPF-Net, a deep learning framework designed to synergize GraphCast’s dynamical background fields with high-resolution observational analyses. The model employs a dual-stream spatiotemporal encoder to process heterogeneous inputs: the 0.25° dynamical forecasts from GraphCast and the 0.0625° precipitation analyses from the China Meteorological Administration Land Data Assimilation System (CLDAS). A specialized hybrid loss function, combining classification (Dice) and regression (weighted MSE) objectives, is utilized to jointly optimize the spatial structure and intensity of precipitation.
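Such a hybrid objective can be sketched as follows (illustrative PyTorch only; the function name, soft rain mask, threshold, and weighting scheme are our assumptions, not the authors' implementation):

    import torch

    def hybrid_qpf_loss(pred, target, rain_thresh=0.1, alpha=0.5, eps=1e-6):
        """Hypothetical hybrid loss; pred/target are (batch, H, W) rain fields in mm."""
        # Dice term on a (soft) rain/no-rain classification.
        p = torch.sigmoid((pred - rain_thresh) / 0.05)   # soft "raining" probability
        t = (target > rain_thresh).float()
        dice = 1.0 - (2.0 * (p * t).sum() + eps) / (p.sum() + t.sum() + eps)
        # MSE weighted by observed intensity so heavy rain dominates the regression term.
        w = 1.0 + target
        wmse = (w * (pred - target) ** 2).mean()
        return alpha * dice + (1.0 - alpha) * wmse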

Evaluated on warm-season events in Southern China, our approach demonstrates significant skill improvements. SynQPF-Net effectively sharpens the forecast, doubling the Critical Success Index (CSI) for heavy rainfall (≥10 mm) at the 6-hour lead time compared to the raw GraphCast output. Crucially, interpretability analysis reveals that the model learns physically consistent meteorological principles: it predominantly relies on extrapolating recent observational patterns for short lead times (≤12 h) and dynamically shifts its focus to large-scale circulation and moisture variables (e.g., 700 hPa specific humidity) as the forecast horizon extends. This work provides a validated pathway for correcting and downscaling global AI weather models, offering a robust solution for short-range extreme precipitation forecasting.
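For reference, the CSI reported here is the standard contingency-table score; a minimal computation (threshold and accumulation window assumed for illustration) is:

    import numpy as np

    def critical_success_index(forecast, observed, thresh=10.0):
        # Hits, misses, and false alarms for exceedances of `thresh` (e.g. 10 mm / 6 h).
        f, o = forecast >= thresh, observed >= thresh
        hits = np.sum(f & o)
        misses = np.sum(~f & o)
        false_alarms = np.sum(f & ~o)
        return hits / (hits + misses + false_alarms + 1e-12)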

How to cite: Chen, D.: Bridging Global AI Models and Local Extremes: A Dual-Stream Framework for Correcting and Downscaling GraphCast Rainfall Predictions, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-1413, https://doi.org/10.5194/egusphere-egu26-1413, 2026.

AI-driven climate models are often criticized as “black boxes,” raising concerns about their credibility for scientific and policy-relevant decision making. Explainable artificial intelligence (XAI) is frequently proposed as a solution, focusing on identifying systematic relationships between model input and output data to characterize model behavior. This paper builds on prior work arguing that trust in both dynamical and AI models depends not on such input-output characterizations alone but on scientists’ component-level understanding of their models (O’Loughlin et al. 2025). Component-level understanding refers to scientists’ ability to point to specific model components or parts in the model architecture as the culprit for erratic model behaviors or as the crucial reason why the model functions well.

We argue that component-level understanding plays a distinctive role in establishing credibility because it expands scientists’ ability to answer a wider range of what-if-things-had-been-different questions. For example, when a model exhibits unexpected sensitivity or instability, component-level understanding enables scientists to ask (and design targeted tests to determine) whether the behavior would persist if a specific parameterization, architectural module, or physically informed constraint were altered. We see examples of this in CMIP, e.g., diagnosing the effect of a cloud microphysics scheme on a model’s climate sensitivity (Gettelman et al. 2019; Zelinka et al. 2020) and in AI-driven climate science as well, e.g., attributing model instability to particular architectural choices such as unconstrained neural network layers or inappropriate spectral representations (e.g., Beucler et al., 2021; Bonev et al., 2023). By linking model behavior to specific components or architectural features, scientists are better positioned to diagnose misbehavior, explore counterfactual scenarios, and explain why a model behaves as it does under varying conditions. This explanatory capacity enables scientists to establish credibility with decision-makers by demonstrating when, why, and under what conditions AI-driven climate models can be trusted.

Such explanations will inevitably be incomplete and context-dependent, particularly in complex models whose components interact in nonlinear ways and are often intended to represent emergent climate phenomena. Nevertheless, we argue that credibility is built through explanatory practices involving model successes and failures alike. We conclude by outlining several pathways for strengthening component-level understanding in AI-driven climate science: scientists may develop such understanding themselves; work in close collaboration with AI model builders and domain experts; design model intercomparison projects that explicitly support component-level diagnosis; or adopt evaluation and benchmarking practices that prioritize explanatory and counterfactual insight alongside predictive performance. On this view, establishing credibility requires organizing scientific work so that explanation remains a central and achievable activity.

References

Bonev, B., et al.: Spherical Fourier Neural Operators…, arXiv [preprint], https://doi.org/10.48550/arXiv.2306.03838, 2023.

Beucler, T., et al.: Enforcing analytic constraints in neural networks…, Phys. Rev. Lett., 126(9), 098302, 2021.

Gettelman, A., et al.: High Climate Sensitivity in the Community Earth System Model Version 2 (CESM2), Geophys. Res. Lett., 46, 8329–8337, https://doi.org/10.1029/2019GL083978, 2019.

O'Loughlin, R. J., et al.: Moving beyond post hoc explainable artificial intelligence…, Geosci. Model Dev., 18, https://doi.org/10.5194/gmd-18-787-2025, 2025.

Zelinka, M. D., et al.: Causes of Higher Climate Sensitivity in CMIP6 Models, Geophys. Res. Lett., 47, e2019GL085782, https://doi.org/10.1029/2019GL085782, 2020.

How to cite: O'Loughlin, R.: Earning Credibility in AI-Driven Climate Science: The Role of Component-Level Understanding, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-3879, https://doi.org/10.5194/egusphere-egu26-3879, 2026.

EGU26-4637 | Posters on site | ESSI1.1

GenEPS: A Generative Foundation Model for Probabilistic Weather Forecasting  

Congyi Nai, Xi Chen, Shangshang Yang, Ziniu Xiao, and Baoxiang Pan

Accurate weather forecasting is essential for a broad range of socioeconomic activities. While emerging data-driven models match numerical weather prediction accuracy with reduced computational cost, their deterministic nature overlooks uncertainties in initial state estimates, model systematic biases, and stochasticity arising from unresolved subgrid physical processes. Neglecting these uncertainties results in over-confident deterministic predictions that render uncertainty quantification inaccessible, thereby limiting their utility for risk-based decision-making.

To address these challenges, we present the Generative Ensemble Prediction System (GenEPS), a framework that systematically explores uncertainties in initial states, model formulations, and model stochasticity. GenEPS functions as a foundation model that has explicitly learned the probability distribution of high-dimensional atmospheric states. It provides a plug-and-play solution for ensemble forecasting with arbitrary deterministic models. Specifically, GenEPS utilizes deterministic forecasts as conditions to perform generative sampling, producing an ensemble of states projected back into the realistic atmospheric phase space defined by ERA5. This stochastic sampling process quantifies uncertainties in initial conditions and forecast dynamics while ensuring physical consistency. Crucially, by treating each step as a re-initialization within the valid state space, the framework decouples state evolution from specific model formulations, enabling seamless cross-model integration to mitigate systematic biases.

By explicitly representing all three sources of uncertainty, GenEPS outperforms state-of-the-art numerical ensemble predictions and data-driven predictions when evaluated against ERA5 reanalysis data using both deterministic and probabilistic metrics. GenEPS also enhances extreme event predictions, offering physically consistent forecast fields. These advances establish a new paradigm in ensemble forecasting through multi-model generative integration, combining a growing number of data-driven weather forecasting models, and potentially numerical models, to achieve more reliable predictions.

How to cite: Nai, C., Chen, X., Yang, S., Xiao, Z., and Pan, B.: GenEPS: A Generative Foundation Model for Probabilistic Weather Forecasting , EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-4637, https://doi.org/10.5194/egusphere-egu26-4637, 2026.

EGU26-4957 | Posters on site | ESSI1.1

A convolutional network learns about the North Atlantic storm track to predict heavy rainfall in Western Norway 

Robin Guillaume-Castel, Stefan Sobolowski, and Camille Li

Neural networks are powerful and widely used tools in weather and climate sciences, but their reliability under climate change remains uncertain as future conditions may be different from their training distribution. One way to build trust in these models is to assess whether they learn physically meaningful relationships rather than spurious correlations. Here, we present a case study investigating whether a simple convolutional neural network (CNN) predicts the occurrence of heavy rainfall in Western Norway for physically interpretable reasons. Since such rainfall is primarily associated with North Atlantic cyclones, we use explainable AI to assess whether the CNN identifies and uses the “correct” cyclones for its predictions.

Using ERA5 reanalysis data, we train a CNN to predict the occurrence of daily heavy rainfall events up to six days ahead from gridded wind and pressure fields. We apply layer-wise relevance propagation (LRP) to identify which regions of the atmospheric input fields contribute most to the model’s predictions. We find that model relevance is spatially aggregated into a small number of coherent patches, with one to three positive relevance patches dominating the prediction in more than 90% of the cases. Physical consistency is assessed by comparing the relevance patterns to objectively tracked cyclones. Interpreting cyclones as being “used” by the network when they spatially overlap with a patch, we show that cyclones contribute positively to the network’s predictions in about 95% of heavy rainfall events. In addition, we show that cyclones highlighted by the network are physically plausible; their trajectories follow the North Atlantic storm track, shifting from the western and central North Atlantic towards the eastern Atlantic and the Norwegian coast as the prediction lead time decreases. These results demonstrate that the CNN learns physically interpretable large-scale dynamics associated with North Atlantic cyclones, providing evidence that explainable AI methods can be used to assess and build trust in machine learning models for weather and climate applications.
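The patch/cyclone matching step can be pictured as follows (our reconstruction for illustration, not the authors' code; the relevance threshold is an assumption):

    import numpy as np
    from scipy import ndimage

    def cyclone_used(relevance, cyclone_mask, rel_quantile=0.95):
        """relevance: 2-D LRP map; cyclone_mask: boolean cyclone footprint."""
        positive = np.clip(relevance, 0.0, None)
        patches, n = ndimage.label(positive > np.quantile(positive, rel_quantile))
        # A cyclone counts as "used" if its footprint overlaps any positive relevance patch.
        return any(np.any((patches == k) & cyclone_mask) for k in range(1, n + 1))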

How to cite: Guillaume-Castel, R., Sobolowski, S., and Li, C.: A convolutional network learns about the North Atlantic storm track to predict heavy rainfall in Western Norway, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-4957, https://doi.org/10.5194/egusphere-egu26-4957, 2026.

EGU26-5252 | Orals | ESSI1.1

Retrospective reconstruction of 40 years of atmospheric fields in Northern Europe using the WeatherGenerator foundation model 

Cristian Lussana, Rolf Heilemann Myhre, Amélie Neuville, Even Marius Nordhagen, Ivar Ambjørn Seierstad, and Thomas Nils Nipen

Within the WeatherGenerator project, the Norwegian Meteorological Institute (MET Norway) is applying the project’s foundation model to reconstruct several decades (ideally the most recent 40 years) of atmospheric fields over Scandinavia. The primary objective of this work is to assess the potential of the WeatherGenerator framework for climate monitoring applications.

WeatherGenerator is a pan-European initiative that combines state-of-the-art machine-learning architectures with high-performance computing to develop an open, kilometer-scale foundation model of the coupled Earth system. The project is organized into four thematic areas; the application presented here is one of twenty-two applications developed by project partners within Theme 3.

The reconstructed datasets include near-surface atmospheric variables as well as variables at multiple pressure levels. The approach integrates heterogeneous data sources (ranging from in situ observations and reanalysis products to numerical model output), leveraging the foundation model to generate consistent, high-resolution fields suitable for climate and weather monitoring. The target spatial resolution is 1 km, achieved through data fusion techniques and a task-specific tail network trained to produce gridded analyses at this scale. Multiple temporal resolutions are explored, including hourly data and daily to monthly aggregations.

This contribution represents MET Norway’s first presentation of WeatherGenerator-related results at a scientific conference. The focus is therefore on preliminary results, outlining the overall methodological framework and demonstrating the potential of these novel approaches for high-resolution climate monitoring.

Note: The WeatherGenerator project (grant agreement No101187947) is funded by the European Union. Views and opinions expressed are however those of the authors only and do not necessarily reflect those of the European Union or the Commission. Neither the European Union nor the granting authority can be held responsible for them.

How to cite: Lussana, C., Heilemann Myhre, R., Neuville, A., Nordhagen, E. M., Seierstad, I. A., and Nipen, T. N.: Retrospective reconstruction of 40 years of atmospheric fields in Northern Europe using the WeatherGenerator foundation model, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-5252, https://doi.org/10.5194/egusphere-egu26-5252, 2026.

EGU26-6319 | Posters on site | ESSI1.1

Topology-preserving Feature-Space Analysis for Diagnostic Comparison of Weather Forecasting Models 

Hyoungnyoun Kim and Jeong Hoon Cho

This study proposes a novel diagnostic framework for systematically and intuitively evaluating the performance of medium-range weather forecasting models from a multivariate perspective. Traditional evaluation methods have primarily relied on point-wise error metrics (e.g., RMSE) for single variables at specific altitudes, which limits the analysis of inter-variable correlations and the dynamic evolution of forecast structures over lead times. To address these limitations, we present a methodology that integrates multivariate data into a shared, topology-preserving feature space, enabling the comparison and diagnosis of model-specific prediction trajectories.

The framework first represents multivariate atmospheric variables as images to extract semantic feature representations. To ensure robustness against spatial shifts and noise, we employ contrastive learning with data augmentation, effectively capturing the core physical characteristics of the atmospheric state. Subsequently, we apply a parametric manifold embedding specifically designed to preserve both the local neighborhood relationships of the high-dimensional feature space and its temporal continuity. This approach allows for a coherent and aligned comparison of prediction trajectories from diverse forecasting models within a unified coordinate system.
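A standard contrastive objective of this family is the NT-Xent (SimCLR-style) loss; a minimal sketch is given below, noting that the authors' exact loss and augmentations are not specified in the abstract:

    import torch
    import torch.nn.functional as F

    def nt_xent(z1, z2, temperature=0.5):
        """z1, z2: (batch, d) embeddings of two augmented views of the same fields."""
        z = F.normalize(torch.cat([z1, z2]), dim=1)   # (2B, d)
        sim = z @ z.T / temperature
        sim.fill_diagonal_(float('-inf'))             # exclude self-similarity
        B = z1.shape[0]
        # Positive pair of sample i is its other augmented view.
        targets = torch.cat([torch.arange(B, 2 * B), torch.arange(0, B)])
        return F.cross_entropy(sim, targets)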

For the experimental setup, the feature space was defined using ERA5 reanalysis data from 2020 to 2024, with the 2025 ECMWF analysis serving as the reference ground truth. We analyzed a total of nine forecast configurations, combining three AI-based models (FourCastNet, GraphCast, and Pangu-Weather) with three operational numerical weather prediction initializations (IFS, KIM, and UM). By tracking trajectories at 6-hour intervals for up to 48 lead times, we visually analyzed model-specific dispersion and bias characteristics. Furthermore, the diagnostic validity of the framework was verified by comparing trajectory evolutions across different pressure levels and analyzing structural changes induced by varying variable compositions.

The proposed framework supplements conventional univariate and direction-agnostic metrics by enabling structure-aware, directional diagnostics in a multivariate feature space. It provides deep analytical insights into model-specific behaviors, serving as a critical diagnostic tool for future research on atmospheric pattern analysis and inter-variable correlation structures.

How to cite: Kim, H. and Cho, J. H.: Topology-preserving Feature-Space Analysis for Diagnostic Comparison of Weather Forecasting Models, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-6319, https://doi.org/10.5194/egusphere-egu26-6319, 2026.

EGU26-6734 | Orals | ESSI1.1

A data-driven approach for classifying GCM errors and understanding their impact on climate projections. 

Tamzin Palmer, David Sexton, Anna-Louise Ellis, Douglas McNeall, and Georgie Mercer

Evaluating regional climate model performance often relies on simple error metrics that assume the largest mean errors matter most, sometimes combined with visual inspection of the large-scale atmospheric circulation fields, which introduces a priori assumptions. However, the influence of spatial error patterns in the large-scale circulation on other surface variables, such as precipitation, is complex, and their link to future projections remains uncertain.

We present a novel computer vision–based framework for regional climate model evaluation that emphasises interpretability and explainability. Rather than treating machine learning as a black box, our approach learns structural characteristics of seasonal mean sea-level pressure fields directly from CMIP6 model data. Using a convolutional variational autoencoder (CNN‑VAE), we construct an interpretable latent space in which large-scale atmospheric patterns cluster according to shared spatial structure.

These clusters enable the identification of systematic differences in how regional climate models represent key large-scale drivers of European climate. Deviations from reanalysis are quantified using simple distance metrics in latent space, allowing the magnitude of structural model errors to be directly compared in a physically meaningful way and without reliance on subjective visual assessment or pointwise error measures.
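In its simplest form, this latent-space comparison amounts to the following (a sketch assuming a trained encoder that returns latent means; the interface is hypothetical):

    import torch

    def latent_distance(encoder, model_field, reanalysis_field):
        """Fields: (1, 1, H, W) seasonal-mean MSLP tensors; encoder -> (1, latent_dim)."""
        with torch.no_grad():
            z_model = encoder(model_field)
            z_ref = encoder(reanalysis_field)
        # Euclidean distance in latent space quantifies the structural deviation.
        return torch.linalg.vector_norm(z_model - z_ref).item()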

We further compare end‑of‑century precipitation and temperature projections from models occupying different regions of latent space to assess how distinct large‑scale circulation errors impact future surface climate projections. We present this methodology as a complementary approach to traditional error metrics that can be applied flexibly to any variable of interest in model evaluation.

By linking learned representations to physically interpretable circulation structures, this framework supports more trustworthy use of machine learning in climate model evaluation and provides new insights into how spatial error patterns may influence downstream variables and projections.

How to cite: Palmer, T., Sexton, D., Ellis, A.-L., McNeall, D., and Mercer, G.: A data-driven approach for classifying GCM errors and understanding their impact on climate projections., EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-6734, https://doi.org/10.5194/egusphere-egu26-6734, 2026.

EGU26-7283 | Posters on site | ESSI1.1

Integrating 'Trustworthy AI' Principles into Machine Learning for Aviation Weather Forecasting 

Thomas Chitson, James Fallon, Piers Buchanan, and James Shapland

Weather forecasting for aviation allows tens of thousands of flights to operate daily with prior warning of global hazards including in-flight icing, turbulence, convection, and fog. Machine learning (ML) methods have begun to be utilised across the aviation weather forecasting sector and can provide greater skill, lower false alarm rates, and cheaper running costs than conventional equivalent products. Often these products are built on existing numerical weather prediction techniques, but can also be standalone products that make predictions only based on observations. Aviation is a highly regulated and safety-critical industry, so weather forecasting products must meet stringent quality-control standards, and machine learning processes must be trusted by customers.

The Aviation Applications Team at the Met Office has developed a set of 'Trustworthy AI' principles that ML products must strive to adhere to. These principles have guided the recent development of a range of ML driven weather forecasting solutions for aviation including convective cloud detection at UK airfields, auto-TAF (Terminal Aerodrome Forecast) verification, and global convective forecasting capability. In each of these use cases the aviation sector end-users have been considered to ensure the products are trustworthy and explainable.

This study showcases a range of aviation weather forecasting case studies and how they have utilised trustworthy AI techniques, including explainable AI (XAI) and representative AI, and considers how existing 'research to operations' pipelines can be exploited to add trust to machine learning models. The research group has worked with the UK's aviation regulator, the Civil Aviation Authority, to consider what the industry requires to be able to use machine learning safely in UK aviation operations and what can be learned from the long-standing collaboration between the two organisations in developing trusted weather forecasting products.

Future challenges in operationalising ML-driven weather forecasting products in the aviation sector include: sparsity of observations for some hazards, shifting baselines for long-term deployment of products, and regulatory hurdles for the approval of AI products.

How to cite: Chitson, T., Fallon, J., Buchanan, P., and Shapland, J.: Integrating 'Trustworthy AI' Principles into Machine Learning for Aviation Weather Forecasting, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-7283, https://doi.org/10.5194/egusphere-egu26-7283, 2026.

EGU26-7781 | ECS | Orals | ESSI1.1

Guiding the Forecast: Interpretability and AI Steering in Climate Science 

Philine Lou Bommer, Marlene Kretschmer, Anna Hedstroem, Fanny Lehmann, and Marina M.-C. Hoehne

While AI-based weather foundation models have revolutionized predictive capabilities, their opaque nature and susceptibility to training data biases pose significant challenges for operational trust. A prominent example is Aurora, a state-of-the-art foundation model that demonstrates exceptional hurricane tracking accuracy but consistently underestimates cyclone wind speeds. Because this bias is inherited from the underlying reanalysis data, standard retraining often fails to alleviate the systematic error.

In this work, we propose a novel paradigm for bias correction by adapting AI Steering, a technique recently established for monitoring and adjusting Large Language Model (LLM) behavior, to the domain of climate science. Rather than relying on traditional post-processing or computationally expensive retraining, steering allows us to interrogate and shift the internal neural representations of Aurora without modifying the underlying weights. By identifying the latent features associated with wind speed intensity, we can shift the model’s internal state to align more closely with high-resolution observations.

To evaluate this approach, we run forecasts initialized with IFS-HRES conditions and validate our results against IBTrACS observations. Our results demonstrate that this interpretability-driven approach reduces systematic biases, significantly lowering wind speed errors while preserving model integrity and maintaining Aurora’s high-fidelity track accuracy. Furthermore, we show that steering enables a form of "Human-in-the-Loop" oversight, providing a transparent mechanism for meteorologists to adjust model outputs based on physical constraints and domain expertise. By bridging the gap between LLM interpretability and AI-based weather forecasting, we highlight the potential of steering to improve operational forecasts and offer a scalable, transparency-first framework for diagnosing and mitigating failure modes in complex AI-based climate and weather models.

How to cite: Bommer, P. L., Kretschmer, M., Hedstroem, A., Lehmann, F., and Hoehne, M. M.-C.: Guiding the Forecast: Interpretability and AI Steering in Climate Science, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-7781, https://doi.org/10.5194/egusphere-egu26-7781, 2026.

EGU26-10318 | ECS | Posters on site | ESSI1.1

Data-Driven Weather Scenario Generation for Long-Term Energy System Planning in the Nordic Region 

Even Nordhagen, Jesica Pinon Rodriguez, Kjetil Thøgersen, Fabio Zeiser, Erik Tjøtta, and Gaute Lappegard

Long-term energy system planning requires realistic weather scenarios that capture both short-term variability and long-term climate statistics, as well as rare but high-impact events. By preserving spatial and inter-variable correlations, we ensure robust multi-year energy market modelling in systems with large storage capacities, such as the Nordic power market. 

Current weather scenarios are based on ERA5 (Hersbach et al., 2020), where a period of 20 years (2003-2022) is used to establish synthetic weather scenarios (Martino et al., 2017). These scenarios consist of real weather, stitched together from 10-day segments drawn from the 20-year sample. Several statistical techniques, including quantile mapping, are applied during this process. However, this pipeline can introduce unphysical results and is both complex and time-consuming. In contrast, data-driven models offer a cost-effective solution for generating long-term forecasts efficiently.
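For context, empirical quantile mapping, one of the statistical steps mentioned, can be sketched as follows (illustrative only; the operational pipeline is more involved):

    import numpy as np

    def quantile_map(sample, source_climatology, target_climatology):
        # Map values through the source CDF onto the target distribution.
        q = np.linspace(0.0, 1.0, 101)
        src_q = np.quantile(source_climatology, q)
        tgt_q = np.quantile(target_climatology, q)
        cdf = np.interp(sample, src_q, q)
        return np.interp(cdf, q, tgt_q)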

In this study, the WeatherGenerator is employed to generate year-long independent weather scenarios by running the model under varying initial conditions. The analysis focuses on the Nordic region, where we evaluate the capability of the WeatherGenerator to reproduce long-term climate statistics for key variables. 

Its performance is benchmarked against weather scenarios produced by the current in-house methodology and, potentially, against alternative data-driven models such as AIFS or Bris.

Note: The WeatherGenerator project (grant agreement No101187947) is funded by the European Union. Views and opinions expressed are however those of the authors only and do not necessarily reflect those of the European Union or the Commission. Neither the European Union nor the granting authority can be held responsible for them.

 

(Hersbach et al., 2020) Hersbach, H., Bell, B., Berrisford, P., et al.: The ERA5 global reanalysis, Q. J. Roy. Meteorol. Soc., 146(730), 1999–2049, 2020.

(Martino et al., 2017) Martino, S., Nipen, T. N., Lussana, C., and Kolberg, S.: A stochastic weather generator based on resampling historical ensemble weather forecasts and its application to hydrological simulation, SINTEF Energi AS, ISSN 1504-9795, 2017.

 

How to cite: Nordhagen, E., Pinon Rodriguez, J., Thøgersen, K., Zeiser, F., Tjøtta, E., and Lappegard, G.: Data-Driven Weather Scenario Generation for Long-Term Energy System Planning in the Nordic Region, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-10318, https://doi.org/10.5194/egusphere-egu26-10318, 2026.

EGU26-11888 | ECS | Posters on site | ESSI1.1

Tropospheric Transport Simulation with the WeatherGenerator Prototype Model 

Belkis Asma Semcheddine, Savvas Melidonis, Martin G. Schultz, and Christian Lessig

Inadequate air quality is still a major cause of illness and premature death, and accurate air pollution analyses and predictions are needed to enhance human and animal well-being, protect natural and agricultural vegetation, and reduce climate change impacts. In a collaborative effort led by ECMWF and funded by the European Union, the WeatherGenerator, a foundation model for Earth system prediction, has been under development for nearly one year. In this study, we train and evaluate an early prototype of the WeatherGenerator model as a first step towards assessing how foundation models can represent chemical transport and regime behavior over extended forecast horizons. While some machine learning models, like Aurora [1], have demonstrated skillful 3-5 day predictions, air quality services require extended predictability windows (10-30 days) for strategic early warning systems and emission mitigation planning, especially for longer-lived species such as CO and background/regional O₃. WeatherGenerator's multi-resolution infrastructure simultaneously ingests datasets at different spatial and temporal resolutions without regridding, enabling integration of CAMS reanalysis chemistry data (0.75°, 3-hourly) with ERA5 meteorological information at synoptic scales (1°, 6-hourly). The model was trained from scratch on 2003-2021 observational data to forecast four reactive chemical species (O₃, CO, NO, NO₂) and three particulate matter size fractions (PM1, PM2.5, PM10). The training was conducted in two stages: pre-training the model on 2-step autoregressive rollouts, followed by fine-tuning on 8-step rollouts. Predictions span the entire tropospheric column including surface-level concentrations and 13 vertical levels from the boundary layer (1000 hPa) to the upper troposphere and tropopause region (50 hPa). We evaluate this proof-of-concept using 30-day autoregressive forecasts initialized from June 1, 2022. The trained model demonstrated stable 30-day continuous prediction of all species across all vertical levels, with 5-day forecast skill comparable to CAMS (at 0.75° resolution). Extended evaluation over June-November 2022 is currently underway to enable direct benchmark comparison. Notably, WeatherGenerator's training and fine-tuning required only 127 hours on 8 NVIDIA A100 GPUs. Ongoing work includes: (1) expanding training data to incorporate CAMS operational analyses at higher spatial resolution, (2) hyperparameter optimization, and (3) quantitative comparison to existing air quality forecast models to contextualize skill relative to operational CAMS and other baseline systems.

[1] Bodnar, C., Bruinsma, W. P., Lucic, A., Stanley, M., Allen, A., Brandstetter, J., Garvan, P., Riechert, M., Weyn, J. A., Dong, H., and Gupta, J. K.: A foundation model for the Earth system, Nature, 2025.

How to cite: Semcheddine, B. A., Melidonis, S., Schultz, M. G., and Lessig, C.: Tropospheric Transport Simulation with the WeatherGenerator Prototype Model, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-11888, https://doi.org/10.5194/egusphere-egu26-11888, 2026.

EGU26-13486 | ECS | Orals | ESSI1.1

Learning representations from different pre-training strategies in the WeatherGenerator  

Sebastian Hickman, Sophie Xhonneux, Ilaria Luise, Julian Kuehnert, Matthias Karlbauer, Kerem Tezcan, Yura Perugachi Diaz, Timothee Hunter, and Christian Lessig

In general, pre-training of large machine learning models uses self-supervised learning to generate expressive latent representations. These can then be used for downstream applications with little to no fine-tuning. The WeatherGenerator project follows this paradigm and aims to train a foundation model on a large number of weather and climate datasets to learn general and useful representations that may be used for a variety of downstream tasks, such as forecasting, downscaling, or data assimilation. A wide variety of self-supervised tasks and training paradigms exist in other domains, such as computer vision, where they provide impressive performance. However, the extent to which these strategies transfer to atmospheric dynamics, and to the physical sciences in general, has not been widely explored beyond a few notable cases (Lessig et al., 2023; Parker et al., 2025).

We explore how different pre-training approaches, including masked token modelling and student-teacher methods (Caron et al., 2021; Zhou et al., 2022; Assran et al., 2023), can be adapted to learn representations for atmospheric dynamics using reanalysis, forecast, and observation datasets. We then show how linear probing and small non-linear decoders can be used to evaluate the quality of the representations learned by different pre-training strategies. The relationship between the pre-training task and the quality of the representations learned for different downstream tasks is explored. Finally, we illustrate the importance of including varied and representative datasets during pre-training and compare this effect to that of the specific pre-training method used.
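Linear probing itself is a small computation on top of frozen embeddings; a minimal sketch (shapes and probe choice assumed for illustration) is:

    import numpy as np
    from sklearn.linear_model import Ridge
    from sklearn.metrics import r2_score

    def linear_probe_score(train_emb, train_y, test_emb, test_y):
        # emb arrays: (n_samples, dim) embeddings from the frozen pre-trained model.
        probe = Ridge(alpha=1.0).fit(train_emb, train_y)
        return r2_score(test_y, probe.predict(test_emb))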

Parker, L., Lanusse, F., Shen, J., Liu, O., Hehir, T., Sarra, L., Meyer, L., Bowles, M., Wagner-Carena, S., Qu, H. and Golkar, S., 2025. AION-1: Omnimodal Foundation Model for Astronomical Sciences. arXiv preprint arXiv:2510.17960. 

Lessig, C., Luise, I., Gong, B., Langguth, M., Stadtler, S. and Schultz, M., 2023. AtmoRep: A stochastic model of atmosphere dynamics using large scale representation learning. arXiv preprint arXiv:2308.13280. 

Caron, M., Touvron, H., Misra, I., Jégou, H., Mairal, J., Bojanowski, P., Joulin, A., 2021. Emerging Properties in Self-Supervised Vision Transformers. https://doi.org/10.48550/arXiv.2104.14294

Assran, M., Duval, Q., Misra, I., Bojanowski, P., Vincent, P., Rabbat, M., LeCun, Y., Ballas, N., 2023. Self-Supervised Learning from Images with a Joint-Embedding Predictive Architecture. https://doi.org/10.48550/arXiv.2301.08243

Zhou, J., Wei, C., Wang, H., Shen, W., Xie, C., Yuille, A., Kong, T., 2022. iBOT: Image BERT Pre-Training with Online Tokenizer. https://doi.org/10.48550/arXiv.2111.07832 

How to cite: Hickman, S., Xhonneux, S., Luise, I., Kuehnert, J., Karlbauer, M., Tezcan, K., Perugachi Diaz, Y., Hunter, T., and Lessig, C.: Learning representations from different pre-training strategies in the WeatherGenerator , EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-13486, https://doi.org/10.5194/egusphere-egu26-13486, 2026.

Foundation models have shown strong potential for data-driven weather and climate forecasting by supporting multiple tasks with limited task-specific engineering. The use of architectures that extract maximum value from large, heterogeneous datasets is important in this approach. WeatherGenerator follows this paradigm by learning from diverse observational and reanalysis sources to encode a latent representation of atmospheric dynamics. In this work, we examined the impact of integrating Mixture of Experts (MoE) layers, developed for large language models, into WeatherGenerator, and assessed how MoE can best be incorporated within its decoder architecture.

The motivation behind MoE is straightforward: during training, a router learns to assign tokens to specialized experts, allowing different parts of the decoder to focus on distinct spatial regions or physical regimes. We build on this idea by introducing spatially aware routing, in which geographic context is provided to the router, and by evaluating loss-aware routing strategies that favor experts by minimizing local prediction errors.
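A spatially aware top-k router of this kind might look as follows (our illustration of the idea, not the WeatherGenerator code; the geographic encoding is an assumption):

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class SpatialRouter(nn.Module):
        def __init__(self, d_model, n_experts, d_geo=4, k=2):
            super().__init__()
            self.k = k
            self.gate = nn.Linear(d_model + d_geo, n_experts)

        def forward(self, tokens, geo):
            # tokens: (n_tokens, d_model); geo: (n_tokens, d_geo),
            # e.g. [sin(lat), cos(lat), sin(lon), cos(lon)].
            logits = self.gate(torch.cat([tokens, geo], dim=-1))
            weights, experts = torch.topk(F.softmax(logits, dim=-1), self.k, dim=-1)
            weights = weights / weights.sum(dim=-1, keepdim=True)  # renormalise top-k
            return weights, experts  # combine the selected experts' outputs with these weights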

We evaluate four MoE decoder configurations, based on the use of spatial context and loss-aware routing, and compare them against the baseline model. Experiments are conducted using ERA5 reanalysis data, with performance measured using global RMSE and MAE for wind components (u, v), temperature (2t, t850), geopotential height (z500), and specific humidity over three 6-hour autoregressive forecast steps.

Across experiments, MoE architectures consistently improve performance for thermodynamic and large-scale variables. In particular, z500 RMSE is reduced by 26–31% at the first forecast step, with spatially aware routing performing best. Near-surface temperature shows a 7% improvement in RMSE and an 11% improvement in MAE when combining spatial and loss-aware routing. These improvements appear early in training, within the first few epochs, indicating efficient use of the available data. On the other hand, MoE variants show limited or slightly negative effects for the wind components at the second and final forecast steps, where the baseline performs similarly or, especially at the final step, better.

These preliminary results indicate that MoE provides variable-dependent benefits, with notable improvements for slowly varying, large-scale thermodynamic fields, but less impact on highly dynamic momentum variables. Ongoing work will further assess performance across longer forecast horizons, different climatic regions, and training with multiple datasets from different sources.

How to cite: Almikaeel, W. and the WeatherGenerator Team: Mixture of Experts with Spatial Routing in a Weather Foundation Model: Early Results from WeatherGenerator, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-14545, https://doi.org/10.5194/egusphere-egu26-14545, 2026.

EGU26-14846 | ECS | Posters on site | ESSI1.1

Subseasonal-to-Seasonal strategies for the Earth System Foundation Model (ESFM) 

Piotr Wilczyński, Fanny Lehmann, Firat Ozdemir, Salman Mohebi, Yun Cheng, Oliver Fuhrer, Siddhartha Mishra, Mathieu Salzmann, Benedikt Soja, Sebastian Schemm, and Torsten Hoefler

Foundation models for the Earth system have gained popularity, as they are starting to surpass numerical solvers in the accuracy of predicting the Earth's state while requiring fewer computational resources. The Earth System Foundation Model (ESFM) contributes to this research direction by further extending the flexibility of foundation models.

The forecasting capabilities of ESFM are achieved in an autoregressive manner, using data from the t0 - Δt and t0 timesteps to produce a prediction for t0 + Δt. This approach is effective on weather timescales. Moreover, we find that it also delivers encouraging results for long-term forecasts, showing reasonable zero-shot subseasonal-to-seasonal (S2S) predictions (15–40 days). 

However, S2S predictions can be further improved while preserving skill on weather timescales. This work investigates strategies for this purpose. On such timescales, it is crucial to produce probabilistic predictions to better represent inherent uncertainty. Probabilistic predictions are realised through the introduction of multiple decoder heads (tails) for each variable. Each tail is intended to simulate a different possible trajectory; combined, the tails provide an estimate of the most probable outcome together with the spread of feasible values. To better estimate the distribution of possible values on the S2S timescale, additional trajectories are generated by running multiple predictive rollouts with different initial conditions.
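Conceptually, the tails are independent read-outs of a shared latent state; a minimal sketch (layer sizes assumed, not the ESFM implementation) is:

    import torch
    import torch.nn as nn

    class MultiTailDecoder(nn.Module):
        def __init__(self, d_latent, d_out, n_tails=8):
            super().__init__()
            self.tails = nn.ModuleList([nn.Linear(d_latent, d_out) for _ in range(n_tails)])

        def forward(self, z):
            # Each tail simulates one trajectory; mean and spread summarise the ensemble.
            members = torch.stack([tail(z) for tail in self.tails])
            return members.mean(dim=0), members.std(dim=0), members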

Another strategy to improve S2S rollouts is to fine-tune the model to produce outputs for more distant steps. To this end, we leverage LoRA adapters (Hu et al., 2022), which are trained for each subsequent rollout step. This approach effectively improves predictive performance on long horizons, without significantly affecting training complexity or inference cost.
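A LoRA adapter in its standard form (Hu et al., 2022) is sketched below; in the setting described above, one such adapter would be trained per rollout step while the base weights stay frozen:

    import torch
    import torch.nn as nn

    class LoRALinear(nn.Module):
        def __init__(self, base: nn.Linear, rank=8, alpha=16.0):
            super().__init__()
            self.base = base
            for p in self.base.parameters():
                p.requires_grad = False          # the pre-trained weights stay frozen
            self.A = nn.Parameter(torch.randn(rank, base.in_features) * 0.01)
            self.B = nn.Parameter(torch.zeros(base.out_features, rank))  # zero init: a no-op at start
            self.scale = alpha / rank

        def forward(self, x):
            return self.base(x) + self.scale * (x @ self.A.T @ self.B.T)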

We also observe that some predictive variables of the model, such as climate forcings, are slowly evolving and can benefit from incorporating inputs from a more distant past than the t0 - Δt and t0 timesteps commonly used. To investigate this, we introduce an Attention Temporal Aggregator in the encoder, which leverages learned patch embeddings from an arbitrary number of previous timesteps and attends to those that are most informative for a given variable. In this way, for rapidly changing variables such as wind speed, the model focuses on the most recent data, whereas for slowly evolving variables such as sea surface temperature, it can utilise a broader range of inputs.
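One way to realise such an aggregator (our reading of the description; the interface is hypothetical) is cross-attention from the current embedding over the history:

    import torch
    import torch.nn as nn

    class AttentionTemporalAggregator(nn.Module):
        def __init__(self, d_model, n_heads=4):
            super().__init__()
            self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)

        def forward(self, current, history):
            # current: (batch, 1, d_model) embedding at t0;
            # history: (batch, n_steps, d_model) embeddings of earlier timesteps.
            aggregated, weights = self.attn(query=current, key=history, value=history)
            return aggregated.squeeze(1), weights  # weights reveal which past steps were used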

Overall, our experiments provide new insights into the development of foundation models for the Earth system, enabling improved predictions on S2S timescales, while conserving performance for weather forecasts.

References:
E. J. Hu, Y. Shen, P. Wallis, Z. Allen-Zhu, Y. Li, S. Wang, L. Wang, W. Chen, et al. Lora: Low-rank adaptation of large language models. ICLR, 1(2):3, 2022

How to cite: Wilczyński, P., Lehmann, F., Ozdemir, F., Mohebi, S., Cheng, Y., Fuhrer, O., Mishra, S., Salzmann, M., Soja, B., Schemm, S., and Hoefler, T.: Subseasonal-to-Seasonal strategies for the Earth System Foundation Model (ESFM), EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-14846, https://doi.org/10.5194/egusphere-egu26-14846, 2026.

Foundation models trained on text and images are known to develop abstract internal features that align with human concepts, and that can be directly manipulated via activation steering in order to alter model behaviour. Whether scientific foundation models learn similarly abstract and domain-general representations has remained an open question. Inspired by recent work identifying single directions in activation space which control complex behaviours in LLMs, we show that Walrus, a large physics foundation model, learns linearly steerable representations of physical phenomena. By computing the delta between activations representing contrasting physical regimes, we identify single directions in activation space that correspond to vorticity, diffusion, and even temporal progression. We find that injecting these concept directions back into the model during inference enables fine-grained causal control: vortices can be induced or removed, diffusion enhanced or suppressed, and simulations sped up or slowed down. Moreover, the concept directions we identified appear to transfer successfully between unrelated physical systems, indicating that they are domain-general. These results suggest that scientific foundation models indeed learn general representations of physical principles and provide further evidence for the Linear Representation Hypothesis.
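The core recipe is generic and compact; a sketch under assumed shapes (not the authors' code) is:

    import torch

    def concept_direction(acts_a, acts_b):
        # Mean activation difference between two contrasting physical regimes.
        delta = acts_a.mean(dim=0) - acts_b.mean(dim=0)
        return delta / delta.norm()

    def add_steering_hook(layer, direction, strength=1.0):
        # Shift the layer's activations along the concept direction at inference time.
        def hook(module, inputs, output):
            return output + strength * direction
        return layer.register_forward_hook(hook)  # keep the handle; call .remove() to stop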

How to cite: Fear, R.: Physics Steering: Causal Control of Cross-Domain Concepts in a Physics Foundation Model, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-15558, https://doi.org/10.5194/egusphere-egu26-15558, 2026.

Machine Learning (ML) models, particularly deep neural networks, are often seen as black boxes, offering limited insight into how their predictions are made. This lack of transparency becomes especially important when ML is applied to critical domains such as numerical weather prediction, where traditional models are based on physical laws and differential equations.

Explainable AI (XAI) methods aim to address the black-box behavior by providing tools to interpret and understand model decisions. One such method is Layer-Wise Relevance Propagation (LRP), which traces the output of a neural network backward to assign relevance scores to input features based on their contribution to the prediction.

LRP has since been extended to Graph Neural Networks (GNNs) through the introduction of relevant walks, enabling interpretability in graph-structured data (GNN-LRP). These extensions have shown promise in areas such as image classification, sentiment analysis, and quantum chemistry. At the German National Weather Service (DWD), the AICON forecasting model employs a GNN architecture with message passing, similar in design to the GraphCast model.

In this work, we present an initial exploration of applying GNN-LRP to a simplified toy version of a GNN model representative of the AICON model. We investigate both saliency-map-like visualizations and relevance walks, aiming to identify the most influential input features and their geographical locations. While the current results are preliminary and limited in scope, this study aims to lay the groundwork for further research into explainability in graph-based weather prediction models.
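For orientation, the LRP epsilon-rule for a single linear layer, which GNN-LRP extends along message-passing walks, can be written as follows (textbook form, shown for a toy layer):

    import torch

    def lrp_epsilon(layer, x, relevance_out, eps=1e-6):
        # layer: torch.nn.Linear; x: (d_in,) input; relevance_out: (d_out,) relevance.
        z = layer(x)
        stabiliser = eps * torch.where(z >= 0, torch.ones_like(z), -torch.ones_like(z))
        s = relevance_out / (z + stabiliser)
        return x * (layer.weight.T @ s)   # relevance redistributed to the inputs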

How to cite: Pruschke, J. and Potthast, R.: Exploring Explainability for Graph-Based Weather Forecasting Models Using Layer-Wise Relevance Propagation, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-16587, https://doi.org/10.5194/egusphere-egu26-16587, 2026.

The rapid growth of wind and solar power is transforming energy markets, but their inherent variability makes accurate, real-time forecasting more essential than ever. Errors in day-ahead forecasting directly drive up imbalance costs, while the fast-paced nature of intra-day trading requires model inference that is much faster than traditional weather simulations. Foundation weather models such as the WeatherGenerator (WG) offer strong generalization and the potential for low-latency deployment, but their value for the energy sector depends on effective adaptation, as they are not originally designed for plant-specific tasks.

We will present results from applying WG to site-level wind and solar production forecasting in Turkey. The downstream task targets individual plants and is trained and evaluated on historical production observations across a multi-site portfolio. Our focus is on adapting WG for this operational setting by evaluating a spectrum of adaptation strategies, ranging from training task-specific 'tail' networks to fine-tuning the entire model. We report how these choices affect forecast performance and consistency across different sites and conditions, and we describe the resulting workflow in a form that can be carried over to portfolio-scale deployment.

Performance is benchmarked against our current operational baseline, which combines NWP results with machine-learning post-processing. We report MAE as the primary metric and discuss application-oriented indicators that relate forecast improvements to operational value in day-ahead and intra-day settings. The goal is to provide practical guidance on how to translate a foundation weather model into measurable benefits for renewable energy forecasting workflows.

How to cite: Afşar, A. M. and Bölükbaşı, G.: From foundation weather models to renewable operations: Adapting the WeatherGenerator for wind and solar production forecasting, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-16880, https://doi.org/10.5194/egusphere-egu26-16880, 2026.

EGU26-18011 | ECS | Orals | ESSI1.1

ESFM - A foundation model framework for heterogeneous data integration 

Firat Ozdemir, Yun Cheng, Salman Mohebi, Fanny Lehmann, Simon Adamov, Leonardo Trentini, Langwen Huang, Levi Lingsch, Zhenyi Zhang, Oliver Fuhrer, Benedikt Soja, Siddhartha Mishra, Torsten Hoefler, Sebastian Schemm, and Mathieu Salzmann

With the increased availability of high-quality, diverse weather data, including reanalysis, satellite, surface station, and climate model data, the number of data-driven foundation models (FMs) in the environmental field has grown significantly over the past years, with forecasting performance matching and sometimes exceeding physics-based numerical model predictions. However, most FMs are trained on one dataset, or a few datasets with similar sampling and/or resolution properties. While the proposed models achieve remarkable results on the datasets and variables they are trained on, similar performance can hardly be anticipated under partially missing observations across different dimensions at test time. Similarly, typical design choices risk limiting the use of these FMs on other heterogeneous datasets of interest to the broader Earth sciences community.

We propose the Earth System Foundation Model (ESFM), an FM capable of handling heterogeneous observations (i) across different resolutions, (ii) of both spatially gridded and non-gridded nature, and (iii) with little to extreme sparsity. We achieve this through simple architectural design considerations and a masked training protocol. Namely, we bin similar ranges of grid resolutions together, while optimizing a different set of tokenizers for significantly different resolution bins, so that a single FM can accommodate observations across different resolutions. Similarly, we tokenize non-gridded (i.e., station) data separately with a single-pixel patch size. Finally, we use variable-specific tokenizers, coupled with learnable missing-observation tokens, which allow ESFM to naturally accommodate various subsets of available variables across different spatiotemporal positions.
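The missing-token mechanism can be sketched as follows (our illustration under assumed shapes, not the ESFM code):

    import torch
    import torch.nn as nn

    class VariableTokenizer(nn.Module):
        def __init__(self, n_vars, patch_dim, d_model):
            super().__init__()
            self.proj = nn.ModuleList([nn.Linear(patch_dim, d_model) for _ in range(n_vars)])
            self.missing = nn.Parameter(torch.zeros(n_vars, d_model))  # one learnable token per variable

        def forward(self, patches, available):
            # patches: (n_vars, n_patches, patch_dim); available: (n_vars, n_patches) bool mask.
            tokens = torch.stack([p(x) for p, x in zip(self.proj, patches)])
            return torch.where(available.unsqueeze(-1), tokens, self.missing.unsqueeze(1))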

In this exploratory study, we show that ESFM is a flexible FM that achieves strong forecasting performance on ERA5 under different adverse setups with missing test data across any dimension, spatio-temporal as well as inter-variable. We further test the forecasting performance of ESFM on very sparse satellite imagery (3% pixel occupancy) as well as on station data.

The proposed framework, which is also compatible with backbone architectures other than the one we experimented with, provides a general approach for integrating diverse Earth system data sources with varying resolutions, sampling patterns, and availability. This makes ESFM particularly relevant for the broader environmental sciences and Earth and space sciences, where challenges related to data heterogeneity and missing observations are central to the development of next-generation data-driven environmental modeling systems.

How to cite: Ozdemir, F., Cheng, Y., Mohebi, S., Lehmann, F., Adamov, S., Trentini, L., Huang, L., Lingsch, L., Zhang, Z., Fuhrer, O., Soja, B., Mishra, S., Hoefler, T., Schemm, S., and Salzmann, M.: ESFM - A foundation model framework for heterogeneous data integration, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-18011, https://doi.org/10.5194/egusphere-egu26-18011, 2026.

Dynamics within ecosystems vary in their transience and persistence. While some environmental drivers exert near-instantaneous control on productivity, others exhibit hysteretic responses in which the influence of past conditions persists beyond the moment of exposure. For instance, in gross primary productivity (GPP), incoming shortwave radiation predominantly regulates photosynthesis on a diurnal timescale, whereas temperature, vapour pressure deficit, and soil moisture may exert delayed or cumulative effects associated with acclimation, stress accumulation, and recovery. Accurately representing these temporal dependencies is essential for credible ecosystem modelling, yet remains challenging for both statistical and machine-learning approaches.

Sequential models, such as Long Short-Term Memory (LSTM) networks, offer a flexible means of learning temporal dependencies directly from observations, without requiring predefined assumptions about lag structure. However, their opaque internal representations raise concerns regarding scientific interpretability and trust, limiting their use beyond prediction. Explainable AI (XAI) methods provide a way of interrogating these learned representations, enabling an assessment of how, when, and for how long different drivers influence model outputs.

Here, we investigate the use of Integrated Gradients (IG) in characterising the temporal structure of driver influence learned by LSTM models trained on combined meteorological and vegetation state datasets. Attribution is examined across input sequences to distinguish short-lived from persistent controls on GPP, and to assess how these patterns vary across environmental conditions and seasonal contexts. This analysis is contrasted with hypothesis-driven "exposure-lag" representations derived from Distributed Lag Non-linear Models (DLNM), highlighting differences in how temporal influence is represented, constrained, and interpreted.
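Integrated Gradients itself is straightforward to state; a plain implementation (baseline choice and model interface are assumptions for illustration) is:

    import torch

    def integrated_gradients(model, x, baseline=None, steps=50):
        # x: (seq_len, n_drivers) input sequence; returns attributions of x's shape.
        if baseline is None:
            baseline = torch.zeros_like(x)
        total = torch.zeros_like(x)
        for alpha in torch.linspace(0.0, 1.0, steps):
            point = (baseline + alpha * (x - baseline)).requires_grad_(True)
            model(point.unsqueeze(0)).sum().backward()   # scalar prediction, e.g. GPP
            total += point.grad
        return (x - baseline) * total / steps            # Riemann approximation of the path integral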

Rather than treating explainability as a post-hoc validation step, this work demonstrates how XAI can function as a scientific diagnostic tool, enabling interrogation of black-box models and supporting exploratory discovery of emergent temporal behaviour. The results illustrate how explainable ML can enhance trust in data-driven ecosystem modelling while offering complementary insights to traditional confirmatory approaches, particularly in complex, high-dimensional settings where temporal dependencies are unknown or context-dependent.

How to cite: Hughes, T.: From Prediction to Understanding: Using Explainable AI to Reveal Temporal Drivers of Ecosystem Productivity, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-19181, https://doi.org/10.5194/egusphere-egu26-19181, 2026.

EGU26-19620 | ECS | Orals | ESSI1.1

ESFM-MoE: Climate-semantic routing for Earth System Foundation Model (ESFM)  

Yun Cheng, Firat Özdemir, Salman Mohebi, Fanny Lehmann, Simon Adamov, Leonardo Trentini, Langwen Huang, Levi Lingsch, Zhenyi Zhang, Oliver Fuhrer, Benedikt Soja, Siddhartha Mishra, Torsten Hoelfer, Sebastian Schemm, and Mathieu Salzmann

Weather foundation models are increasingly expected to operate under heterogeneous and imperfect observation settings while remaining computationally scalable. Building on the Earth System Foundation Model (ESFM) setting for heterogeneous data integration, we explore how Mixture-of-Experts (MoE) can support robust and efficient learning in multi-modal weather foundation models.

We introduce ESFM-MoE, an exploratory direction that combines conditional computation with climate-semantic routing, a routing principle that encourages expert specialization aligned with meaningful geophysical structure, rather than treating expert selection as a purely generic scaling mechanism. The motivation is that Earth-system data exhibit strong spatial organization, regime-like variability, and modality-dependent uncertainties; MoE offers a natural way to allocate capacity adaptively and promote structured specialization under such heterogeneity.

In this work, we discuss the design space and practical considerations of integrating MoE into Earth-system foundation models, focusing on how routing objectives and inductive biases can shape expert behavior and improve utilization. We highlight potential benefits for robustness to missing observations, scalable training and inference, and outline promising directions for climate-aware expert specialization in next-generation weather foundation models.

How to cite: Cheng, Y., Özdemir, F., Mohebi, S., Lehmann, F., Adamov, S., Trentini, L., Huang, L., Lingsch, L., Zhang, Z., Fuhrer, O., Soja, B., Mishra, S., Hoelfer, T., Schemm, S., and Salzmann, M.: ESFM-MoE: Climate-semantic routing for Earth System Foundation Model (ESFM) , EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-19620, https://doi.org/10.5194/egusphere-egu26-19620, 2026.

EGU26-19816 | ECS | Posters on site | ESSI1.1

An Inherently-Interpretable Approach to Uncover the Head Importance of Attention Networks 

Ivica Obadic, Luca Rigon, and Xiaoxiang Zhu

Attention-based deep learning models are becoming a ubiquitous approach for modeling the complex temporal dependencies in many vital Earth observation applications, such as agricultural monitoring. They typically consist of multiple attention heads, with each head containing attention weights that determine how temporal information is combined for the model's prediction. While analyzing the attention weights can provide insights into the model's workings, the existence of multiple heads makes it difficult to comprehend the extracted temporal information by the model. To overcome this issue, we propose an inherently interpretable approach that automatically weights the head importance during the model's forward pass. Our evaluation on the task of crop-type classification shows that the model maintains high accuracy while simplifying interpretation by highlighting only the most significant attention heads.
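The mechanism can be pictured as a learnable, normalised gate per attention head (a minimal sketch of the idea, not the authors' implementation):

    import torch
    import torch.nn as nn

    class HeadGate(nn.Module):
        def __init__(self, n_heads):
            super().__init__()
            self.logits = nn.Parameter(torch.zeros(n_heads))

        def forward(self, head_outputs):
            # head_outputs: (batch, n_heads, seq_len, d_head).
            importance = torch.softmax(self.logits, dim=0)   # sums to one across heads
            return head_outputs * importance.view(1, -1, 1, 1), importance  # inspect `importance`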

How to cite: Obadic, I., Rigon, L., and Zhu, X.: An Inherently-Interpretable Approach to Uncover the Head Importance of Attention Networks, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-19816, https://doi.org/10.5194/egusphere-egu26-19816, 2026.

EGU26-20782 | ECS | Posters on site | ESSI1.1

Neural Compression of Remote Sensing Data for the Pre-Training of Geospatial Foundation Models 

Sebastian Hoffmann, Markus Zehner, Vitus Benson, Marieke Wesselkamp, Georg Martius, and Markus Reichstein

Over the course of a decade, a single Earth observation satellite mission, such as Sentinel-2, can generate more than 10 petabytes of data. While this wealth of information offers a unique opportunity for pre-training geospatial foundation models, storing and processing such massive datasets is challenging, even for powerful HPC systems. One potential solution is the use of lossy compression techniques, which remove irrelevant information (e.g., noise and redundancy) while preserving as much relevant content as possible. This approach enables significantly larger training datasets for self-supervised learning, potentially offsetting the loss in data quality and yielding performance gains in downstream tasks. However, at the time of writing, the application of lossy compression in this context remains largely underexplored.

Here, we use Vector Quantized Variational Autoencoders (VQ-VAE) for compression of Sentinel-2 and Sentinel-1 data. We show that VQ-VAE is able to achieve compression rates of up to 65x with minimal reconstruction errors. Compared to classic, general-purpose compression techniques such as JPEG-2000, the VQ-VAE model attains 2-3x higher compression rates for the same reconstruction error. We also present ablation studies on the use of compressed versus uncompressed data for pre-training masked autoencoders (MAE) under a fixed physical storage budget, reflecting the constraints of resource-limited HPC systems. Finally, inspired by previous work in computer vision, we explore using the learned quantization scheme to construct a probabilistic masked autoencoder. Instead of predicting a deterministic reflectance or backscatter value, our probabilistic model predicts a categorical distribution over the learned codewords and is trained using cross-entropy loss. This formulation naturally incorporates uncertainty or bimodality into the masked autoencoding task, for example under cloudy conditions.
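
A hedged sketch of the probabilistic masked-autoencoding head described above (not the authors' implementation; the codebook size K and feature width D are assumed): the decoder predicts a categorical distribution over the learned codewords at each masked position and is trained with cross-entropy:

```python
import torch
import torch.nn as nn

K, D = 1024, 256                      # assumed codebook size / decoder width
decoder_head = nn.Linear(D, K)        # logits over the learned VQ codewords

def masked_codeword_loss(features, target_codes, mask):
    """features: (B, N, D) decoder outputs; target_codes: (B, N) long indices
    from the frozen VQ encoder; mask: (B, N) bool, True at masked positions."""
    logits = decoder_head(features)                       # (B, N, K)
    return nn.functional.cross_entropy(logits[mask],      # only masked tokens
                                       target_codes[mask])

features = torch.randn(2, 16, D)
codes = torch.randint(0, K, (2, 16))
mask = torch.rand(2, 16) < 0.75       # 75% masking, MAE-style
loss = masked_codeword_loss(features, codes, mask)
```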

Our results demonstrate the potential and feasibility of neural compression techniques for the pre-training of large geospatial foundation models. Looking ahead, we aim to incorporate our findings into the training of WeatherGenerator-Land, an upcoming multi-modal foundation model for Earth's land surface. WeatherGenerator-Land will be used for vegetation forecasting, predicting land-atmosphere interactions, and high-resolution land surface temperature forecasts, with a particular focus on heat waves and urban heat islands.

How to cite: Hoffmann, S., Zehner, M., Benson, V., Wesselkamp, M., Martius, G., and Reichstein, M.: Neural Compression of Remote Sensing Data for the Pre-Training of Geospatial Foundation Models, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-20782, https://doi.org/10.5194/egusphere-egu26-20782, 2026.

EGU26-21352 | Posters on site | ESSI1.1

Precipitation nowcasting over Western Africa using transfer learning with WeatherGenerator 

Bart Schilperoort, Robin Richardson, Peter Kalverla, and Gijs van den Oord

Accurate nowcasting of high-intensity precipitation is essential for flood modeling, disaster management, and decision making. By its nature, precipitation can vary strongly in intensity and timing across space. While some areas of the world have dense networks of openly available automated weather stations or weather radars, these are not available everywhere. In sub-Saharan West Africa, high-intensity precipitation carries a high risk of causing hazardous flash floods, and with very little radar data available in the region, nowcasting is mostly restricted to available satellite products.

Using the WeatherGenerator atmospheric foundation model, we explore the viability of training a machine learning model to accurately nowcast heavy precipitation in Western Africa. We investigate fine-tuning the pre-trained WeatherGenerator on SEVIRI output, training a tail network that predicts the rainfall retrieval of the MSG-CPP product. We also explore transfer learning with WeatherGenerator, using a decoder trained on EURADCLIM over the European continent with SEVIRI input and assessing its accuracy over the target region.
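
The tail-network idea can be sketched generically (WeatherGenerator's real interface is not shown here; the backbone, channel count, and data below are stand-ins): freeze the pre-trained weights and train only a small head that maps backbone features to a rainfall retrieval:

```python
import torch
import torch.nn as nn

backbone = nn.Sequential(nn.Conv2d(12, 64, 3, padding=1), nn.ReLU())  # stand-in
for p in backbone.parameters():
    p.requires_grad = False            # keep the pre-trained weights frozen

tail = nn.Conv2d(64, 1, 1)             # small head: features -> rain rate
opt = torch.optim.Adam(tail.parameters(), lr=1e-3)

seviri = torch.randn(4, 12, 64, 64)    # stand-in for SEVIRI channels
target = torch.rand(4, 1, 64, 64)      # stand-in for MSG-CPP rainfall retrieval
loss = nn.functional.mse_loss(tail(backbone(seviri)), target)
loss.backward()
opt.step()
```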

This effort adds to our understanding of the flexibility and added value of WeatherGenerator as a foundation model for weather and climate. It also serves as a pilot for upcoming service projects that the WeatherGenerator consortium will offer to the Earth science community, focusing on a broad range of applications and stakeholders.

How to cite: Schilperoort, B., Richardson, R., Kalverla, P., and van den Oord, G.: Precipitation nowcasting over Western Africa using transfer learning with WeatherGenerator, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-21352, https://doi.org/10.5194/egusphere-egu26-21352, 2026.

EGU26-23208 | ECS | Orals | ESSI1.1

Evaluating explainable AI Methods for geoscientific regression: insights from applications and a chaotic toy model 

Ieuan Higgs, Kieran Hunt, Todd Jones, and Anna-Louise Ellis

As artificial intelligence (AI) systems transition from research prototypes to operational tools in Earth system science and forecasting, establishing confidence and trust in their predictions becomes increasingly critical. Although the inputs and outputs of AI models are observable, their internal decision-making processes are often highly complex and difficult for humans to interpret, leading to their frequent characterisation as potentially untrustworthy “black boxes”.

In this work, we examine a range of explainable artificial intelligence (XAI) techniques designed to provide insight into AI model predictions. Many of these methods have been developed primarily with classification tasks in mind, raising important questions about their suitability for the regression-based problems that dominate geoscientific applications. We investigate the application of XAI methods to a machine-learning-derived emulator of the Lorenz ’63 system (an archetypal chaotic dynamical model) and review existing case studies that apply XAI in regression settings relevant to Earth sciences.
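
As a concrete example of the setting (a minimal sketch under assumed choices, not the authors' code): an MLP emulator of one Lorenz '63 time step, with gradient saliency as a simple regression-oriented XAI attribution:

```python
import torch
import torch.nn as nn

def lorenz63_step(s, dt=0.01, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    """One explicit-Euler step of Lorenz '63, used to generate training pairs."""
    x, y, z = s
    return torch.stack([x + dt * sigma * (y - x),
                        y + dt * (x * (rho - z) - y),
                        z + dt * (x * y - beta * z)])

emulator = nn.Sequential(nn.Linear(3, 64), nn.Tanh(), nn.Linear(64, 3))
# ... train emulator on (state, lorenz63_step(state)) pairs ...

state = torch.tensor([1.0, 1.0, 20.0], requires_grad=True)
pred = emulator(state)
pred[0].backward()                 # attribute the predicted x-component
saliency = state.grad.abs()        # |d x_pred / d(x, y, z)|: gradient saliency
```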

We highlight key challenges and limitations of current, general-purpose XAI approaches when applied to chaotic, continuous, high-dimensional, and physically constrained systems. Finally, we identify gaps in existing methodologies and discuss future directions for developing XAI techniques better aligned with the context-specific needs of regression problems in geoscientific modelling and forecasting.

How to cite: Higgs, I., Hunt, K., Jones, T., and Ellis, A.-L.: Evaluating explainable AI Methods for geoscientific regression: insights from applications and a chaotic toy model, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-23208, https://doi.org/10.5194/egusphere-egu26-23208, 2026.

EGU26-7020 | Posters on site | ESSI1.2

FAIRenrich: Distributed semantic annotation at the repository edge 

Alexander Wolodkin, Claus Weiland, Jonas Grieb, and Robert Brylka

Senckenberg’s natural history collections encompass over 45 million physical specimens distributed across 11 facilities, with 1.6 million digitized records accessible in 124 collections. Additional digital objects are stored in various infrastructures, such as Edaphobase, a digital repository for harmonized soil information (physical, chemical, and biological), and the WildlIVE Portal, a platform for FAIR (Findable, Accessible, Interoperable, and Reusable) data sharing of biodiversity monitoring with edge sensors such as camera traps. Managing this heterogeneous landscape, ranging from legacy specimen data from the pre-digital era to newly digitized objects, presents significant challenges regarding legal compliance, data sovereignty, and the implementation of FAIR principles.

To address this volume, the FAIRenrich workflow automates the semantic annotation and maintenance of existing digital collection data. Handling the complexity of such enrichment requires AI models suited for partially non-deterministic tasks, incorporating an optional human-in-the-loop mechanism. By executing these workflows on a distributed network of stationary and mobile edge computing devices ('last-mile AI'), the architecture ensures strict adherence to data sovereignty and privacy requirements.

Beyond data curation, FAIRenrich's distributed architecture enables systemic efficiency through resource pooling. Institutional edge infrastructure is rarely fully utilized; by networking heterogeneous devices, the system dynamically reallocates idle capacity to enrichment tasks. This mirrors industry practice: major technology operators, such as Google, systematically redeploy hardware from their refresh cycles into secondary-use programs.

FAIRenrich extends this model to legacy hardware: rather than treating end-of-life equipment as waste, the system enables cost-effective redeployment for delay-tolerant semantic enrichment tasks, such as inference workloads without strict latency requirements. By aligning workload scheduling to renewable peaks (e.g., photovoltaic installations), the approach implements carbon-aware scheduling principles used by major technology operators, achieving both infrastructure cost reduction and extended hardware lifecycles. This creates a circular-economy model for research institutions, transforming refresh-cycle surplus into productive scientific infrastructure.
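
The scheduling principle can be illustrated with a toy sketch (hypothetical names and values; not FAIRenrich code): delay-tolerant enrichment jobs wait in a queue and are dispatched only while a renewable-power signal exceeds a threshold:

```python
import queue
import time

def solar_output_kw():
    return 3.2                       # stand-in for a PV telemetry feed

jobs = queue.Queue()
for record_id in ["specimen-001", "specimen-002"]:
    jobs.put(record_id)              # delay-tolerant enrichment tasks

THRESHOLD_KW = 2.0
while not jobs.empty():
    if solar_output_kw() >= THRESHOLD_KW:
        record = jobs.get()
        print(f"annotating {record} on an idle edge node")
    else:
        time.sleep(60)               # defer until renewable supply recovers
```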

This contribution demonstrates how FAIRenrich enables sustainable semantic annotation through a distributed edge architecture that simultaneously ensures data sovereignty, optimizes infrastructure utilization, and can realize cost-effective redeployment of legacy hardware. The approach exemplifies a scalable blueprint for research institutions seeking to decouple semantic enrichment from project-resource limitations through parallelization, temporal flexibility, and circular infrastructure practices.

How to cite: Wolodkin, A., Weiland, C., Grieb, J., and Brylka, R.: FAIRenrich: Distributed semantic annotation at the repository edge, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-7020, https://doi.org/10.5194/egusphere-egu26-7020, 2026.

EGU26-12877 | ECS | Posters on site | ESSI1.2

EVE: An Open Source Earth Science LLM for Researchers, Policymakers, and the Public 

Àlex R. Atrio, Antonio Lopez, Jino Rohit, Yassine Elhouadi, Marcello Politi, Vijayasri Iyer, Sébastien Bratières, Umar Jamil, and Nicolas Longépé

Recent advances in Large Language Models (LLMs) have created opportunities to support reasoning, discovery, and synthesis in Earth Observation (EO) and Earth Sciences, provided domain specificity and reliability can be ensured. In this work, we introduce Earth Virtual Expert (EVE), a comprehensive open-source initiative to develop, evaluate, and deploy a domain-specialized LLM for EO. EVE serves as a testbed for studying domain-adaptive training, grounded generation, and evaluation strategies tailored to scientific use, rather than general-purpose conversational performance.

As part of this initiative, we present EVE-instruct, a text-only, instruction-tuned and aligned LLM specialized for EO. Built on Mistral Small 3.2 (24B parameters) with a 128k context window, it focuses on domain-specific reasoning, question answering, and retrieval- and hallucination-aware generation, without significant trade-offs in general capabilities. We release all data used to train and evaluate EVE-instruct: a large-scale curated EO corpus of 3B tokens, synthetically generated fine-tuning datasets derived from this corpus (4B tokens), and manually created EO-specific evaluation test sets comprising 7,500 samples across multiple-choice question answering, open-ended question answering, and factuality.

To support trustworthy usage and deployment, we further develop a Retrieval-Augmented Generation (RAG) database from the curated corpus and a hallucination-detection module focused on factual consistency and scientific grounding. These components are integrated with EVE-instruct and deployed with a graphical user interface and accessible via API, currently supporting more than 300 users from EO research and industry.
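
For illustration, a minimal retrieval-grounding loop (a toy sketch, not the EVE implementation; term overlap stands in for a real embedding index):

```python
passages = ["Sentinel-2 MSI acquires 13 spectral bands.",
            "SAR backscatter depends on surface roughness and moisture."]

def retrieve(query: str, k: int = 1):
    """Rank passages by term overlap (a toy stand-in for an embedding index)."""
    q = set(query.lower().split())
    ranked = sorted(passages, key=lambda p: -len(q & set(p.lower().split())))
    return ranked[:k]

question = "How many spectral bands does Sentinel-2 acquire?"
context = "\n".join(retrieve(question))
prompt = f"Answer using only this context:\n{context}\n\nQuestion: {question}"
# `prompt` would then be sent to the instruction-tuned model.
```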

All models, datasets, and code are publicly released at: https://huggingface.co/eve-esa and https://github.com/eve-esa.

How to cite: R. Atrio, À., Lopez, A., Rohit, J., Elhouadi, Y., Politi, M., Iyer, V., Bratières, S., Jamil, U., and Longépé, N.: EVE: An Open Source Earth Science LLM for Researchers, Policymakers, and the Public, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-12877, https://doi.org/10.5194/egusphere-egu26-12877, 2026.

EGU26-15955 | ESSI1.2

LLM Workflow for Land-Use Prediction Evidence Synthesis: Efficient Screening, Selective Refusals, Reportable Gaps

Large language models (LLMs) promise to make systematic reviews more scalable and less costly, but the validity of LLM-assisted evidence synthesis depends not only on accuracy, but also on which parts of the literature are effectively visible to a deployed model and how reliably they are interpreted. We report a large-scale, domain-specific evaluation of an end-to-end LLM-assisted workflow for a systematic review of national-scale land use/land cover (LULC) prediction research (11,817 records; 11,688 after de-duplication), using a single hosted LLM deployment (Qwen Max) as a concrete case study. At title–abstract screening, the model behaved as a recall-oriented filter, excluding 9,891/11,688 records (84.7%) and routing 1,797 records for human follow-up; compared with the human baseline, it excluded fewer studies (84.7% vs 91.8%) and shifted more records into OK and POSSIBLE (OK: 4.2% vs 1.5%; POSSIBLE: 10.6% vs 5.5%). For full-text extraction, structured fields showed high agreement with expert coding across 342 benchmark papers (mean scores: 0.84 categorical, 0.85 temporal, 0.87 set-based), whereas free-text summaries were more variable (mean 0.79 overall; cosine similarity 0.51–0.87 across narrative fields despite high BERT-F1). In our case study, the workflow was completed in approximately one day on a single workstation for ~US$106 in API costs. Critically, full-text processing also produced explicit refusals: 7/2,084 candidate papers in deep screening and 2/345 papers targeted for insight extraction were blocked as “sensitive” geopolitical content. Although rare, these refusals were non-random and concentrated in contested regions, illustrating how LLM-specific constraints can introduce structured missingness that systematically removes or misinterprets evidence in precisely those settings where land-use conflict and governance are most salient. LLM-assisted reviews can therefore make previously prohibitive syntheses tractable. However, they must be embedded in transparent, human-led workflows that monitor and log model failures, including refusals, omissions, and misreadings, and apply targeted auditing to detect and correct systematic blind spots.

How to cite: Derdouri, A. and Masago, Y.: LLM Workflow for Land-Use Prediction Evidence Synthesis: Efficient Screening, Selective Refusals, Reportable Gaps, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-15955, https://doi.org/10.5194/egusphere-egu26-15955, 2026.

EGU26-19791 | ECS | Posters on site | ESSI1.2

Explaining dietary CO₂ impact with trustworthy agentic LLMs, ML, and XAI 

Marco De Carlo, Maria Mirto, Italo Epicoco, Paola Nassisi, and Maria Vincenza Chiriacò

Large Language Models (LLMs) offer transformative capabilities for scientific workflows, enabling scalable analysis, evidence synthesis, and insight generation. We present an agentic LLM workflow, applied to the EU-funded SWITCH project, which investigates the environmental impact of dietary choices, including CO₂ emissions associated with food consumption.

Validated questionnaire responses from SWITCH participants are securely anonymized and processed using machine learning methods, including clustering and classification, whose outputs are interpreted with Explainable AI (XAI) to ensure transparency of feature contributions. These methods generate food behavioral profiles, covering nutritional and environmental habits, while preserving individual privacy. The resulting profiles guide the construction of structured agent directives, enriching contextual information and constraining the LLM to provide scientifically grounded answers and data-driven insights.

Responses are generated within a Retrieval-Augmented Generation (RAG) framework over a curated Data Lake of revisioned documents, including project deliverables, scientific reports, and nutrition-environment datasets covering sustainable diets, CO₂ emissions, and European food policy. The combination of ML-generated profiles and the RAG context acts as a set of constraints, ensuring that LLM outputs remain traceable, grounded, and aligned with verified evidence.

Human-in-the-loop review ensures the quality and correctness of the ML-generated profiles, the construction of agent directives, the LLM outputs, and the revisioned documents used in the RAG framework, while metadata and traceability mechanisms ensure auditability, reproducibility, and risk mitigation.

Our results demonstrate that combining classical machine learning, structured agent directives guided by clustering and classification, RAG grounding, metadata and traceability, and human oversight enables trustworthy, effective, and transformative scientific analysis, highlighting the potential of agentic LLMs for scalable, insight-driven applications in research while ensuring responsible AI deployment.

How to cite: De Carlo, M., Mirto, M., Epicoco, I., Nassisi, P., and Chiriacò, M. V.: Explaining dietary CO₂ impact with trustworthy agentic LLMs, ML, and XAI, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-19791, https://doi.org/10.5194/egusphere-egu26-19791, 2026.

EGU26-19885 | Posters on site | ESSI1.2 | Highlight

Making Scientific Data Accessible with LLMs while Preserving Authority and Reliability: Lessons Learned from Building Production Grade Agentic Systems 

Daniel Wiesmann, Jonas Solvsteen, Adam Pain, Alyssa Barrett, Ciaran Sweet, Ricardo Mestre, Daniel da Silva, Firza Riany, Fausto Pérez, Lane Goodman, Marc Farra, Soumya Ranjan, Tarashish Mishra, Sanjay Bhangar, and Sajjad Anwar

In this talk we will outline our learnings from building two production-grade agentic systems for data discovery, retrieval, and analysis. For both applications, trustworthiness, reliability, and reproducibility were key criteria that we took into account from the start.

Producing insights and surfacing data in a repeatable and transparent way is not simply a nice-to-have feature; it is indispensable for adoption. In practice, even if the answer from a chatbot is scientifically correct, users will only rely on the outputs for decision making if they trust the system. Users by now have enough experience with chatbots to know that LLMs tend to exaggerate and hallucinate. This has created a healthy skepticism with regard to output from agentic systems. In scientific domains, it is therefore not sufficient to guarantee a correct result; it also has to be presented in a way that is transparent and reproducible. Despite all advances in the LLM domain, this continues to be a challenge in building agentic systems.

We tackled this challenge by adopting a series of techniques that we will illustrate with concrete examples from building production-ready agentic applications. One of the main principles is that we rely on the LLM mainly for orchestration of well-known tools instead of relying on the generative capabilities of the models. For analysis, we built the systems in a way that the transformations on the original data are reproducible. One technique is to use LLMs to write analysis code that can be stored and used to reproduce the results. This allows end-to-end tracing of where the data is coming from and how it was transformed to produce insights through statistics and charts. We will also describe our approach to evaluating the agents and share insights from early user research already performed for these systems.
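
One of these techniques can be sketched as follows (a minimal sketch with a hypothetical logging schema, not the production code): persist the LLM-generated analysis code together with provenance metadata so that results can be re-executed and traced:

```python
import hashlib
import json
import subprocess
import sys
import time

def run_and_log(generated_code: str, source_dataset: str, log_path: str):
    """Persist LLM-generated analysis code, execute it, and append a
    provenance record so the exact transformation can be re-run later."""
    digest = hashlib.sha256(generated_code.encode()).hexdigest()
    script = f"analysis_{digest[:8]}.py"
    with open(script, "w") as f:
        f.write(generated_code)
    subprocess.run([sys.executable, script], check=True)
    with open(log_path, "a") as log:
        log.write(json.dumps({"script": script, "sha256": digest,
                              "dataset": source_dataset,
                              "timestamp": time.time()}) + "\n")

run_and_log("print('mean rainfall: 3.1 mm')", "demo-dataset", "provenance.jsonl")
```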

We will illustrate these principles with concrete examples from the two agentic systems outlined below. 

The Destination Earth Digital Assistant, built in collaboration with ECMWF, provides general information about Destination Earth and helps users to discover and retrieve data from the DestinE Digital Twins.

Global Nature Watch (GNW), built in collaboration with WRI and the Land and Carbon Lab, provides governments, companies, and communities with trusted, open data and intelligence-driven insights on land conditions and land-use change to enable efficient and evidence-based decisions for nature protection and restoration.

GNW is publicly accessible today and the DestinE Assistant is also planned to be launched publicly before EGU26.

How to cite: Wiesmann, D., Solvsteen, J., Pain, A., Barrett, A., Sweet, C., Mestre, R., da Silva, D., Riany, F., Pérez, F., Goodman, L., Farra, M., Ranjan, S., Mishra, T., Bhangar, S., and Anwar, S.: Making Scientific Data Accessible with LLMs while Preserving Authority and Reliability: Lessons Learned from Building Production Grade Agentic Systems, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-19885, https://doi.org/10.5194/egusphere-egu26-19885, 2026.

EGU26-20948 | ESSI1.2

Toward trustworthy AI in systematic reviews: a statistically validated AI-augmented framework for analysing knowledge transfer strategies in urban water management

The exponential expansion of academic literature across complex environmental domains has created a gap where the volume of research outpaces human capacity for effective integration. While Large Language Models (LLMs) offer a transformative solution to bridge this gap, their deployment in rigorous scientific inquiry is frequently compromised by model stochasticity, potential for hallucination, and the opacity of automated reasoning. Addressing the critical imperative for dependable and reproducible AI, this study presents a robust workflow designed to ensure methodological rigor and evidential integrity in the rapid and reliable synthesis of large-scale scientific literature.

We operationalised this framework within the domain of urban water management, specifically to analyse complex Knowledge Transfer (KT) strategies from a corpus of over 1,500 unstructured articles. To mitigate the risks inherent in generative AI, we developed a multi-layered validation protocol. First, we deployed an AI-assisted screening mechanism to filter the initial corpus down to 115 highly relevant articles, ensuring data relevance. Second, we implemented a Human-in-the-Loop design to iteratively synthesise a comprehensive analytical framework. By refining LLM-generated insights against domain expertise, we consolidated 24 operational attributes that characterise the operational mechanisms of learning strategies in the corpus, preventing ungrounded inference while capturing emerging learning dynamics. Third, we addressed model variability through iterative multi-LLM triangulation (utilising Gemini, ChatGPT, and DeepSeek). By repeatedly coding the 115 articles with the framework, we quantified qualitative insights to analyse how distinct learning strategies manifest their operational mechanisms. Finally, we employed Multiple Correspondence Analysis (MCA) and Hierarchical Agglomerative Clustering (HAC) to analyse the quantified results, categorising the eight identified learning strategies into three distinct clusters based on their functions and usage contexts, thereby effectively harnessing the LLM-generated insights.
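
A rough sketch of the final clustering step (not the authors' pipeline; scikit-learn has no MCA, so one-hot encoding plus PCA stands in for the correspondence analysis before Ward-linkage clustering, and the coded attributes are invented):

```python
import numpy as np
from sklearn.preprocessing import OneHotEncoder
from sklearn.decomposition import PCA
from sklearn.cluster import AgglomerativeClustering

# Invented coding: rows = articles, columns = categorical KT attributes.
codings = np.array([["workshop", "high"], ["secondment", "low"],
                    ["workshop", "low"], ["twinning", "high"]])
X = OneHotEncoder(sparse_output=False).fit_transform(codings)  # sklearn >= 1.2
coords = PCA(n_components=2).fit_transform(X)   # crude stand-in for MCA axes
labels = AgglomerativeClustering(n_clusters=3,
                                 linkage="ward").fit_predict(coords)
print(labels)
```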

Beyond this specific application, this research contributes a methodological blueprint for responsible AI integration in scientific inquiry. It demonstrates that combining theory-driven constraints with statistical verification is essential to elevate LLM-generated insights to the standard of reproducible scientific evidence.

How to cite: Wang, C., Corzo, G., Van Oel, C., and Zevenbergen, C.: Toward trustworthy AI in systematic reviews: a statistically validated AI-augmented framework for analysing knowledge transfer strategies in urban water management, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-20948, https://doi.org/10.5194/egusphere-egu26-20948, 2026.

EGU26-20967 | Posters on site | ESSI1.2

Automating Requirements Elicitation using Large Language Models and Speech Processing 

Zahra Fardhosseini, Andrea Ackermann, and Beate Oerder

Problem:

The elicitation of user requirements is a critical first step in planning successful software projects, particularly when a diverse set of use cases must be considered. In practice, this process is often carried out through oral interviews and handwritten notes, which is time-consuming, error-prone, and makes structured processing and subsequent analysis difficult.

Approach:

We present an integrated, automated pipeline framework that supports the systematic collection and analysis of user requirements, from capturing user perspectives to model-supported analysis. The goal is to gather requirements consistently across different stakeholder roles, including project lead, technical staff, and data scientists. In the first step, requirements are collected using a web-based, structured questionnaire. In the second step, the questionnaire serves as a guideline in follow-up interviews for detailed case descriptions that further refine the requirements.

Figure 1. The pipeline for requirements elicitation, LLM-based analysis, and human-in-the-loop review.

The interviews are recorded and automatically transcribed using a domain-adapted Language Recognition Component (LRC) based on open-source Automatic Speech Recognition (ASR) models. The resulting transcripts are combined with questionnaire responses and initial analysis artifacts, such as charts and diagrams, and processed within a Large Language Model (LLM) pipeline. After requirements have been collected, the pipeline supports the systematic inspection of individual requirements and their consideration in project planning.

Using a dedicated prompt schema, the LLM-based analysis supports the identification of functional and non-functional requirements, highlights open needs, clusters related issues, and organizes the results according to relevant work contexts. A human-in-the-loop review module enables targeted corrections, quality assurance, and iterative improvement of the analysis results.

 

Implementation test:

To validate our end-to-end requirements-engineering pipeline, we applied it to the IACS-AI Data-Management-Remodeling project. A web-based survey (Nov–Dec 2024) yielded 53 responses, giving an initial, structured view of user requirements. Subsequently, we held seven interviews with 18 participants (project managers, engineers, data scientists), producing more than 460 minutes of video that were afterwards transcribed with the freeware tool Scraibe.

A prompt-engineering routine fed these inputs to a Large Language Model (Llama 3.3), which detected semantic clusters, classified requirements, and identified problem statements. For each requirement, we kept the highest-probability class for further review. The resulting insights shaped the next milestone: the design and implementation of a data-pipeline architecture that fulfils the extracted functional and non-functional requirements.
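
A minimal sketch of the transcription-plus-classification flow (illustrative only; the audio file, model choices, and prompt are assumptions, and the project used Scraibe rather than the pipeline below):

```python
from transformers import pipeline

# Transcribe an interview recording with an open-source ASR model.
asr = pipeline("automatic-speech-recognition", model="openai/whisper-small")
transcript = asr("interview.wav")["text"]     # hypothetical audio file

PROMPT = ("Classify the following requirement as FUNCTIONAL or NON-FUNCTIONAL "
          "and name its work context.\nRequirement: {req}")
for sentence in transcript.split(". "):
    print(PROMPT.format(req=sentence))        # each prompt goes to the LLM
```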

Conclusion:

The reproducible design of the pipeline ensures traceability by documenting when, by whom, and in which context requirements were expressed, as well as how project decisions are derived from them. This results in a lightweight yet structured approach to requirements elicitation that improves transparency as well as reproducibility and reduces manual effort and errors.

Because the pipeline is generic, it is ideal for contexts with many stakeholders, heterogeneous use cases, and strong documentation-traceability needs. Beyond our scientific implementation test, it can therefore also be utilized in enterprise software, AI-driven data projects, e-government systems, and regulated domains such as healthcare or finance.

 

How to cite: Fardhosseini, Z., Ackermann, A., and Oerder, B.: Automating Requirements Elicitation using Large Language Models and Speech Processing, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-20967, https://doi.org/10.5194/egusphere-egu26-20967, 2026.

EGU26-22068 | Posters on site | ESSI1.2

Collaborative Agent Reasoning Engineering (CARE): A Structured Methodology for Systematically Engineering AI Agents for Science 

Rahul Ramachandran, Nidhi Jha, and Muthukumaran Ramasubramanian

We present Collaborative Agent Reasoning Engineering (CARE), a disciplined methodology for engineering Large Language Model (LLM) agents in scientific domains. Unlike ad-hoc trial-and-error approaches, CARE specifies behavior, grounding, tool orchestration, and verification through reusable artifacts and systematic, stage-gated phases. The methodology employs a three-party workflow involving Subject-Matter Experts (SMEs), developers, and LLM-based helper agents. These helper agents function as facilitation infrastructure, transforming informal domain intent into structured, reviewable specifications for human approval at defined gates. CARE addresses the "jagged technological frontier", characterized by uneven LLM performance, by bridging the gap between novice and expert analysts regarding domain constraints and verification practices. By generating concrete artifacts, including interaction requirements, reasoning policies, and evaluation criteria, CARE ensures agent behavior is specifiable, testable, and maintainable. Evaluation results from a scientific use case demonstrate that this stage-gated, artifact-driven methodology yields measurable improvements in development efficiency and complex-query performance.

How to cite: Ramachandran, R., Jha, N., and Ramasubramanian, M.: Collaborative Agent Reasoning Engineering (CARE): A Structured Methodology for Systematically Engineering AI Agents for Science, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-22068, https://doi.org/10.5194/egusphere-egu26-22068, 2026.

EGU26-1347 | ECS | Posters on site | ESSI1.4

Contextual Aware Hybrid Deep learning framework: Assessment with Auxiliary and Ancillary Data 

Vikash Kumar and Bharath Haridas Aithal

Semantic segmentation is the foundation of a wide range of practical applications, such as urban planning, climate modeling, and environmental protection, all of which have direct socio-economic implications. However, the accelerating densification of metropolitan regions in developing countries complicates accurate mapping of fine-scale urban land uses, as three-band optical imagery often fails to capture spectral variability and CNN-based models have restricted capacity to establish spatial and inter-band relationships. To address these limitations, we propose a multi-modal architecture built on a SegFormer-B2 backbone. The pipeline integrates auxiliary datasets (a DEM for surface information and SWIR for capturing water absorption characteristics) and an ancillary dataset of built-up layers for enhanced urban boundary delineation, along with multi-temporal false-color composites from LISS-4 and Sentinel-2 over the Bengaluru region. The proposed framework integrates convolutional feature extraction with transformer attention to jointly learn local spectral–spatial patterns and global cross-band dependencies. Attention-guided up-sampling, a hybrid loss function, and cross-attention modules are incorporated to strengthen feature fusion across heterogeneous modalities by linking the multi-band synergy of the auxiliary and ancillary datasets. Empirical evaluation reveals consistent qualitative improvements and higher overall accuracy, with substantial gains for barren land when incorporating SWIR and vegetation information, and when integrating the DEM. These results validate the effectiveness of the proposed framework in overcoming spectral insufficiency and spatial ambiguity, as it outperforms baseline models. Overall, the proposed approach offers a scalable and transferable solution for private developers and government agencies seeking robust, fine-resolution mapping to support a sustainable and structured urban environment.

 

Keywords: Urban mapping, Deep learning architecture, Spectral Feature extraction, Performance Optimization

How to cite: Kumar, V. and Haridas Aithal, B.: Contextual Aware Hybrid Deep learning framework: Assessment with Auxiliary and Ancillary Data, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-1347, https://doi.org/10.5194/egusphere-egu26-1347, 2026.

EGU26-1610 | Posters on site | ESSI1.4

Continental-Scale Prospectivity Modelling of Volcanogenic Massive Sulphide Deposits in Europe Using a Mineral-System and Explainable Machine-Learning Framework 

Maria Dekavalla, Sergio Tenorio Matanzo, Martin López Del Río, Chrysoula Papathanasiou, and Angelos Amditis

Growing demand for critical raw materials over the coming decades underscores the need for robust, continental-scale frameworks to identify new mineral resources. Volcanogenic massive sulphide (VMS) deposits supply Cu, Zn, Pb, Au, and Ag, and they play a crucial role in meeting Europe’s growing demand for strategic raw materials. Despite Europe’s long mining history and extensive geological datasets, mineral prospectivity assessments remain largely restricted to national boundaries, limiting the ability to evaluate mineral systems that operate across regional tectonic domains. This study develops the first integrated European-scale prospectivity model for VMS by merging harmonised public geoscience datasets within a mineral-system and machine-learning (ML) framework. This work is carried out as part of the EU-funded TERRAVISION project, which aims to enhance the entire critical raw materials value chain towards implementing sustainable mining practices.

The modelling approach builds on and extends existing regional-scale frameworks by addressing several persistent challenges in regional-scale exploration. A positive–unlabelled training strategy was used to mitigate the lack of reliable negative labels, and ML models capable of estimating uncertainty, along with multiple explainability techniques, were applied. To ensure that predictors capture meaningful geological processes, both data-driven and knowledge-based feature selection were implemented. Model explainability was evaluated through three complementary approaches: (i) built-in feature importance from the ML classifier, (ii) permutation feature importance to assess the robustness of predictor influence, and (iii) SHapley Additive exPlanations (SHAP) values to quantify local and global predictor contributions. Together, these methods provide transparent, interpretable insights into the geological and geophysical variables that indicate prospectivity patterns. The model successfully identified over 97% of known VMS deposits and occurrences, with spatial patterns showing a strong correlation between high-probability areas and established mineralisation. Importantly, the results also highlight prospective trends in regions with limited documented exploration.
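
The three explainability approaches can be sketched on a generic tree-based classifier (a minimal sketch with synthetic data, not the study's model):

```python
import shap
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance

X, y = make_classification(n_samples=500, n_features=8, random_state=0)
clf = RandomForestClassifier(random_state=0).fit(X, y)

builtin = clf.feature_importances_                        # (i) built-in
perm = permutation_importance(clf, X, y, n_repeats=10,
                              random_state=0).importances_mean  # (ii)
shap_values = shap.TreeExplainer(clf).shap_values(X)      # (iii) SHAP
```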

The analysis highlights several metallogenic zones that exhibit geological and geophysical signatures consistent with favourable mineral-system conditions, but where known deposits are sparse. These areas represent potential greenfield opportunities at a continental scale. The study also illustrates the value of applying the mineral system concept to regional datasets. Harmonised lithological data and spaceborne geophysical data contribute significantly to mapping crustal-scale structures and tectonic domains with a history of submarine volcanic activity, a key requirement for VMS formation. More broadly, the proposed framework is transferable to other deposit types and illustrates the potential of continental-scale, explainable ML approaches to strengthen Europe's strategic raw-material knowledge base through consistent, process-informed regional assessments.

How to cite: Dekavalla, M., Tenorio Matanzo, S., López Del Río, M., Papathanasiou, C., and Amditis, A.: Continental-Scale Prospectivity Modelling of Volcanogenic Massive Sulphide Deposits in Europe Using a Mineral-System and Explainable Machine-Learning Framework, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-1610, https://doi.org/10.5194/egusphere-egu26-1610, 2026.

EGU26-1908 | ESSI1.4

Automatic fault tracking from 3D seismic data using the 2D Continuous Wavelet Transform combined with a Convolutional Neural Network

The aim of this work is to propose a new technique for automatic fault tracking from 3D seismic data using the 2D Continuous Wavelet Transform (CWT) combined with artificial intelligence. Time slices of the variance attribute, derived from the 3D seismic data and chosen by the user, are analysed using the 2D CWT with the 2D Mexican hat as the analysing wavelet, and the maxima of the modulus of the 2D CWT are mapped for the full range of scales. The ensemble of mapped maxima for the set of time slices is filtered using a Convolutional Neural Network. Training is performed in supervised mode, using manually tracked faults as the desired output. Application to real data shows the efficiency and robustness of the proposed method, which can greatly help seismic interpreters avoid manual fault tracking, a difficult and time-consuming task.
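
The multi-scale step can be sketched as follows (a minimal sketch, not the authors' code): since the 2D Mexican hat is the negative Laplacian of a Gaussian, the CWT modulus at scale s can be computed with a Laplacian-of-Gaussian filter, and its local maxima mapped per scale:

```python
import numpy as np
from scipy.ndimage import gaussian_laplace, maximum_filter

time_slice = np.random.rand(256, 256)   # stand-in for a variance-attribute slice

maxima_per_scale = {}
for s in (1, 2, 4, 8):                  # range of wavelet scales
    # |CWT| at scale s; s**2 is the usual scale normalisation of the LoG.
    response = (s ** 2) * np.abs(gaussian_laplace(time_slice, sigma=s))
    local_max = response == maximum_filter(response, size=5)
    maxima_per_scale[s] = np.argwhere(local_max & (response > response.mean()))
```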

How to cite: Aliouane, L. and Ouadfeul, S.-A.: Automatic fault tracking from 3D seismic data using the 2D Continuous Wavelet Transform combined with a Convolutional Neural Network, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-1908, https://doi.org/10.5194/egusphere-egu26-1908, 2026.

EGU26-3574 | Posters on site | ESSI1.4

Deep Learning for Trustworthy Prediction of Cyanobacterial Blooms across CONUS Inland Waters 

Nasrin Alamdari, Syed Usama Imtiaz, and Mitra Nasr Azadani

Cyanobacterial harmful algal blooms (cHABs) pose serious concerns for drinking-water safety, aquatic ecosystem health, and recreational water use across the globe. In situ cHAB data collection relies on sparse and irregular measurements, which hinders reliable learning of complex ecological processes. Although data-driven models have recently improved bloom prediction skill, their black-box nature undermines scientific trust and restricts the actionable value of model outputs. In this study, we developed a mechanism-aligned deep learning framework that embeds ecological process structure directly into the learning architecture using diverse data sources. We incorporated detailed remote-sensing-based atmospheric and environmental variables, including aerosol-derived deposition, nutrient wet deposition, and meteorological data. We temporally aggregated these data to reflect both short-term forcing and cumulative conditions over space and time. We evaluated our framework on 2,200 lakes across the continental United States from 2018 to 2023, with a one-week-ahead bloom prediction task. Our model is trained on 2018–2021, validated on 2022, and tested on the 2023 dataset. Our preliminary results show stable generalization under diverse spatiotemporal domain shifts (R² = 0.54, RMSE = 0.59) with reduced seasonal bias relative to conventional deep learning baselines. In addition to predictive accuracy, our architecture demonstrates high explanation faithfulness (OTA = 0.83) and positive alignment with independent physical proxies (R² = 0.36). This further demonstrates that the learned representations remain physically consistent despite the absence of direct mechanism labels. Our work advances a new paradigm for trustworthy environmental prediction and provides a novel foundation for actionable bloom management and policy decision support in data-limited inland water systems.

How to cite: Alamdari, N., Imtiaz, S. U., and Nasr Azadani, M.: Deep Learning for Trustworthy Prediction of Cyanobacterial Blooms across CONUS Inland Waters, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-3574, https://doi.org/10.5194/egusphere-egu26-3574, 2026.

EGU26-4683 | ECS | Posters on site | ESSI1.4

SinFusion-based Geological Model Augmentation and Well Data Integration 

Eunsil Park, Hong Lee, Junil Yoon, and Honggeun Jo

Stratigraphic forward modeling (SFM) is a geological modeling framework that simulates depositional processes in sedimentary systems (e.g., deep water, fluvial, delta), enabling the generation of stratigraphic architectures and reservoir property distributions. This approach is particularly effective in reproducing the realistic non-stationarity and geological heterogeneity of deep-water reservoirs, which are difficult to capture using conventional geostatistical methods such as two-point and multipoint statistics. However, SFM results are highly sensitive to small variations in initial geological input parameters, making the integration of observational data such as well logs and seismic data challenging and thereby limiting its application at an industrial scale.

In this study, we propose a novel geological model characterization framework that combines SFM with a generative artificial intelligence approach capable of achieving both high generation efficiency and robust geological realism (Fig. 1). First, an SFM-based geological model is constructed and then preprocessed to make it suitable for neural network training. A single-image diffusion model, SinFusion, is then applied to learn the geometric and property distributions of the geological model and to enable multiple equivalent generations. Furthermore, a well data integration strategy is developed using the trained SinFusion model. By infusing well data during the reverse diffusion process, the proposed method allows seamless conditioning on well data regardless of the number or spatial locations of wells. This enables immediate model updates when new well data become available in the field, with no further need for costly model retraining, ensuring high flexibility.
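
Conceptually, the infusion step can be sketched as RePaint-style conditioning (a simplified sketch under assumed interfaces; the toy stand-ins below replace the trained SinFusion reverse step):

```python
import torch

def conditioned_reverse_diffusion(x_T, well_vals, well_mask,
                                  denoise_step, noise_to_t, T=50):
    """After each reverse step, cells with well observations are overwritten
    by the well data noised to the current level t, so every intermediate
    sample stays consistent with the observations."""
    x = x_T
    for t in reversed(range(T)):
        x = denoise_step(x, t)
        x = torch.where(well_mask, noise_to_t(well_vals, t), x)
    return x

# Toy stand-ins so the sketch runs; in practice these come from the trained model.
denoise_step = lambda x, t: 0.99 * x
noise_to_t = lambda v, t: v + 0.01 * t * torch.randn_like(v)
sample = conditioned_reverse_diffusion(torch.randn(1, 1, 32, 32),
                                       torch.zeros(1, 1, 32, 32),
                                       torch.rand(1, 1, 32, 32) > 0.95,
                                       denoise_step, noise_to_t)
```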

The validity of the proposed method is evaluated through quantitative comparisons of spatial continuity, property distributions, and geometric pattern similarity with the original SFM model. The results demonstrate that the proposed method can efficiently generate multiple geological realizations and is well suited for ensemble-based uncertainty assessment. Moreover, the proposed method has the potential to expand the applicability of SFM toward industrial-scale geological modeling workflows.

Fig. 1. Overview of the SinFusion framework for geological model augmentation and well data integration based on stratigraphic forward modeling

This work was supported by Korea Gas Corporation (RD2025-0071).

 

How to cite: Park, E., Lee, H., Yoon, J., and Jo, H.: SinFusion-based Geological Model Augmentation and Well Data Integration, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-4683, https://doi.org/10.5194/egusphere-egu26-4683, 2026.

EGU26-4820 | ESSI1.4

From Rooftops to Ecosystem Services: Deep Learning–Driven Green Roof Potential Assessment in Bern, Switzerland

Development of effective urban climate adaptation and mitigation strategies requires comprehensive spatial information on rooftops and buildings. This information is important for assessing the ecosystem services provided by green and blue infrastructure in urban areas, especially for urban heat island (UHI) mitigation and energy conservation. While green roofs are widely acknowledged as a promising solution for enhancing thermal comfort in urban climates, most existing research tends to focus either on mapping current green rooftops or on identifying rooftops with the potential for green roof implementation.

This study presents a modified deep convolutional neural network-based rooftop classification framework, based on the Roofpedia framework originally created by the Urban Analytics Lab at the National University of Singapore (NUS). The model leverages high-resolution aerial imagery and incorporates rooftop slope to assess green roof suitability. The proposed model uses publicly available geospatial datasets from Swisstopo, such as aerial images from the SwissImage dataset, elevation data from the swissALTI3D digital terrain model, and building footprints from the swissTLM3D vector dataset.

When applied to Bern, Switzerland, the model labels rooftops into four categories: (1) existing green roofs, (2) rooftops suitable for green roof installation, (3) rooftops with solar panels, and (4) flat rooftops unsuitable for roof greening. To improve the accuracy and practicality of the classification, roof slope thresholds derived from the terrain model were integrated alongside spectral analysis to reflect real-world installation conditions.
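
The slope constraint can be illustrated with a minimal sketch (not the study's code; the grid spacing and threshold are assumptions): compute slope from a DEM tile and mask out roofs steeper than a greening threshold:

```python
import numpy as np

dem = np.random.rand(100, 100) * 5.0   # stand-in for swissALTI3D heights (m)
cell = 0.5                             # assumed grid spacing in metres

dz_dy, dz_dx = np.gradient(dem, cell)
slope_deg = np.degrees(np.arctan(np.hypot(dz_dx, dz_dy)))

MAX_GREENING_SLOPE = 10.0              # hypothetical threshold in degrees
suitable = slope_deg <= MAX_GREENING_SLOPE
```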

The model demonstrated high predictive performance, with a training loss of 0.0134, a mean Intersection over Union (mIoU) of 0.908, and a Matthews Correlation Coefficient (MCC) of 0.901. Validation metrics confirmed its robustness, with a validation loss of 0.0292, an mIoU of 0.843, and an MCC of 0.822. Compared with the original Roofpedia framework, the modified model shows significant improvements in multi-class rooftop classification, particularly in identifying realistic opportunities for green roof expansion.

The inclusion of a potential-green-rooftop class, combined with slope-based constraints, allows for a practical and realistic assessment of rooftop suitability for green roof installation. The modified Roofpedia model provides urban planners and decision makers with evidence-based information to support future green infrastructure deployment in Bern and other Swiss cities. Furthermore, the proposed framework is transferable and can be readily replicated in cities worldwide.

How to cite: Ko Ko, H. Y. and Rast, M.: From Rooftops to Ecosystem Services: Deep Learning–Driven Green Roof Potential Assessment in Bern, Switzerland, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-4820, https://doi.org/10.5194/egusphere-egu26-4820, 2026.

EGU26-5616 | ECS | Orals | ESSI1.4

A Deep Learning and Ensemble Decision Method to Identify the Urban–Rural Fringe: A Case Study in Kunming City, China 

Yuming Zhu, Tao Hu, Xiaoyu Li, Dahao Zhang, and Jian Peng

The urban-rural fringe (URF) has become the most dynamic area of land use transition and urban-rural factor flows during urbanization, forming a critical focus of sustainable land management. However, existing identification methods have not adequately captured the fine-scale textures and ambiguous transitional boundaries characterizing the URF. Taking Kunming City as the study region, this study developed a lightweight convolutional neural network (UF-Net) to extract spatial textures and boundary features, integrating it with eXtreme Gradient Boosting to construct a hybrid recognition framework. Multisource remote sensing and geospatial datasets were employed to delineate the URF from 2013 to 2023, and stage-specific driving mechanisms were examined using propensity score matching and binary logit models. The results showed that our framework achieved an overall accuracy of approximately 94% for both periods. Over the decade, built-up areas expanded markedly, and the spatial structure evolved from a single-core pattern characterized by fragmented peripheral development to a polycentric configuration with increasingly continuous URF zones. Chenggong and southern Guandu District emerged as major growth frontiers, while URF morphology shifted from linear to ring-shaped and cluster-type forms. Furthermore, the drivers of urban expansion transitioned from dominance by natural terrain and ecological suitability to a regime shaped primarily by human activities and transport accessibility. The proposed hybrid recognition framework, integrating deep feature extraction with ensemble-based classification, establishes a generalizable methodological path for interpreting URF evolution, providing analytical support for optimizing urban spatial structure and sustainable development strategies.

How to cite: Zhu, Y., Hu, T., Li, X., Zhang, D., and Peng, J.: A Deep Learning and Ensemble Decision Method to Identify the Urban–Rural Fringe: A Case Study in Kunming City, China, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-5616, https://doi.org/10.5194/egusphere-egu26-5616, 2026.

EGU26-7605 | ECS | Posters on site | ESSI1.4

Disentanglement of Structure and Texture Representations as a Method of Self-Supervision for Earth Observation Data: A Case Study on Cloud Type 

Mikolaj Czerkawski, Alistair Francis, Paul Borne--Pons, Barbara Bertozzi, and Jacqueline Campbell

Self-supervised learning has become a prominent technique for representation learning in Earth observation, largely due to the vast volumes of unlabelled data available in observation archives. However, apart from masked auto-encoding (MAE) techniques and contrastive learning, the diversity of geospatial self-supervised learning schemes in the existing literature remains limited.

This work explores the task of structure and texture disentanglement as an alternative route to self-supervised learning in the domain of Earth observation. Inspired by the Swapping Autoencoder architecture, this pipeline involves an encoder tailored to extract disentangled textural and structural information from an image and reconstruct it back to the image domain. Crucially, it includes an augmentation step that swaps texture and structure embeddings from different samples. This synthetic generation is driven by adversarial training, employing two discriminators: one responsible for assessing the likelihood of the image as a whole being real, and the other for assessing whether individual patches in the image are consistent with the source texture vector.
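
A highly simplified sketch of the swapping mechanism (not the authors' model; layer sizes are placeholders): an encoder yields a spatial structure code and a global texture vector, and decoding with swapped texture codes produces the hybrid images judged by the discriminators:

```python
import torch
import torch.nn as nn

class SwapAE(nn.Module):
    """Encoder yields a spatial structure code and a global texture vector;
    decoding with swapped texture codes produces the adversarially trained
    hybrid images."""
    def __init__(self, ch=32, tex_dim=64):
        super().__init__()
        self.struct_enc = nn.Conv2d(3, ch, 3, stride=2, padding=1)
        self.tex_enc = nn.Sequential(
            nn.Conv2d(3, tex_dim, 3, stride=2, padding=1),
            nn.AdaptiveAvgPool2d(1))            # global texture vector
        self.dec = nn.ConvTranspose2d(ch + tex_dim, 3, 4, stride=2, padding=1)

    def decode(self, structure, texture):
        tex = texture.expand(-1, -1, *structure.shape[2:])
        return self.dec(torch.cat([structure, tex], dim=1))

    def forward(self, a, b):
        sa, ta, tb = self.struct_enc(a), self.tex_enc(a), self.tex_enc(b)
        recon = self.decode(sa, ta)     # reconstruction of image a
        hybrid = self.decode(sa, tb)    # a's structure with b's texture
        return recon, hybrid            # hybrid is judged by the discriminators

model = SwapAE()
recon, hybrid = model(torch.randn(2, 3, 64, 64), torch.randn(2, 3, 64, 64))
```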

The texture embedding extracted from the image acts as a global vector describing the aggregated statistics of local features, while the structure embedding represents how these features are distributed in space. This preliminary work explores the potential of this approach in a domain where image labels are particularly scarce: cloud formation types in high-resolution optical imagery. The pipeline is tested on a large collection of cloudy Sentinel-2 images with the goal of identifying observational clusters of cloud formations that share similar properties, as part of the Clouds Decoded project. This work introduces a foundational architecture for this framework along with several methods of analysis that leverage the resulting deep neural network.

How to cite: Czerkawski, M., Francis, A., Borne--Pons, P., Bertozzi, B., and Campbell, J.: Disentanglement of Structure and Texture Representations as a Method of Self-Supervision for Earth Observation Data: A Case Study on Cloud Type, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-7605, https://doi.org/10.5194/egusphere-egu26-7605, 2026.

EGU26-7703 | ECS | Posters on site | ESSI1.4

UniRT: A Unified Framework for Time-Series Remote Sensing Image Reconstruction and Change Detection 

Haiyan Huang, Zhenfeng Shao, Chen Zhong, Duowang Zhu, and Wenlan Zhang

Remote sensing time series monitoring plays a vital role in capturing the dynamic evolution of the Earth’s surface. Recent deep-learning-based temporal change detection (TCD) methods have achieved remarkable progress on cloud-free optical image sequences. However, optical imagery is frequently affected by clouds and cloud shadows, resulting in pervasive and irregular data gaps that disrupt temporal continuity and sampling regularity. Consequently, current TCD approaches struggle to cope with highly dynamic surfaces and long-term or irregularly missing observations, often leading to inaccurate change detection results. To address these challenges, we propose UniRT, a unified framework that jointly performs time-series reconstruction and change detection, enabling robust monitoring from image sequences with missing observations. Specifically, a temporal-adaptive module is seamlessly embedded into a spatiotemporal learning framework while maintaining a lightweight architectural design. In addition, a time-aware decoder is introduced to better capture temporal dependencies and enhance robustness and generalization under irregular sampling conditions. Extensive experiments conducted on DynamicEarthNet and SpaceNet7 demonstrate that UniRT consistently outperforms state-of-the-art methods in temporal change detection, particularly in challenging scenarios characterized by severe data gaps and highly dynamic surface changes.
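
One generic way to make a decoder time-aware under irregular sampling (an illustrative sketch, not UniRT's actual module) is to encode each acquisition date so that uneven temporal gaps are visible to the network:

```python
import torch

def time_encoding(days, dim=16):
    """days: (T,) acquisition times in days since the first image; returns a
    sinusoidal code per acquisition that reflects the irregular spacing."""
    freqs = torch.exp(torch.arange(0, dim, 2, dtype=torch.float32) * (-4.0 / dim))
    ang = days[:, None] * freqs[None, :]
    return torch.cat([torch.sin(ang), torch.cos(ang)], dim=-1)   # (T, dim)

enc = time_encoding(torch.tensor([0.0, 5.0, 37.0, 120.0]))       # uneven gaps
```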

How to cite: Huang, H., Shao, Z., Zhong, C., Zhu, D., and Zhang, W.: UniRT: A Unified Framework for Time-Series Remote Sensing Image Reconstruction and Change Detection, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-7703, https://doi.org/10.5194/egusphere-egu26-7703, 2026.

EGU26-8216 | ESSI1.4

Deep Learning–Based Soil Classification from Sentinel-2 Multispectral Data

The research presented in this study addresses the subject of large-scale soil type classification. It is based on multispectral data from the Sentinel-2 satellite along with recent advances in deep learning for tabular data analysis. Initially, we created a soil dataset aligned with the World Reference Base (WRB) classification system by integrating Sentinel-2 spectral bands with various indices describing vegetation, exposed soil conditions, mineralogical composition, and moisture dynamics. The study assesses the performance of different classification models, as well as hybrid approaches based on ensemble learning techniques. We applied and assessed several techniques for data balancing and augmentation to address the uneven class distribution that often exists in soil datasets. The results show that combining multispectral satellite features with specific spectral indices and various learning methods offers an effective and scalable way to generate WRB-consistent soil maps from Sentinel-2 data.

Acknowledgment:

This work is supported by the project "Romanian Hub for Artificial Intelligence-HRIA", Smart Growth, Digitization and Financial Instruments Program, MySMIS no. 351416.

How to cite: Bacu, V.: Deep Learning–Based Soil Classification from Sentinel-2 Multispectral Data, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-8216, https://doi.org/10.5194/egusphere-egu26-8216, 2026.

EGU26-9172 | ECS | Orals | ESSI1.4

A Deep Learning and Physics-Based Multi-Satellite Fusion Framework for Spatiotemporal Super-Resolution Image Generation 

Dohee Han, Seokjin Hahn, Youngryel Ryu, Seungtaek Jeong, Jongsung Ha, and Jongmin Yeom

Although numerous satellites have been developed and launched recently, low Earth orbit satellites offer high spatial resolution but long revisit cycles, resulting in low temporal resolution, whereas geostationary satellites offer high temporal resolution but low spatial resolution. As a result, there are still limitations in reliably acquiring satellite images with high spatiotemporal resolution. To overcome these limitations, research on super-resolution fusion using various satellite images is underway. However, challenges such as data loss due to clouds, differences in revisit cycles between satellites, and sensor characteristic mismatches make it difficult to produce fused super-resolution images. Therefore, this study proposes a multi-satellite-based fusion framework that addresses these issues and reliably generates spatiotemporal super-resolution fusion images.

To this end, this study utilized various satellite images, including GK2A (high temporal frequency, 2 km resolution), MODIS (high temporal frequency, 500 m resolution), GOCI-II (high temporal frequency, 250 m resolution), Landsat-8 (30 m resolution), Sentinel-2 (10 m resolution), PlanetScope (2.8 m resolution), and KOMPSAT-3 (3 m resolution). Each satellite image underwent preprocessing steps, including geometric correction, radiometric correction, BRDF (Bidirectional Reflectance Distribution Function) correction, and normalization, to ensure spatial alignment and radiometric consistency.

Subsequently, a deep learning model based on DeepLabV3+ with a ResNet-101 backbone was used to generate cloud mask label data, creating mask labels for clouds and missing areas in the imagery. These labels were then used to apply a gap-filling technique to fill in the cloud-covered and missing regions. Finally, a step-by-step resolution-enhancement image fusion method, ordered by spatial resolution, was employed to produce a spatiotemporal super-resolution fused image.
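
The cloud-masking step can be sketched with an off-the-shelf stand-in (torchvision ships DeepLabV3 with a ResNet-101 backbone, not the V3+ variant used in the study; the class count and input are assumptions):

```python
import torch
from torchvision.models.segmentation import deeplabv3_resnet101

# Untrained network (recent torchvision API); 3 classes are an assumption:
# clear / cloud / missing.
model = deeplabv3_resnet101(weights=None, weights_backbone=None, num_classes=3)
model.eval()

scene = torch.randn(1, 3, 512, 512)    # stand-in for a normalised image
with torch.no_grad():
    logits = model(scene)["out"]       # (1, 3, 512, 512)
mask = logits.argmax(dim=1)            # per-pixel cloud/missing mask
```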

The final super-resolution fused image will be validated using spectral data collected from a ground observation tower located in Naju, Jeollanam-do, South Korea. The multi-satellite fusion framework proposed in this study can efficiently overcome the limitations of spatiotemporal resolution by utilizing deep learning and physics-based models during various processing stages. The fusion results are expected to be applicable in various remote sensing fields, such as detecting climate change and environmental variations.

Acknowledgements: This work was supported by the National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIT) (RS-2025-00515357).

How to cite: Han, D., Hahn, S., Ryu, Y., Jeong, S., Ha, J., and Yeom, J.: A Deep Learning and Physics-Based Multi-Satellite Fusion Framework for Spatiotemporal Super-Resolution Image Generation, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-9172, https://doi.org/10.5194/egusphere-egu26-9172, 2026.

EGU26-9267 | Orals | ESSI1.4

Predictive Modelling of Landslide Hotspots in Nainital using MT-InSAR and Deep Learning 

Anurag Basu, Onkar Dikshit, and Ashutosh Tiwari

Mountainous regions of the Kumaon Himalayas are particularly prone to landslides due to steep terrain, weak geological conditions, intense monsoon rainfall, and increasing human activity. In such environments, continuous ground-based monitoring is often difficult because of poor accessibility, dense vegetation cover, and frequent cloud conditions. Microwave remote sensing, especially satellite-based Synthetic Aperture Radar (SAR), offers a reliable, weather-independent means of monitoring surface deformation over large areas and long time periods.

This study applies an integrated multi-temporal InSAR (MT-InSAR) and deep learning framework to investigate surface deformation and landslide activity in the Nainital region, Kumaon Himalayas, India. Sentinel-1A SAR data (2020–2025) were processed on the ASF Vertex HyP3 cloud platform using the GAMMA Small Baseline Subset (SBAS) processing chain. The cloud-based workflow automates key interferometric steps, enabling efficient processing of multi-year SAR archives without the need for local high-performance computing facilities.

Time-series inversion and analysis were carried out using MintPy in a GPU-enabled OpenSARLab environment. Weighted least-squares inversion was applied to generate line-of-sight (LOS) deformation time-series and mean LOS velocity maps. In addition, ENU decomposition was performed, and the vertical (Up) component was used for subsequent analysis. The resulting five-year deformation record highlights marked spatial variability across the study area, reflecting deformation associated with slow-moving landslides, slope creep, and other forms of localized instability.

To focus on actively deforming areas, pixels were objectively selected using Otsu thresholding applied to long-term displacement metrics derived from the MT-InSAR time-series. This data-driven approach allowed stable and deforming areas to be separated without relying on subjective thresholds, capturing both known unstable slopes and newly emerging deformation zones. The selected high-deformation pixels were then used for short-term deformation forecasting using two models: a Long Short-Term Memory (LSTM) network and a Temporal Convolutional Network (TCN).
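
As an illustration of this selection step, a minimal sketch of Otsu thresholding applied to a long-term displacement metric (the synthetic displacement values are placeholders):

    import numpy as np
    from skimage.filters import threshold_otsu

    # Synthetic stand-in: cumulative displacement magnitude (mm) per coherent pixel,
    # a mix of stable ground and a smaller deforming population
    rng = np.random.default_rng(0)
    cum_disp = np.abs(np.concatenate([rng.normal(2, 1, 9000),
                                      rng.normal(40, 10, 1000)]))

    t = threshold_otsu(cum_disp)   # data-driven split, no manually chosen threshold
    active = cum_disp > t          # pixels retained for LSTM/TCN forecasting
    print(f"threshold = {t:.1f} mm, active pixels = {active.sum()}")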

Model performance was assessed using five-fold time-series cross-validation. The TCN model showed consistently better performance, achieving an R² of ~0.95 and an F1-score of ~0.97, compared to the LSTM model (R² ~0.93, F1 ~0.92), indicating improved representation of long-range temporal dependencies and non-linear deformation behaviour.

To examine the spatial evolution of instability, K-means clustering was applied to both five-year historical and twelve-month forecasted displacement time-series, producing deformation cluster maps for each period. Areas showing transitions from lower to higher deformation classes were identified as emerging landslide hotspots. The observed deformation patterns and hotspot distribution show strong agreement with previous MT-InSAR-based landslide studies and regional landslide inventories from the Himalayan region, providing independent validation of the proposed framework for landslide hazard assessment and risk management.
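
For illustration, a minimal sketch of clustering displacement time series with K-means and flagging pixels that move to a higher-deformation class; ordering clusters by end-of-period displacement is an assumption here, not necessarily the authors' exact rule:

    import numpy as np
    from sklearn.cluster import KMeans

    rng = np.random.default_rng(1)
    # rows = selected pixels, columns = displacement epochs (synthetic random walks)
    series = np.cumsum(rng.normal(0, 1, (500, 60)), axis=1)

    km_hist = KMeans(n_clusters=4, n_init=10, random_state=0).fit(series[:, :48])
    km_fcst = KMeans(n_clusters=4, n_init=10, random_state=0).fit(series[:, 48:])

    def rank_labels(km, X):
        # order cluster ids by mean end-of-period displacement magnitude
        order = np.argsort([np.abs(X[km.labels_ == k, -1]).mean()
                            for k in range(km.n_clusters)])
        remap = {k: r for r, k in enumerate(order)}
        return np.array([remap[l] for l in km.labels_])

    hist_rank = rank_labels(km_hist, series[:, :48])
    fcst_rank = rank_labels(km_fcst, series[:, 48:])
    emerging = fcst_rank > hist_rank   # moved into a higher-deformation class
    print("emerging hotspot pixels:", int(emerging.sum()))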

How to cite: Basu, A., Dikshit, O., and Tiwari, A.: Predictive Modelling of Landslide Hotspots in Nainital using MT-InSAR and Deep Learning, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-9267, https://doi.org/10.5194/egusphere-egu26-9267, 2026.

EGU26-10211 | Orals | ESSI1.4

Improving Generalization of Deep Learning–Based Ring Artefact Removal in X-ray Microtomography Imaging of Geomaterials 

Syadhisy Dhanapal, Benoit Cordonnier, and Francois Renard

High-resolution X-ray microtomography (XMT) imaging of rock deformation experiments at the micrometer scale provides valuable insights into the coupled evolution of pores, cracks, and fluid pathways (Noiriel & Renard, 2022). A critical step in XMT data processing is the removal of ring artefacts, which are attributed to malfunctioning detector components within the acquisition system (Vo et al., 2018). These artefacts appear as stripes in the raw acquired sinogram domain and as concentric circles in reconstructed images. Ring artefacts can adversely affect downstream analyses such as pore and fracture segmentation and digital volume correlation (DVC) (Mahdaviara et al., 2025). Advances in GPU computing and ML-based image processing have led the synchrotron community to explore deep learning architectures, including ResUNET (Fu et al., 2023) and attention-based variants (Zhang et al., 2022), to suppress ring artefacts. Most ML-based denoisers rely on single-domain, pixel-based loss functions, such as L1, L2, or the structural similarity index measure (SSIM), applied either in the sinogram or the reconstructed image domain.

This study investigates a dual-domain loss function that combines loss terms in the sinogram domain with terms in the corresponding Fast Fourier Transform amplitude (FFT amplitude) domain, aiming to improve the generalization of trained U-Net variants. Existing artefact-free XMT images of basalt were used to simulate stripe artefacts in the raw sinogram domain. Stripe artefact generation was controlled using three parameters: pixel thickness, amplitude, and number of stripes per sinogram. A total of 5,000 paired noisy and clean sinograms were generated and split into training, validation, and test datasets. Three U-Net-based architectures were evaluated: a baseline U-Net (baseUNET), a residual U-Net (ResUNET), and a residual U-Net with attention gates (AG-ResUNET). Models were trained for 100 epochs using the Adam optimiser, and their performance was assessed using peak signal-to-noise ratio (PSNR), structural similarity index (SSIM), and qualitative inspection of sinograms, reconstructed images, and FFT amplitude spectra. The study compares single- and dual-domain loss functions in terms of ring artefact suppression and generalization beyond the training data distribution, and discusses limitations related to the dynamic range of the training data and their implications for denoising experimental XMT datasets.
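
A minimal sketch of one possible dual-domain loss of this kind in PyTorch, combining an L1 term on sinogram pixels with an L1 term on FFT log-amplitudes (the log scaling and the weight w_fft are illustrative assumptions, not the exact formulation used here):

    import torch
    import torch.nn.functional as F

    def dual_domain_loss(pred_sino, clean_sino, w_fft=0.1):
        # Pixel-wise L1 in the sinogram domain
        pix = F.l1_loss(pred_sino, clean_sino)
        # L1 on FFT log-amplitudes: stripe artefacts are localized in frequency space
        amp_p = torch.log1p(torch.abs(torch.fft.fft2(pred_sino)))
        amp_c = torch.log1p(torch.abs(torch.fft.fft2(clean_sino)))
        return pix + w_fft * F.l1_loss(amp_p, amp_c)

    pred = torch.rand(4, 1, 256, 256, requires_grad=True)   # network output stand-in
    clean = torch.rand(4, 1, 256, 256)                      # artefact-free target
    dual_domain_loss(pred, clean).backward()                # differentiable end to end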

References

  • Fu, T., Wang, Y., Zhang, K., Zhang, J., Wang, S., Huang, W., Wang, Y., Yao, C., Zhou, C., & Qin, Y. (2023). Deep-learning-based ring artifact correction for tomographic reconstruction. Journal of Synchrotron Radiation, 30(3).
  • Mahdaviara, M., Mousavi, M., Rafiei, Y., Raoof, A., & Sharifi, M. (2025). Improving numerical fluid flow simulation by ring artifact removal in micro-CT images of porous media using attention autoencoder–decoders. Transport in Porous Media, 152, 57.
  • Noiriel, C., & Renard, F. (2022). Four-dimensional X-ray micro-tomography imaging of dynamic processes in geosciences. Comptes Rendus Géoscience, 354(G2), 255–280. https://doi.org/10.5802/crgeos.137
  • Vo, N. T., Atwood, R. C., & Drakopoulos, M. (2018). Superior techniques for eliminating ring artifacts in X-ray micro-tomography. Optics Express, 26(22), 28396. https://doi.org/10.1364/oe.26.028396
  • Zhang, J., Niu, Y., Shangguan, Z., Gong, W., & Cheng, Y. (2022). A novel denoising method for CT images based on U-net and multi-attention. Computers in Biology and Medicine, 152, 106387. https://doi.org/10.1016/j.compbiomed.2022.106387

How to cite: Dhanapal, S., Cordonnier, B., and Renard, F.: Improving Generalization of Deep Learning–Based Ring Artefact Removal in X-ray Microtomography Imaging of Geomaterials, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-10211, https://doi.org/10.5194/egusphere-egu26-10211, 2026.

EGU26-10786 | Posters on site | ESSI1.4

PAFM-GAN: Physics-Aware and Frequency-Regularized GAN for SAR-to-Optical Image Translation 

Linxin Wang, Yao Liu, Jinqi Zhao, and Zhong Lu

SAR-to-optical image translation (S2OIT) aims to transform the complex backscattering characteristics of Synthetic Aperture Radar (SAR) into more interpretable optical appearances. However, existing methods often suffer from over-smoothed structural details, generation of pseudo-textures caused by inconsistencies between generated textures and real optical images, and insufficient global consistency in complex scenes. To address these challenges, we propose a Physics-Aware and Frequency-Regularized Generative Adversarial Network (PAFM-GAN) for SAR-to-optical translation. Specifically, we extract local statistical and edge structural cues from SAR images and inject them into the generator as additional guidance, which enhances structural authenticity and mitigates the impact of speckle noise. To mitigate spectral misalignment and suppress high-frequency artifacts, we further transform both the generated and real optical images into the Fourier frequency domain and perform spectral distribution alignment between them. We also introduce a frequency-domain discriminator to suppress unrealistic high-frequency components, thereby effectively reducing spurious details in the synthesized results. In addition, to capture long-range dependencies in high-resolution scenarios with low computational overhead, we integrate a Mamba-based state space module (SSM) into the generator for efficient global context modeling, improving scene-level style coherence and overall consistency. Extensive experiments on the SAR2Opt, SEN1-2, and QXS-SAROPT datasets demonstrate that PAFM-GAN consistently outperforms representative SAR-to-optical baselines across five metrics, including PSNR, SSIM, FID, LPIPS, and FSIMc. In addition, the results of multiple ablation experiments validate the effectiveness of the proposed method.
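
As an illustration of the kind of local statistical and edge cues described above, a minimal sketch computing local mean/variance and Sobel edge magnitude as extra generator input channels (the window size and the specific statistics are assumptions, not the exact cues used by PAFM-GAN):

    import numpy as np
    from scipy.ndimage import uniform_filter, sobel

    def sar_guidance_channels(sar, size=7):
        # Local mean/variance and Sobel edge magnitude from a SAR amplitude image
        mean = uniform_filter(sar, size=size)
        var = uniform_filter(sar ** 2, size=size) - mean ** 2   # local variance
        edges = np.hypot(sobel(sar, axis=0), sobel(sar, axis=1))
        return np.stack([mean, var, edges])   # stacked as extra generator inputs

    sar = np.random.rand(256, 256).astype(np.float32)   # synthetic amplitude image
    print(sar_guidance_channels(sar).shape)             # (3, 256, 256)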

How to cite: Wang, L., Liu, Y., Zhao, J., and Lu, Z.: PAFM-GAN: Physics-Aware and Frequency-Regularized GAN for SAR-to-Optical Image Translation, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-10786, https://doi.org/10.5194/egusphere-egu26-10786, 2026.

EGU26-11947 | ECS | Posters on site | ESSI1.4

Automatic Detection and Segmentation of Methane Plumes in GHGSat Imagery  

Frédéric Piedboeuf, Marianne Girard, Dylan Jervis, Jason McKeever, and Joshua Sampson

GHGSat currently operates 14 methane satellites and has plans to expand the constellation further, acquiring images globally of facilities that could emit methane for monitoring and mitigation. The constellation produces almost 1,000 observations per day in which methane is detected, geolocated and quantified. It is impractical to rely on human inspection alone at such a large scale, and so automated solutions are required. However, automation must handle a highly complex classification problem (distinguishing small methane plumes from retrieval artifacts) while operating reliably at very high throughput.

Two common types of automation that help human operators are using machine learning models to detect methane and to propose segmentation masks. The first one helps reduce the total amount of data seen by human operators, and the second helps reduce the operator time spent per observation. While these types of automation are common in methane detection with coarse-resolution public satellites such as Sentinel-2 or EMIT, their applications to fine spectral and spatial resolution satellites have been more limited.  

To handle the growing amount of data, we develop transformer-based detection and segmentation models, which can assist operators in processing the observations. We present the models used and performance achieved in terms of precision and recall, both for detection and segmentation, as well as discuss future improvements to further diminish operator time.  

How to cite: Piedboeuf, F., Girard, M., Jervis, D., McKeever, J., and Sampson, J.: Automatic Detection and Segmentation of Methane Plumes in GHGSat Imagery , EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-11947, https://doi.org/10.5194/egusphere-egu26-11947, 2026.

EGU26-12575 | Posters on site | ESSI1.4

Stage-wise ConvLSTM Sequential Transfer Learning for Hierarchical Crop Type Mapping in Senegal’s Smallholder Systems 

Kidia K. Gelaye, Mamadou Adama Sarr, Murali Krishna Gumma, Pierre C. Sibiry Traore, Cyrille B.E. Bassene, Fama Mbemgue, and Janet M. Mutuku

Earth Observation (EO) data can support food-security decision making in Sub-Saharan Africa, yet operational crop-type mapping in dryland smallholder systems remains challenging under rainfed-season cloud cover, heterogeneous cropping calendars, and small, irregular fields. Model performance is further degraded by scarce and noisy labels, mixed-cropping and intercropping practices, and strong domain shift across agro-ecologies. A particularly consequential failure mode is confusion between cropped and fallow parcels, where vegetated fallows can mimic crop spectral–temporal signatures and bias cropland statistics and downstream indicators used for early warning, input targeting, and program planning.

We propose a hierarchical, stage-wise sequential transfer-learning framework built on Convolutional Recurrent Neural Networks (ConvRNNs/ConvLSTMs) to improve robustness in data-scarce smallholder landscapes. The approach learns reusable spatiotemporal representations in a coarse-to-fine curriculum and transfers them across tasks of increasing label granularity. Stage 1 produces a cropland mask by classifying cropland versus other land uses (explicitly including fallow), targeting the crop–fallow confusion that dominates errors in dryland settings. Stage 2 refines cropland into agronomic family groups (e.g., cereals, legumes, vegetables), preserving interpretable subclass structure that is often sufficient for operational monitoring when fine labels are sparse. Stage 3 resolves fine-grained crop types and mixed-dominant intercropping states. The ConvLSTM backbone is trained stage-wise: parameters learned at a coarser stage initialize the next stage, while stage-specific classification heads are optimized for the current hierarchy level.

The framework is demonstrated in Senegal using Planet NICFI monthly composites (~5 m; RGB+NIR) and in situ polygon labels collected during the 2020 and 2023 rainfed seasons. Training samples are built as ~0.5 ha image patches (14×14 pixels) extracted from interior points within polygons, with sampling density scaled by polygon area to better represent large fields while maintaining coverage of small parcels; a minimal sketch of this step is given below. The dataset (6,978 labeled polygons in 2020 and 5,827 in 2023) generates 18,380 and 18,378 patches for September and October 2020 (no August imagery), and 13,733/13,623/13,524 patches for August/September/October 2023. To address the severe long-tail imbalance typical of regional crop inventories, we combine offline quota-based corpus curation, online weighted sampling, and consolidation of ultra-rare fine-grained labels into an “OTHER” class at Stage 3 to stabilize training.

The staged framework is benchmarked against machine and deep learning baselines (Random Forest, XGBoost, CNN, and single-stage recurrent models) using macro-averaged metrics and precision–recall behavior, selecting operating points that favor higher precision for operational mapping. Results show robust cropland maps with stable accuracy under limited labels and small, irregular fields, while preserving subclass structure; cereals and legumes remain identifiable at Stage 2 (validation accuracy ≈ 0.59). At Stage 3, precision is highest for the major crops (Groundnut 0.83, Millet 0.72, Maize 0.69) and moderate for Cowpea (0.51) and Rice (0.42). Remaining errors are primarily driven by data imbalance, mixed-cropping systems, and spectral confusion, highlighting priority areas for improving long-tail supervision and intercropping representation.
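
A minimal sketch of the area-scaled patch extraction step referenced above (the quota parameters and the synthetic image are illustrative placeholders, not the exact values used in this work):

    import numpy as np

    def patch_quota(area_ha, base=1, per_ha=2, cap=20):
        # more patches for larger fields, capped to keep the corpus balanced
        return int(min(cap, base + per_ha * area_ha))

    def extract_patches(image, centers, size=14):
        # cut size x size windows around interior points of a polygon
        h = size // 2
        return np.stack([image[:, r - h:r + h, c - h:c + h] for r, c in centers])

    img = np.random.rand(4, 500, 500)    # 4-band (RGB+NIR) monthly composite stand-in
    centers = [(100, 100), (250, 300)]   # interior points of one polygon
    print(patch_quota(3.2), extract_patches(img, centers).shape)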

How to cite: Gelaye, K. K., Sarr, M. A., Gumma, M. K., Traore, P. C. S., Bassene, C. B. E., Mbemgue, F., and Mutuku, J. M.: Stage-wise ConvLSTM Sequential Transfer Learning for Hierarchical Crop Type Mapping in Senegal’s Smallholder Systems, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-12575, https://doi.org/10.5194/egusphere-egu26-12575, 2026.

EGU26-14810 | ECS | Posters on site | ESSI1.4

From conventional machine learning to Transformers: multi-model hindcasting of GRACE terrestrial water storage anomalies (TWSA) with multi-source validation 

Wasim Karam, Kivilcim Yüksel, Abdurrahman Gümüş, and Orhan Gündüz

GRACE and GRACE-FO satellite missions offer an observation-based perspective on terrestrial water storage anomalies (TWSA), which is valuable for assessing climate variability and human influence on large-scale water resources. In practice, however, the short duration of the GRACE record and its coarse spatial resolution make it difficult to build the long, spatially consistent storage records that are needed to study basin-scale responses to hydrologic extremes such as droughts and floods. To address this limitation, we develop a multi-model hindcasting framework that reconstructs monthly GRACE TWSA from hydro-climatic predictors and evaluates both predictive performance and hydrologic plausibility using independent evidence related to extremes.

We compare four models representing three methodological families: (i) tree-based machine learning (Extreme Gradient Boosting and Extra Trees Regressor), (ii) spatio-temporal deep learning (Convolutional LSTM), and (iii) an efficient Transformer architecture for long-sequence forecasting (Informer). All models are trained and tested over the observational GRACE period (2002–2025) using strict time-block splits, and then applied to hindcast historical monthly TWSA. Predictor covariates include precipitation, evapotranspiration, temperature, soil moisture, and land data assimilation–based storage components from GLDAS products (Noah and CLSM), enabling the models to learn storage persistence and hydro-climatic controls beyond what can be inferred from GRACE alone. The performance of the models is assessed using standard error and agreement metrics (RMSE and correlation) as well as hydrologically oriented measures (Nash–Sutcliffe Efficiency and Kling–Gupta Efficiency), with additional diagnostics targeting the representation of seasonal and inter-annual variability.
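
For reference, the two hydrologically oriented scores can be computed as follows; a minimal sketch on synthetic monthly series (note that the bias ratio in KGE is unstable for strictly zero-mean anomalies, so the synthetic series is offset here):

    import numpy as np

    def nse(sim, obs):
        # Nash-Sutcliffe Efficiency: 1 minus error variance over observed variance
        return 1 - np.sum((sim - obs) ** 2) / np.sum((obs - obs.mean()) ** 2)

    def kge(sim, obs):
        # Kling-Gupta Efficiency from correlation, variability ratio, and bias ratio
        r = np.corrcoef(sim, obs)[0, 1]
        alpha = sim.std() / obs.std()
        beta = sim.mean() / obs.mean()
        return 1 - np.sqrt((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2)

    rng = np.random.default_rng(0)
    obs = 200 + 100 * np.sin(np.linspace(0, 12 * np.pi, 240)) + rng.normal(0, 10, 240)
    sim = obs + rng.normal(0, 15, 240)   # synthetic hindcast stand-in
    print(f"NSE = {nse(sim, obs):.2f}, KGE = {kge(sim, obs):.2f}")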

The Transformer model shows the best performance on the testing periods, with an R² of 0.81, CC of 0.89, NSE of 0.86, KGE of 0.88, and RMSE of 61.3 mm, while the ETR shows the lowest skill, with an R² of 0.713, CC of 0.76, NSE of 0.68, and KGE of 0.73. To move beyond statistical agreement with GRACE, we evaluate physical reliability through multi-source validation across all major sub-basins of Türkiye using (1) groundwater table observations and (2) independent flood information, including a flood potential indicator and mapped flood extents where available. All model families capture key groundwater storage variability, while Informer generally provides the highest predictive skill and better preserves persistence and the seasonal cycle than ConvLSTM and the tabular learners. Periods of elevated reconstructed storage are consistently associated with the historical record of higher flood potential, while the lower extremes of the TWSA record identify historical droughts, supporting the hydrologic realism of the hindcast products. At the same time, the tree-based models, particularly XGBoost, remain attractive due to their low computational cost and their stronger ability to reproduce observed flood-extent spatial patterns in some basins, while maintaining extreme behavior comparable to the Transformer architecture.

Overall, the inter-comparison highlights practical trade-offs among accuracy, robustness to extremes, and computational efficiency, and provides guidance for scalable GRACE TWSA hindcasting on cloud platforms. The validation approach is transferable and supports the use of reconstructed storage fields for drought–flood assessment and basin-scale water resources analysis.

How to cite: Karam, W., Yüksel, K., Gümüş, A., and Gündüz, O.: From conventional machine learning to Transformers: multi-model hindcasting of GRACE terrestrial water storage anomalies (TWSA) with multi-source validation, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-14810, https://doi.org/10.5194/egusphere-egu26-14810, 2026.

EGU26-15741 | Orals | ESSI1.4

3D HS orebody delineation integrating CatBoost modelling and Transformer-based evidence fusion: A case study from Čukaru Peki, Serbia 

Gongwen Wang, Guoqing Zhang, Zhongzheng Wang, Shuren Yang, and Yiran Wang

High-sulfidation (HS) orebodies are typically characterised by advanced argillic alteration and strong structural control. However, their 3D delineation remains challenging due to the inherent complexity of alteration facies and the limitations of discrete drillhole observations. We present a comprehensive 3D machine-learning workflow to delineate HS orebodies at the Čukaru Peki deposit (Eastern Serbia) by integrating drill-core SWIR spectroscopy with geological and geochemical constraints.

Alteration mineralogy was characterised from SWIR spectra using The Spectral Geologist (TSG), extracting diagnostic sulfate–clay signatures (e.g., alunite-group minerals) and spectral scalars (e.g., ~2.20 μm absorption depth and white-mica crystallinity). To bridge the gap between discrete samples and a continuous volume, we constructed voxel-scale attribute fields using CatBoost regression. Unlike conventional distance-based interpolation, CatBoost learns nonlinear spatial dependencies conditioned on coordinates and geological context (lithology, alteration facies, and fault proximity), enabling data-driven 3D inference across the entire modelling volume.

Subsequently, a Transformer encoder was employed for voxel-wise evidence fusion on the stacked 3D attribute layers. The model captures the nonlinear mapping of "multi-evidence interaction → HS mineralisation probability" to output a probabilistic targeting volume. The model was trained on labelled exploration drilling data (604 samples) and rigorously validated against an independent in-mine dataset (2,850 samples). Performance evaluation using confusion matrices and ROC curves consistently suggests that sulfur enrichment, alteration intensity, and structural proximity jointly govern HS distribution. This approach provides a robust, interpretable basis for 3D orebody modelling and drill targeting in complex porphyry–epithermal systems.
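
As an illustration of the voxel-scale attribute regression step, a minimal sketch of a CatBoost regressor conditioned on coordinates and categorical geological context (all feature names and values are synthetic placeholders, not the deposit's actual data):

    import numpy as np
    import pandas as pd
    from catboost import CatBoostRegressor

    rng = np.random.default_rng(0)
    n = 2000
    voxels = pd.DataFrame({
        "x": rng.random(n), "y": rng.random(n), "z": rng.random(n),  # voxel coordinates
        "lithology": rng.choice(["andesite", "breccia"], n),
        "alteration": rng.choice(["alunite", "dickite", "sericite"], n),
        "fault_dist": rng.random(n),                                  # proximity to faults
    })
    target = rng.random(n)   # synthetic stand-in for a SWIR scalar at sampled voxels

    model = CatBoostRegressor(iterations=300, depth=6, verbose=0)
    model.fit(voxels, target, cat_features=["lithology", "alteration"])
    print(model.predict(voxels.head()))   # attribute inference for unsampled voxels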

How to cite: Wang, G., Zhang, G., Wang, Z., Yang, S., and Wang, Y.: 3D HS orebody delineation integrating CatBoost modelling and Transformer-based evidence fusion: A case study from Čukaru Peki, Serbia, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-15741, https://doi.org/10.5194/egusphere-egu26-15741, 2026.

EGU26-16478 | Orals | ESSI1.4

GLSNet: State-Space-Enhanced Open-Pit Mine Detection With Global-Local Information Fusion 

Zhangjie Chen, Yi Zheng, Dai Yao, and Jinqi Zhao

Open-pit mines, typical land-surface features shaped by intensive human activities, require rapid identification of their spatial distribution for effective mineral resource supervision, ecological disturbance assessment, and land inspection. Optical remote sensing imagery, with its wide coverage, convenient acquisition, and rich spatial details and textures, provides intuitive morphological and contextual cues for open-pit mine identification and is therefore widely employed in routine monitoring and rapid assessment. Nevertheless, open-pit mines often bear strong visual similarities to quarries, bare land, construction-disturbed zones, and waste dumps. Meanwhile, slender structures (e.g., pit boundaries, bench slopes, and haul roads) tend to be smoothed out in multi-scale representations, which makes it challenging to balance global shape characterization with precise local boundary localization.

To address these issues, we propose GLSNet (Global-Local State-space Network), a feature-enhancement framework for open-pit mine detection consisting of three synergistic modules. First, an Adaptive Scale-aware Spatial Pyramid Pooling Fast (A-SPPF) module is introduced to adaptively select effective contextual ranges, suppress confusing background interference, and improve scale robustness. Second, a Low-resolution State-Space Modeling (LS-SSM) module is designed to efficiently model long-range dependencies and scene structural relationships, enhancing discrimination between open-pit mines and visually similar land-surface units. Third, a Scale-adaptive Global–Local Fusion (SGF) module is proposed to jointly strengthen global structural constraints and local boundary details, thereby balancing holistic morphology representation and key boundary localization, and improving detection stability and cross-region generalization.

We evaluate our method on the public Open Pit Mine Object Detection Dataset and compare it with Faster R-CNN, YOLOv5, YOLOv8, YOLOv10, RTMDet, RT-DETR, DEIM, and Mamba-YOLO. Results demonstrate that GLSNet achieves superior overall detection performance, with particularly notable advantages in resisting background-induced confusion under complex conditions and in recognizing small-scale targets, while maintaining high inference efficiency, thereby validating the effectiveness and synergy of the proposed modules.

Keywords: open-pit mine detection; state-space models (SSM); multi-scale features; global–local fusion.

How to cite: Chen, Z., Zheng, Y., Yao, D., and Zhao, J.: GLSNet: State-Space-Enhanced Open-Pit Mine Detection With Global-Local Information Fusion, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-16478, https://doi.org/10.5194/egusphere-egu26-16478, 2026.

EGU26-16765 | ECS | Orals | ESSI1.4

Basis Functions Representation for Deep Learning Models 

Brian O'Sullivan and Barry Coonan

Machine learning has seen widespread adoption across the geosciences. In particular, deep learning methods have proven effective for producing gridded datasets of climate parameters. Convolutional neural networks are commonly used, but their performance can be limited by the availability and structure of data, especially for sparse or irregularly sampled climate observations. Graph neural networks can handle irregular spatio-temporal data, but their reliance on local interactions restricts their ability to capture large-scale climate processes.

An alternative approach is DeepKriging, originally proposed by Chen et al., which embeds the spatial domain using basis functions centered at knot points across the region of interest. By using these basis functions as input features for a neural network, DeepKriging provides an efficient and flexible representation of both spatial and temporal domains, making it suitable for irregular data and capable of capturing both large-scale and local effects. However, DeepKriging requires basis functions to be manually defined before model training, which can require extensive work from the practitioner to fine-tune the model. This also limits the model’s ability to adapt to varying spatio-temporal patterns.
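
For illustration, a minimal sketch of the kind of basis-function embedding DeepKriging uses, with Gaussian radial basis functions centred on a fixed knot grid (the bandwidth and knot layout are placeholders; in the extensions proposed below these would be updated during training):

    import numpy as np

    def rbf_features(coords, knots, bandwidth=0.1):
        # One Gaussian basis function per knot; the feature matrix replaces raw lon/lat
        d2 = ((coords[:, None, :] - knots[None, :, :]) ** 2).sum(-1)
        return np.exp(-0.5 * d2 / bandwidth ** 2)

    # 10 x 10 grid of fixed knots on the unit square
    g = np.linspace(0, 1, 10)
    knots = np.stack(np.meshgrid(g, g), -1).reshape(-1, 2)
    stations = np.random.rand(500, 2)            # irregular observation locations
    print(rbf_features(stations, knots).shape)   # (500, 100) network input features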

Here, we propose several extensions to DeepKriging, primarily by allowing basis functions to be updated throughout model training. The resulting model dynamically adapts to diverse spatio-temporal patterns while converging on a basis function representation that is optimal for the current data. We further improve the flexibility of the spatial embedding through a mesh generated via constrained Delaunay triangulation. This approach is applied to multiple climate variables, including precipitation and wind data for Ireland, demonstrating an improved performance compared with the original DeepKriging as well as several state-of-the-art deep learning and geostatistical gridding methods.

Finally, we also show how basis function representations are particularly well suited for datasets with limited availability, such as sparsely sampled climate parameters like relative humidity or soil moisture. This flexibility can be leveraged across a range of machine learning frameworks, including transfer learning with DeepKriging models or more lightweight algorithms such as Random Forests and XGBoost.

How to cite: O'Sullivan, B. and Coonan, B.: Basis Functions Representation for Deep Learning Models, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-16765, https://doi.org/10.5194/egusphere-egu26-16765, 2026.

EGU26-17458 | ECS | Orals | ESSI1.4

An adaptive window and resolution-aware detection framework for dense small-object mapping from very-high-resolution satellite imagery 

Zeyu Xu, Zijing Wu, Isla Duporge, Stephen Lee, and Tiejun Wang

Detecting wildebeest from very-high-resolution (VHR) satellite imagery enables large-area population monitoring in a single acquisition, avoiding aircraft-induced disturbance and reducing sampling bias caused by transect-based surveys. However, wildebeest appear as extremely small objects in satellite images, and direct application of classical object detectors (e.g., YOLO-style detectors) often yields poor performance. In particular, high-density aggregation areas suffer from severe missed detections due to scale mismatch and limitations in post-processing for densely packed small objects.

To address these challenges, we develop a targeted detection solution that integrates (1) an adaptive sliding-window strategy to better capture local context under varying density conditions, (2) resolution–detector adaptation to mitigate scale mismatch between object size and detector design, and (3) improved post-processing modules, including an enhanced non-maximum suppression (NMS) tailored for dense small-object scenarios. We evaluate the proposed framework using WorldView-2 and WorldView-3 imagery over the Serengeti acquired in 2022 and 2023. The overall F1-score improves from 0.727 to 0.770 in 2022 and from 0.682 to 0.756 in 2023. Notably, in high-density areas in 2022, the F1-score increases from 0.330 to 0.821, demonstrating that our approach effectively reduces missed detections in dense small-object scenarios that commonly lead to substantial omissions in traditional pipelines.
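
For illustration, a minimal sketch of the greedy IoU-based NMS baseline that such post-processing builds on; the enhanced variant described above modifies this baseline for densely packed objects, and the threshold value here is an assumption:

    import numpy as np

    def nms(boxes, scores, iou_thr=0.4):
        # Greedy IoU-based suppression; boxes are (x1, y1, x2, y2)
        area = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
        order = scores.argsort()[::-1]
        keep = []
        while order.size:
            i = order[0]
            keep.append(int(i))
            rest = order[1:]
            xx1 = np.maximum(boxes[i, 0], boxes[rest, 0])
            yy1 = np.maximum(boxes[i, 1], boxes[rest, 1])
            xx2 = np.minimum(boxes[i, 2], boxes[rest, 2])
            yy2 = np.minimum(boxes[i, 3], boxes[rest, 3])
            inter = np.clip(xx2 - xx1, 0, None) * np.clip(yy2 - yy1, 0, None)
            iou = inter / (area[i] + area[rest] - inter)
            order = rest[iou < iou_thr]   # for crowded animals, tuning this is critical
        return keep

    boxes = np.array([[0, 0, 4, 4], [0.5, 0.5, 4.5, 4.5], [10, 10, 14, 14]], float)
    print(nms(boxes, np.array([0.9, 0.8, 0.7])))   # -> [0, 2]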

Beyond wildebeest monitoring, our results highlight a generalizable pathway for adapting classical detectors to dense small-object detection in VHR satellite imagery, where objects are tiny and crowded. 

How to cite: Xu, Z., Wu, Z., Duporge, I., Lee, S., and Wang, T.: An adaptive window and resolution-aware detection framework for dense small-object mapping from very-high-resolution satellite imagery, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-17458, https://doi.org/10.5194/egusphere-egu26-17458, 2026.

EGU26-17651 | ECS | Orals | ESSI1.4

FSG-Net: Frequency-Spatial Synergistic Gated Network for High-Resolution Remote Sensing Change Detection 

Zhongxiang Xie, Shuangxi Miao, Yuhan Jiang, Zhewei Zhang, Jing Yao, Xuecao Li, Jianxi Huang, and Pedram Ghamisi

Change detection from high-resolution remote sensing images is a cornerstone of Earth observation applications, yet its efficacy is often compromised by two critical challenges. First, false alarms are prevalent as models misinterpret radiometric variations from temporal shifts (e.g., illumination, season) as genuine changes. Second, a non-negligible semantic gap between deep abstract features and shallow detail-rich features tends to obstruct their effective fusion, culminating in poorly delineated boundaries. To address these issues, we propose the Frequency-Spatial Synergistic Gated Network (FSG-Net), a novel paradigm that aims to systematically disentangle semantic changes from nuisance variations. Specifically, FSG-Net first operates in the frequency domain, where a Discrepancy-Aware Wavelet Interaction Module (DAWIM) adaptively mitigates pseudo-changes by discerningly processing different frequency components. Subsequently, the refined features are enhanced in the spatial domain by a Synergistic Temporal-Spatial Attention Module (STSAM), which amplifies the saliency of genuine change regions. To finally bridge the semantic gap, a Lightweight Gated Fusion Unit (LGFU) leverages high-level semantics to selectively gate and integrate crucial details from shallow layers. Comprehensive experiments on the CDD, GZ-CD, and LEVIR-CD benchmarks validate the superiority of FSG-Net, establishing a new state-of-the-art with F1-scores of 94.16%, 89.51%, and 91.27%, respectively. The code will be made available at https://github.com/zxXie-Air/FSG-Net.

How to cite: Xie, Z., Miao, S., Jiang, Y., Zhang, Z., Yao, J., Li, X., Huang, J., and Ghamisi, P.: FSG-Net: Frequency-Spatial Synergistic Gated Network for High-Resolution Remote Sensing Change Detection, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-17651, https://doi.org/10.5194/egusphere-egu26-17651, 2026.

EGU26-2220 | ECS | Posters on site | ITS1.6/ESSI1.6

Data-efficient enhanced Pix2Geomodel.v2 for complex facies settings 

Abdulrahman Al-Fakih, Sherif Hanafy, Nabil Saraih, Ardiansyah Koeshidayatullah, and SanLinn Kaka

Reservoir modelling in heterogeneous carbonate systems is often constrained by sparse well control and labor-intensive interpretation, which increases uncertainty when extrapolating between wells. We present an enhanced Pix2Geomodel.v2 workflow that reframes facies and petrophysical modelling as paired image-to-image translation. Facies and petrophysical properties are exported from a reference reservoir model, converted into paired 2D training images, and used to train a Pix2Pix-style conditional generative adversarial network (cGAN). The architecture couples a U-Net generator with a PatchGAN discriminator, enabling the model to learn spatial relationships directly from examples. To reduce data requirements while retaining geological heterogeneity, the workflow operates on a streamlined grid of 54 vertical layers and targets complex facies distributions. Preliminary results show stable training and predictions that reproduce the main geological patterns of the reference data. In facies-to-property translation, the network learns meaningful mappings to porosity, permeability, and volume of shale.
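
As an illustration of the discriminator side of this architecture, a minimal sketch of a 70×70 PatchGAN in PyTorch (the channel counts and the two-channel input pairing are assumptions, not the exact Pix2Geomodel.v2 configuration):

    import torch
    import torch.nn as nn

    def block(c_in, c_out, norm=True):
        layers = [nn.Conv2d(c_in, c_out, kernel_size=4, stride=2, padding=1)]
        if norm:
            layers.append(nn.InstanceNorm2d(c_out))
        layers.append(nn.LeakyReLU(0.2))
        return layers

    # PatchGAN: classifies overlapping patches as real/fake instead of whole images,
    # which pushes the generator toward locally realistic textures
    patch_disc = nn.Sequential(
        *block(2, 64, norm=False),    # input: facies slice + property slice, stacked
        *block(64, 128),
        *block(128, 256),
        nn.Conv2d(256, 1, kernel_size=4, padding=1),   # one logit per patch
    )

    pair = torch.rand(1, 2, 256, 256)   # a paired 2D slice (synthetic stand-in)
    print(patch_disc(pair).shape)       # grid of patch logits, here (1, 1, 31, 31)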

How to cite: Al-Fakih, A., Hanafy, S., Saraih, N., Koeshidayatullah, A., and Kaka, S.: Data-efficient enhanced Pix2Geomodel.v2 for complex facies settings, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-2220, https://doi.org/10.5194/egusphere-egu26-2220, 2026.

EGU26-2222 | Posters on site | ITS1.6/ESSI1.6

Bidirectional translation + spatial continuity validation 

SanLinn Kaka, Abdulrahman Al-Fakih, Nabil Saraih, Ardiansyah Koeshidayatullah, and Sherif Hanafy

Capturing cross-property correlations while preserving spatial continuity is essential for reliable reservoir characterization, especially in heterogeneous reservoirs where facies architecture controls petrophysical variability. In this study, we evaluate Pix2Geomodel.v2 as a bidirectional image-to-image translation framework that learns mappings between facies and petrophysical properties using paired 2D slices exported from a reference reservoir model. To reduce data demands while maintaining geological complexity, the workflow operates on a streamlined grid of 54 vertical layers, enabling efficient training and rapid experimentation without removing key stratigraphic and facies patterns.

The approach is based on a conditional generative adversarial learning strategy. A U-Net generator is trained to synthesize target facies or property maps from input images, while a PatchGAN discriminator encourages locally realistic textures and geologically plausible transitions. The paired-slice formulation allows the model to learn both large-scale structural organization and fine-scale heterogeneity directly from examples.

We investigate two complementary directions: (i) facies-to-property translation, where facies maps are used to predict continuous property fields such as porosity and permeability, and (ii) property-to-facies translation, where petrophysical images are used to reconstruct discrete facies distributions. Beyond conventional forward mapping, the reverse translation experiments are particularly informative because they test whether the model captures meaningful cross-property dependencies rather than superficial patterns. The reconstructed facies maps recover coherent large-scale facies trends and geologically consistent connectivity, indicating that the learned representation encodes relationships between depositional architecture and petrophysical response. Spatial realism is further examined using experimental variograms, providing a continuity-based check that generated outputs qualitatively align with the reference model in terms of spatial correlation structure.

Overall, the results suggest a data-efficient route to robust forward and reverse translations that can support faster reservoir model prototyping, property population guided by facies, and consistency checking between facies and petrophysical interpretations.
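
For illustration, a minimal sketch of the experimental (isotropic) semivariogram used for such continuity checks (the lag bins and synthetic data are placeholders):

    import numpy as np

    def experimental_variogram(coords, values, lags, tol):
        # gamma(h): half the mean squared difference over pairs in each lag bin
        d = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
        sq = (values[:, None] - values[None, :]) ** 2
        return [0.5 * sq[(d > h - tol) & (d <= h + tol)].mean() for h in lags]

    rng = np.random.default_rng(0)
    coords = rng.random((300, 2)) * 100   # cell locations in one 2D slice
    values = np.sin(coords[:, 0] / 15) + rng.normal(0, 0.2, 300)   # e.g., porosity
    print(np.round(experimental_variogram(coords, values, [5, 10, 20, 40], 2.5), 3))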

How to cite: Kaka, S., Al-Fakih, A., Saraih, N., Koeshidayatullah, A., and Hanafy, S.: Bidirectional translation + spatial continuity validation, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-2222, https://doi.org/10.5194/egusphere-egu26-2222, 2026.

Data generated in the field of geoscience has unique properties characterized by high complexity, sparsity, and site-specific variability. Owing to these characteristics, applying general artificial intelligence frameworks and achieving model generalization in geoscience remains challenging. In this work, we introduce the KIGAM GeoAI Platform, an integrated AI environment designed to bridge the gap between advancing AI technology and practical geoscience research. The platform supports the entire research workflow through a user-friendly, web-based interface, systematically covering the essential stages: data uploading, preprocessing, model development, testing, validation, and the final deployment of analytical applications. By providing a centralized online environment for collaborative research, the platform aims to reduce technical entry barriers for geoscientists who may not be AI experts, while establishing a robust foundation for data-driven cooperation. We plan to continue improving and scaling this platform to ensure it remains a stable, accessible, and high-performance tool for both domestic and international geoscience communities. Through these efforts, the KIGAM GeoAI Platform is expected to accelerate digital transformation and foster a more integrated global research ecosystem in the field of geoscience.

How to cite: Kwon, J.:  Innovate with Ease: Introducing the KIGAM GeoAI Platform, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-2290, https://doi.org/10.5194/egusphere-egu26-2290, 2026.

EGU26-2380 | ECS | Posters on site | ITS1.6/ESSI1.6

The application of artificial intelligence in fault tracking on 3D seismic data – A case study from Drmno Basin (SE Serbia) 

Anastasia Ninić, Dejan Radivojević, and Dragana Đurić

Artificial intelligence (AI) tools increasingly enhance the efficiency and consistency of seismic interpretation, particularly in structurally complex areas or areas where data quality is reduced by acquisition limitations. As a result, interpretations can become difficult and time-consuming, especially in the context of structural interpretation and fault tracking. To evaluate the performance of AI-based fault detection, we applied Geoplat AI software to a 3D seismic volume from the Drmno Basin, located at the southeastern margin of the Pannonian SuperBasin in Serbia.
A conventional structural interpretation was first performed by mapping the major fault systems, then the minor fault systems, generating fault sticks and polygons for all visible faults and developing a structural model to illustrate the basin's opening and evolution. Subsequently, AI-based workflows were applied to enhance the quality of the seismic data. This involved removing noise, restoring reflections, highlighting fault zones, and applying smoothing filters. The final step was the use of a fault-tracking tool that segments the seismic data, recognizes fault zones, traces them, identifies structural patterns, and calculates a probability field. The AI-derived fault interpretation was then compared with the manual interpretation.

The results indicate that the Drmno Basin developed under an extensional tectonic regime during the Early Miocene, which formed the large Morava detachment fault and created accommodation space in the basin. The basin itself has a complex syn-rift architecture, with many synthetic and few antithetic faults, oriented east–west. During the rift climax, the dominant fault systems remained consistent, with most syn-rift structures continuing to accommodate the subsidence driven by the Morava detachment. A shift in tectonic conditions in the post-rift stage led to the formation of systems of parallel faults in the younger sediments, accommodating strike-slip movements in a compressional stress field. The younger structures are dominantly oriented north–south or correspond to reactivated older faults.

The AI tool effectively interpreted fault systems in the younger geological units, benefiting from higher data quality, and clearly indicated younger fault systems with a high level of certainty. However, in the lower part of the seismic cube, the basement structures remain unclear or unrecognized. Reactivated fault surfaces and a significant fault zone are evident in the interpretation. In areas with low-quality seismic data, the AI tool struggled to trace faults accurately, resulting in geologically inconsistent fault patterns.

Overall, the AI-based 3D fault-tracking tool proved effective in resolving the main structural framework of the basin. The dominant fault directions are clearly identifiable, and the main geological structures have been mapped with reasonable precision. The AI-supported interpretation successfully captures the main structural trends and provides a solid basis for evaluating the tectonic evolution. This case study demonstrates the potential of AI to support structural interpretation and tectonic analysis of complex sedimentary basins.

How to cite: Ninić, A., Radivojević, D., and Đurić, D.: The application of artificial intelligence in fault tracking on 3D seismic data – A case study from Drmno Basin (SE Serbia), EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-2380, https://doi.org/10.5194/egusphere-egu26-2380, 2026.

EGU26-2819 | ECS | Posters on site | ITS1.6/ESSI1.6

Climate Service Recipes: automatic multi-hazard climate information workflow generation using agentic Large Language Models (LLMs) and knowledge graphs 

Anrijs Abele, Hailun Xie, Arjun Biswas, Hang Dong, Fai Fung, and Hywel Williams

Climate Service Recipes (HACID-CSR) is an agentic system designed to assist providers of climate services in developing their advice for a wide range of clients. HACID-CSR helps providers navigate a large and ever-increasing corpus of knowledge in a field without established standards and with limited access to scientific experts. It automatically generates detailed workflows (or “recipes”) by leveraging both a large language model’s internal reasoning and contextual knowledge from a domain knowledge graph for climate services (CS-DKG). The CS-DKG is an expert-curated ontology of climate service concepts with mapped relationships between climate variables, emission scenarios, indices, hazards, sectors, and key datasets (CORDEX, CMIP5, UKCP18), built as part of the Horizon Europe-funded HACID project (Hybrid Human Artificial Collective Intelligence in Open-Ended Decision Making).

The HACID-CSR architecture consists of a memory-enabled supervisor agent orchestrating multiple specialised agents. A planning agent first proposes an initial workflow outline, and a preliminary recipe agent uses only the LLM’s knowledge to draft answers to key workflow steps. The system then engages a knowledge graph retrieval sequence: a class selection agent identifies relevant classes in the CS-DKG, an instance selection agent finds specific instances (entries) highly relevant to the query within those classes following a two-stage selection process (semantic-similarity pre-selection followed by LLM-based refinement), and a subgraph extraction agent retrieves the corresponding subgraph of related knowledge entities. Next, a recipe generation agent creates each step of the workflow by combining the LLM’s reasoning with the retrieved graph context using graph retrieval-augmented generation (GraphRAG). Finally, a recipe refinement agent compares the preliminary LLM-only solution with the knowledge-enhanced solution and refines the output, yielding a diverse and context-aware workflow.
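
Purely as a schematic stand-in for the agent sequence described above (every function, prompt, and data structure here is hypothetical and greatly simplified; this is not the HACID-CSR API):

    def generate_recipe(query, kg, llm):
        # kg: dict mapping KG class -> instance descriptions (stand-in for the CS-DKG)
        outline = llm("plan", query)                    # planning agent
        draft = llm("draft", outline)                   # preliminary LLM-only recipe
        classes = [c for c in kg                        # class selection agent
                   if llm("relevant?", f"{c} | {query}") == "yes"]
        context = [i for c in classes for i in kg[c]]   # instance + subgraph retrieval
        grounded = llm("generate",                      # GraphRAG recipe generation
                       f"{outline} | context: {context}")
        return llm("refine", f"{draft} | {grounded}")   # refinement agent

    dummy_llm = lambda role, text: "yes" if role == "relevant?" else f"[{role}] {text[:50]}"
    print(generate_recipe("heat stress advice for viticulture",
                          {"hazard": ["heatwave"], "sector": ["agriculture"]}, dummy_llm))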

By using this multi-agent approach, HACID-CSR increases the diversity of solutions and fills the knowledge gap between climate information and domain specific applications, helping experts to identify suitable methodologies and datasets. The resulting workflows are more traceable and transparent, improving user trust compared to answers from a general-purpose chatbot. We have also developed a bespoke automatic evaluation method to complement human expert validation of the generated recipes. We highlight the potential of the HACID-CSR approach for multi-hazard climate service design, and discuss remaining challenges and opportunities for further refinement of this agentic LLM-based system.

How to cite: Abele, A., Xie, H., Biswas, A., Dong, H., Fung, F., and Williams, H.: Climate Service Recipes: automatic multi-hazard climate information workflow generation using agentic Large Language Models (LLMs) and knowledge graphs, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-2819, https://doi.org/10.5194/egusphere-egu26-2819, 2026.

EGU26-3316 | Orals | ITS1.6/ESSI1.6

A Living AI Platform for the Earth System Science 

Özge Kart Tokmak, Levke Caesar, and Boris Sakschewski

Earth system science relies on the integration of knowledge from many branches of geoscience, including climate dynamics, hydrology, ecology, land use and biogeochemical cycles. However, the scientific literature informing these domains has become vast and increasingly difficult to navigate due to its rapid development and disciplinary spread. This complexity makes it difficult to maintain an integrated overview of relevant findings and to identify scientific connections in a systematic manner. Recent advances in generative artificial intelligence (AI) and large language models (LLMs) provide opportunities to support these tasks, particularly when combined with retrieval methods and transparent source attribution.

Here we propose a retrieval-augmented AI platform designed to assist scientific knowledge integration in Earth system science. The platform is conceived as a living system, built on a continuously expanding and updateable knowledge base that aggregates scholarly literature from major scientific databases. User queries initiate targeted retrieval of relevant documents followed by the generation of concise, source-linked summaries using locally hosted open-weighted LLMs. By explicitly grounding outputs in retrieved literature, the platform alleviates the need for manual screening and limits hallucination risks that currently constrain the use of general-purpose LLMs in geoscientific research.
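
As an illustration of the retrieval step, a minimal sketch of cosine-similarity search over precomputed document embeddings (the embedding dimension and corpus size are placeholders; the embedding model itself is out of scope here):

    import numpy as np

    def retrieve(query_vec, doc_vecs, k=3):
        # cosine similarity between the query and every document embedding
        sims = doc_vecs @ query_vec / (
            np.linalg.norm(doc_vecs, axis=1) * np.linalg.norm(query_vec))
        return np.argsort(sims)[::-1][:k]   # indices of the k best matches

    rng = np.random.default_rng(0)
    doc_vecs = rng.random((1000, 384))   # precomputed embeddings of the corpus
    query_vec = rng.random(384)          # embedded user query
    top = retrieve(query_vec, doc_vecs)
    print(top)  # these documents go to the local LLM for source-linked summarization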

Evaluation of the initial prototype demonstrates that domain-specific retrieval-augmented generation systems can provide reliable, traceable synthesis of Earth system knowledge and help address the growing gap between accelerating publication rates and the need for timely, verifiable scientific assessment.

How to cite: Kart Tokmak, Ö., Caesar, L., and Sakschewski, B.: A Living AI Platform for the Earth System Science, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-3316, https://doi.org/10.5194/egusphere-egu26-3316, 2026.

This study investigates the domain adaptation of the Vision-Language Model (VLM) for road damage assessment, focusing on a fine-tuning strategy optimized for resource-constrained engineering environments. Unlike conventional object detection models that operate within fixed label spaces, VLMs provide superior semantic understanding and generalization in complex scenarios. To facilitate practical deployment, this research systematically analyzes key variables of Parameter-Efficient Fine-Tuning (PEFT) to mitigate the high computational demands inherent in large-scale VLMs.

In the experimental phase, hyperparameter tuning was conducted using the Low-Rank Adaptation (LoRA) technique. The primary variables included LoRA ranks (16, 32, 64, and 96), training data scale, and image resolutions (1,024×28×28 vs. 1,536×28×28). A comprehensive dataset of 26,796 images comprising six damage categories and negative samples was established, utilizing a 7n sampling strategy (n=500, 750, 1,000) to address class imbalance. The impact of data volume was evaluated by augmenting the 7,000-sample set (corresponding to n=1,000) to match the full dataset size of 26,796, with zero-shot inference serving as the performance baseline.
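
For illustration, a minimal sketch of a LoRA configuration with the peft library, applied to a toy attention block standing in for a real VLM layer (the module names and dimensions are placeholders, not the actual model fine-tuned here):

    import torch.nn as nn
    from peft import LoraConfig, get_peft_model

    class Block(nn.Module):
        # Toy stand-in for one attention block of a VLM
        def __init__(self, d=512):
            super().__init__()
            self.q_proj, self.v_proj = nn.Linear(d, d), nn.Linear(d, d)
        def forward(self, x):
            return self.q_proj(x) + self.v_proj(x)

    cfg = LoraConfig(r=16, lora_alpha=32, lora_dropout=0.05,
                     target_modules=["q_proj", "v_proj"])   # low ranks worked best here
    model = get_peft_model(Block(), cfg)
    model.print_trainable_parameters()   # only the rank-16 adapters are trained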

Experimental results demonstrated substantial improvements over zero-shot inference, indicating that performance correlates positively with augmented data scale and higher image resolution, while lower LoRA ranks (16 and 32) proved most effective for this domain. Furthermore, the introduction of specialized ad-hoc metrics, MmAP and MF1, verified a stable trade-off between false positives and false negatives. Notably, to minimize safety-critical false negatives, a prompt-engineering-based 'Double Check' mechanism and multi-turn interactions were employed. This approach successfully leveraged the model’s inherent reasoning capabilities to refine damage identification through iterative feedback.

Acknowledgements: This research was supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Education (RS-2025-25437298).

How to cite: Kim, D. and Youn, H.: Optimizing Vision-Language Model for Robust Road Damage Assessment via Parameter-Efficient Fine-Tuning, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-3674, https://doi.org/10.5194/egusphere-egu26-3674, 2026.

EGU26-4407 | ECS | Posters on site | ITS1.6/ESSI1.6

Policy-oriented Land-Use and Agricultural Management Scenarios for Groundwater Nitrate Hotspot Mitigation 

Amir Naghibi, Kourosh Ahmadi, and Ronny Berndtsson

Nitrate contamination of groundwater is often addressed as a diffuse agricultural management problem, yet monitoring in Denmark shows that exceedance risk at abstraction and observation wells is spatially structured and closely linked to surrounding land-use composition and configuration. This suggests a land-use policy opportunity: if landscape fractions and fragmentation patterns help drive nitrate vulnerability, interventions could be spatially targeted and tailored rather than uniformly applied.

In this study, we present a scenario-based planning framework for policy appraisal, enabling regulators, municipalities, and water utilities to test alternative policy packages and targeting rules and to quantify their expected effects on groundwater nitrate hotspot risk. The system operates a predictive pipeline that relates nitrate outcomes to land-use fractions and landscape configuration metrics computed within configurable protection zones. Model outputs are formulated as a binary hotspot classification (hotspot vs. non-hotspot) based on exceedance of a drinking-water nitrate threshold, producing vulnerability maps to prioritize locations for intervention and prevention.

The core functionality is a “what-if” engine built on an AI-based ensemble that generates a baseline nitrate-risk probability map and re-predicts risk under user-defined scenarios. Scenario levers are organized into two policy bundles: (i) land-use policy and management, implemented as controlled reallocations among land-cover fractions (e.g., reducing large contiguous cropland blocks, increasing wetland/riparian woodland cover, restricting impervious expansion) while enforcing feasibility constraints; and (ii) agricultural management, implemented as proportional reductions or caps on nitrogen surplus and fertilizer inputs. For each scenario, the system outputs an updated probability map and a difference map relative to the baseline, supporting spatial prioritization, instrument design, and transparent justification of differential targeting. By combining ex-ante scenario testing with ex-post monitoring of hotspot transitions after implementation, the framework supports adaptive groundwater governance and moves from risk mapping toward operational, spatially explicit nitrate-reduction policy design.
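
As an illustration of the what-if mechanism, a minimal sketch that retrains nothing and simply re-predicts hotspot probability after reallocating land-cover fractions (the model choice, features, and labels are synthetic placeholders, not the ensemble used in this work):

    import numpy as np
    from sklearn.ensemble import RandomForestClassifier

    rng = np.random.default_rng(0)
    # land-use fractions per protection zone: cropland, wetland, forest, urban
    X = rng.dirichlet(np.ones(4), 500)
    y = (X[:, 0] > 0.5).astype(int)   # synthetic hotspot labels
    clf = RandomForestClassifier(random_state=0).fit(X, y)

    def what_if(x, src, dst, share):
        # reallocate `share` of one land-cover fraction to another (sum stays 1)
        x = x.copy()
        moved = share * x[src]
        x[src] -= moved
        x[dst] += moved
        return x

    zone = X[0]
    scenario = what_if(zone, src=0, dst=1, share=0.2)   # 20% of cropland to wetland
    base = clf.predict_proba([zone])[0, 1]
    new = clf.predict_proba([scenario])[0, 1]
    print(f"hotspot probability: {base:.2f} -> {new:.2f}")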

How to cite: Naghibi, A., Ahmadi, K., and Berndtsson, R.: Policy-oriented Land-Use and Agricultural Management Scenarios for Groundwater Nitrate Hotspot Mitigation, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-4407, https://doi.org/10.5194/egusphere-egu26-4407, 2026.

AI-driven weather and climate prediction has become highly visible in recent years, and most Earth scientists are now familiar with learned forecast models. Far fewer are aware that the same advances in artificial intelligence are producing general-purpose systems that can autonomously review literature, write and debug code, design experiments, and carry out extended research tasks with minimal supervision. These capabilities may ultimately have a greater impact on everyday scientific practice than any single prediction model.

AI-based forecasting represents only a narrow entry point into a broader transformation driven by hybrid intelligence, in which domain-specific Earth system models are combined with general AI systems such as large language and multimodal models and autonomous agents. In practice, this hybrid intelligence already spans simulation, data assimilation, downscaling, and analysis, while general AI systems increasingly handle coding, synthesis, and workflow orchestration. Together, these systems function less as isolated tools and more as adaptive research partners. Drawing on examples from NVIDIA’s Earth-2 research program and related international efforts, this talk examines how this shift reconfigures the human role toward problem formulation, validation, interpretation, and ethical governance, and highlights practical AI-assisted workflows already reshaping research productivity. Framing AI for environmental prediction within this wider context invites a broader discussion of how hybrid intelligence should be integrated thoughtfully into future Earth system science.

How to cite: Hall, D.: Beyond the Forecast: Hybrid Intelligence as a Force Multiplier for Earth Science, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-5926, https://doi.org/10.5194/egusphere-egu26-5926, 2026.

The literature on the Yellow River Basin (YRB) constitutes a critical knowledge repository for understanding human–earth systems, yet existing automated metadata-level review methods suffer from deep semantic loss and deficiencies in spatial representation: they neither capture fine-grained logic chains from full texts nor extract the spatial and hierarchical attributes of geographic entities. However, rapid developments in Large Language Models (LLMs) provide a technological opportunity for the automated extraction of full-text knowledge. To this end, this study proposes the Geo-Knowledge Infused Reasoning Framework (GK-IRF), coupling full-text semantics with multi-level spatial indexing. Methodologically, we first construct an ontology-based full-text parsing mechanism based on 8,493 YRB-related papers (2015-2024), utilizing LLMs to accurately extract structured semantic triplets. Simultaneously, we introduce an adaptive multi-level GeoHash indexing model to map textual toponyms into hierarchically nested grid sets, reconstructing the spatial coverage and multi-scale associations of geographic entities. Validation against a manually annotated dataset indicates that GK-IRF achieves an F1-score comparable to human performance in full-granularity semantic extraction; furthermore, the spatial coverage accuracy of the multi-level grids for the YRB is substantially higher than that of traditional geocoding methods, effectively resolving the challenge of multi-scale coverage representation.
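
For illustration, a minimal sketch of multi-level GeoHash cells for one geocoded toponym, using the pygeohash package (the precision levels and coordinates are illustrative, not the framework's actual configuration):

    import pygeohash as pgh

    def multilevel_cells(lat, lon, levels=(3, 5, 7)):
        # nested GeoHash cells: coarser cells are prefixes of the finest one
        full = pgh.encode(lat, lon, precision=max(levels))
        return {p: full[:p] for p in levels}

    # a toponym on the Yellow River near Lanzhou (coordinates illustrative)
    print(multilevel_cells(36.06, 103.83))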

How to cite: Wu, S. and Wang, H.: Coupling Full-Text Semantics with Multi-Level Spatial Indexing: A Knowledge Representation Framework for Yellow River Basin Literature, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-6201, https://doi.org/10.5194/egusphere-egu26-6201, 2026.

EGU26-6459 | ECS | Posters on site | ITS1.6/ESSI1.6

Geoscience-Aware AI for Interpretable Seismic Interpretation of Mass Transport Deposits Using Knowledge Graphs and Large Language Models 

Feryal Batoul Talbi, John Armitage, Jean Charléty, Alain Rabaute, Antoine Bouziat, Jean-Noël Vittaut, and Sylvie Leroy

Seismic interpretation of mass transport deposits (MTDs) relies heavily on expert knowledge and conceptual reasoning yet remains difficult to formalize and scale. While recent artificial intelligence (AI) methods have shown strong capabilities in seismic pattern recognition, most approaches operate as black boxes and remain poorly aligned with the interpretative frameworks used by geoscientists, limiting transparency and trust.


This study proposes a geoscience-aware hybrid intelligence framework that integrates expert knowledge graphs (KGs) with large language models (LLMs) to support interpretable seismic interpretation of MTDs. The approach builds upon the conceptual methodology of Le Bouteiller et al. (2019), which organizes MTD interpretation through causal relationships linking environmental controls, mass transport properties, and observable seismic descriptors across trigger, transport, and post-deposition phases.


The KG provides a structured reference for interpretation that constrains vocabulary, causal direction, and temporal logic. Our workflow reads scientific papers, identifies relevant descriptors and processes, verifies them with LLMs, and evaluates how well they support interpretation. In this setup, seismic descriptors provide graded levels of support (weak to strong) for geological processes, mirroring how experts reason under uncertainty.

Preliminary results show that ~68% of expert-defined concepts are recovered in the inferred graph, with a semantic validation score of 0.73, indicating good conceptual alignment. However, descriptor matching based on textual similarity remains difficult, with average scores around 0.41. This gap highlights the difference between semantic agreement (conceptually correct) and textual agreement (exact wording), mainly due to synonymy and variable phrasing in the literature. We plan to address this with domain-specific LLMs and ontology-based synonym expansion to improve semantic matching in future iterations.

How to cite: Talbi, F. B., Armitage, J., Charléty, J., Rabaute, A., Bouziat, A., Vittaut, J.-N., and Leroy, S.: Geoscience-Aware AI for Interpretable Seismic Interpretation of Mass Transport Deposits Using Knowledge Graphs and Large Language Models, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-6459, https://doi.org/10.5194/egusphere-egu26-6459, 2026.

The assessment of environmental and resource performance of energy transition technologies relies on quantitative information scattered across heterogeneous sources, including scientific articles, patents, and industrial reports, such as ESG (Environmental, Social, and Governance) disclosures. These documents contain key data for Life Cycle Inventory (LCI) and Material Flow Analysis (MFA), such as material and energy intensities, water consumption, mining production volumes, emissions, and technological descriptors. However, this information is predominantly embedded in unstructured PDF documents optimized for human reading, making large-scale, traceable data aggregation difficult and costly when performed manually.

This work presents an automated and modular methodology designed to extract and contextualize quantitative LCI and MFA data from three major categories of technical documentation. The approach combines large-scale document collection, relevance screening, and multimodal artificial intelligence within a reproducible and auditable workflow.

  • Scientific Articles

Peer-reviewed articles are collected through automated scraping workflows based on structured search outputs. Documents are screened for LCI/MFA relevance using domain-specific keywords, methodological markers, and quantitative signal density (a minimal screening sketch follows after this list). Relevant articles are then processed using a multimodal AI-based extraction core in which each page is analyzed through a combined text and image input. This enables robust extraction of numerical values from tables, text, and figures while preserving contextual information such as units, methodological assumptions, and source location.

  • Patents

Patent documents contain information about future trends in technologies and metal uses. Patents are collected via dedicated scraping pipelines and processed separately from scientific articles. The workflow focuses on extracting and structuring patent metadata, including publication year, country, and technology class, in order to characterize technological activity related to energy transition technologies. While quantitative LCI/MFA extraction from patents is not yet systematically performed, the pipeline enables descriptive statistical analyses of patent dynamics, including temporal trends and geographical patterns of technological development.

  • Mining technical and ESG Reports

Official mining company reports, with a specific focus on ESG disclosures, are processed through a screening module acting as a gatekeeper. The screening relies on sequential text parsing and, when necessary, geometric reconstruction of tables to identify reports containing sufficiently granular and structured quantitative information. Following human validation of the screening results, selected reports are analyzed using a multimodal vision–language AI model combining page images and extracted text, enabling structured extraction of industrial metrics with associated context and traceability.

This automated methodology addresses one of the core challenges of data collection and significantly improves the granularity, consistency, and verifiability of LCI datasets and MFA inputs. The application of the methodology is illustrated through examples related to battery and hydrogen technologies based on scientific articles and patents, and through case studies on copper and nickel production based on industrial reports, with a focus on mining. Although applied here to LCA and MFA, the approach can also support the extraction of other types of data and indicators relevant to environmental and resource analyses. The tool provides automated and reliable support for researchers aiming to extract comprehensive foundational data from heterogeneous sources.
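
As announced above, a purely illustrative sketch of keyword- and density-based relevance screening. This is not the authors' pipeline: the keywords, weights, threshold, and the document-text inputs are all invented for the example.

    # Illustrative relevance screening for LCI/MFA documents. Keywords,
    # weights, and the threshold are invented; the input dict stands in
    # for any PDF-to-text extraction step.
    import re

    KEYWORDS = {"life cycle inventory": 3, "material flow": 3, "kg co2": 2,
                "energy intensity": 2, "water consumption": 2}

    def quantitative_density(text: str) -> float:
        """Fraction of tokens that look numeric (crude signal-density proxy)."""
        tokens = text.split()
        numeric = sum(bool(re.match(r"\d+([.,]\d+)?$", t)) for t in tokens)
        return numeric / max(len(tokens), 1)

    def relevance_score(text: str) -> float:
        lower = text.lower()
        keyword_hits = sum(w * lower.count(k) for k, w in KEYWORDS.items())
        return keyword_hits + 100 * quantitative_density(lower)

    def screen(texts: dict, threshold: float = 5.0) -> list:
        """Return ids of documents whose score passes the (arbitrary) threshold."""
        return [doc_id for doc_id, text in texts.items()
                if relevance_score(text) >= threshold]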

How to cite: Bejjit, C. E., Monfort, D., Muller, S., Lai, F., Beylot, A., and Hennioui, D.: Mining and raw materials sector: Automated Data Extraction and Contextualization for Life Cycle Inventory (LCI) and Material Flow Analysis (MFA) Across Scientific Articles, Patents and Mining companies Reports, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-6501, https://doi.org/10.5194/egusphere-egu26-6501, 2026.

EGU26-8579 | ECS | Posters on site | ITS1.6/ESSI1.6

Towards a Mechanism-Informed Intelligent Framework for Identification of Compound Drought-Heat Extremes in Croplands 

Haobin Xia, Jianjun Wu, Litao Zhou, and Ruohua Du

Compound drought and heat extremes (CDHEs) exert impacts that exceed the sum of their individual components. With global warming amplifying the associated risks, CDHEs have become a critical threat to agricultural production; identifying and monitoring them in cropland systems is therefore key to food security. As CDHE formation and evolution are shaped by climatic factors, hydrological cycles, and ecosystem feedbacks, their identification at fine scales and over large agricultural areas poses substantial challenges.

Our study reviews existing methods for identifying CDHEs, including combined threshold approaches, comprehensive index methods, traditional machine learning techniques, and improved mechanistic modeling. We summarize the current limitations of these methods as follows: (1) Combined threshold and comprehensive index methods often focus on a single aspect of CDHEs, failing to systematically describe the complex processes of compound events. (2) While traditional machine learning methods attempt to integrate characteristics of the hazard-bearing body, disaster-causing factors, and hazard-inducing environment to establish complex nonlinear relationships between multiple elements and compound event indices, their "black-box" nature lacks mechanistic interpretability. Furthermore, these methods rely heavily on large volumes of high-quality samples to achieve satisfactory accuracy. (3) Improved mechanistic models, typically based on classical agricultural process models such as APSIM and AquaCrop, introduce CDHE impact modules to address the oversimplification of these effects in original models. Nevertheless, these mechanistic models require extensive input parameters, and their calibration processes depend on substantial amounts of measured data. Additionally, the computational resources needed for simulations are considerable, making the cost of analyzing CDHEs over large farmland areas under various future climate scenarios prohibitive for individual researchers.

To address these challenges, this study highlights the potential of physics-informed neural network models for identifying compound events and proposes future research directions regarding mechanistic constraints, neural network architecture design, and experimental plans: (1) Farmland CDHEs are essentially phenomena of water and heat imbalance within the soil-crop-atmosphere (SCA) system. Utilizing the Richards equation and the Penman-Monteith formula can characterize this process by constraining the water and heat environmental factors at the two key interfaces: root-soil and leaf-atmosphere. (2) Solar-induced chlorophyll fluorescence (SIF), a byproduct of vegetation photosynthesis closely related to GPP, responds rapidly to physiological damage caused by stress. Utilizing multi-band SIF data can provide a detailed depiction of crop physiological responses to stress from the perspective of the hazard-affected body. (3) Automated design of model architectures incorporating mechanistic information for farmland compound events can be achieved through distillation learning. (4) Future work should integrate ground-based water and heat control experiments with site-specific hyperspectral SIF observation data. Through continuous combinatorial experimental design, this approach can lead to the development of accurate and efficient physics-informed neural networks. Coupled with large-scale satellite and reanalysis data products, this framework aims to enable the large-area identification of farmland CDHEs under future climate scenarios.
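
For reference, the standard textbook forms of the two mechanistic constraints named in (1) are given below; these are the general equations, not the authors' specific parameterization.

    % 1D Richards equation for vertical unsaturated flow:
    % theta volumetric water content, h pressure head,
    % K(theta) unsaturated hydraulic conductivity, z depth
    \frac{\partial \theta}{\partial t} = \frac{\partial}{\partial z}\left[ K(\theta)\left( \frac{\partial h}{\partial z} + 1 \right) \right]

    % Penman-Monteith equation for latent heat flux:
    % Delta slope of the saturation vapour-pressure curve, R_n net radiation,
    % G soil heat flux, rho_a air density, c_p specific heat of air,
    % (e_s - e_a) vapour-pressure deficit, r_a and r_s aerodynamic and
    % surface resistances, gamma psychrometric constant
    \lambda ET = \frac{\Delta (R_n - G) + \rho_a c_p \,(e_s - e_a)/r_a}{\Delta + \gamma \left( 1 + r_s/r_a \right)}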

How to cite: Xia, H., Wu, J., Zhou, L., and Du, R.: Towards a Mechanism-Informed Intelligent Framework for Identification of Compound Drought-Heat Extremes in Croplands, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-8579, https://doi.org/10.5194/egusphere-egu26-8579, 2026.

EGU26-8976 | ECS | Posters on site | ITS1.6/ESSI1.6

OpenEM: Large-scale multi-structure 3D dataset for electromagnetic methods 

Shuang Wang, Xuben Wang, Fei Deng, and Peifan Jiang

Electromagnetic methods are among the most widely used techniques in the geophysical exploration industry due to their efficiency and non-invasive nature. However, their data processing workflows are highly time-consuming and strongly dependent on expert intervention. With the rapid and broad success of deep learning, applying deep learning techniques to electromagnetic methods to overcome the limitations of traditional approaches has become an active area of research. The effectiveness of deep learning methods, however, largely depends on the quality of the dataset, which directly influences model performance and generalization capability. Existing applications typically rely on self-constructed datasets composed of randomly generated one-dimensional models or structurally simple three-dimensional models, which fail to capture the complexity of realistic geological environments. Moreover, the absence of a unified and publicly available three-dimensional geoelectrical model repository has further constrained the development of deep learning for three-dimensional electromagnetic exploration. To address these challenges, we introduce OpenEM, a large-scale, multi-structural three-dimensional geoelectrical model repository that incorporates a wide range of geologically plausible subsurface structures.

OpenEM comprises nine categories of geoelectrical models, encompassing a wide spectrum of subsurface structures ranging from simple to complex. These include models of homogeneous half-spaces with embedded anomalous bodies, as well as configurations featuring flat stratigraphy, curved stratigraphy, planar faults, curved faults, and their variants containing anomalous bodies. The resistivity values span from 1 to 2000 Ω·m, with the number of layers ranging from three to seven. In models containing anomalous bodies, the number of anomalies varies from one to five, and both regular and irregular geometries are considered to enhance dataset diversity and realistic representativeness. In addition, OpenEM is accompanied by a three-dimensional model generator that enables fully controllable model construction, allowing users to customize structural configurations, including resistivity magnitudes, fault geometries and locations, as well as the size, shape, and placement of anomalous bodies.
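
A schematic of such controllable model construction is sketched below with numpy. The resistivity range (1-2000 Ω·m) and layer counts (3-7) follow the abstract; the grid size, interface logic, and the single box-shaped anomaly are assumptions, and this is not the OpenEM generator itself.

    # Toy sketch of a controllable 3D layered-model generator in the spirit
    # of OpenEM. Grid size and anomaly logic are assumptions.
    import numpy as np

    def random_layered_model(nx=64, ny=64, nz=64, rng=None):
        rng = rng or np.random.default_rng()
        n_layers = int(rng.integers(3, 8))                  # 3 to 7 layers, as in OpenEM
        interfaces = np.sort(rng.choice(np.arange(5, nz - 5),
                                        size=n_layers - 1, replace=False))
        resistivities = rng.uniform(1.0, 2000.0, n_layers)  # 1-2000 ohm-m, as in OpenEM
        model = np.empty((nx, ny, nz))
        bounds = np.concatenate(([0], interfaces, [nz]))
        for k in range(n_layers):
            model[:, :, bounds[k]:bounds[k + 1]] = resistivities[k]
        # Embed one box-shaped anomalous body at a random position (toy choice).
        bx, by, bz = rng.integers(4, 16, size=3)
        x0 = rng.integers(0, nx - bx)
        y0 = rng.integers(0, ny - by)
        z0 = rng.integers(0, nz - bz)
        model[x0:x0 + bx, y0:y0 + by, z0:z0 + bz] = rng.uniform(1.0, 2000.0)
        return model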

OpenEM provides a unified, comprehensive, and large-scale dataset for common electromagnetic exploration systems, thereby promoting the application of deep learning methods in electromagnetic prospecting.

How to cite: Wang, S., Wang, X., Deng, F., and Jiang, P.: OpenEM: Large-scale multi-structure 3D dataset for electromagnetic methods, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-8976, https://doi.org/10.5194/egusphere-egu26-8976, 2026.

Typically, to start working on a remote sensing–based application, various analyses and insights are needed from domain experts. A significant amount of time and effort goes into preprocessing, structuring, and analyzing the data, which can be a repetitive task, especially when a multi-sensor approach is involved. This often takes away time that could otherwise be invested in innovation or research. To address this, training an LLM to understand and process the context of remote sensing tasks can improve efficiency and reduce human-induced errors.

In this work, we develop an AI agent that can reason and think like a remote sensing expert. This agent uses a RAG-based foundational model (FM) and is equipped with various image processing tools to complete a task. We use gpt-4.1-mini as the FM and the Agno framework to deploy the agent. The knowledge base provided to this agent is specially curated with relevant research articles, books, and remote sensing methodologies. This knowledge base helps the model break down a problem into logical steps that can be performed using the tools available within the agent.

These tools can download data, process it, and provide relevant statistics and visualizations. The user can prompt the agent to download multi-sensor (optical and SAR) data, perform time-series analysis for forest monitoring, and identify deforestation hotspots. The agent can fetch data from Google Earth Engine (GEE), plan processing workflows, dynamically generate Python code, and complete the prompted tasks. This approach highlights the feasibility of integrating LLMs with domain-specific knowledge bases and geospatial processing tools to create autonomous, context-aware systems. Figure 1 depicts the overall workflow of the proposed agentic system, illustrating the interaction between the user, the knowledge base, the foundational model, and the integrated processing tools. The framework is directly usable for operational forest monitoring applications and can be further fine-tuned and extended to support a broader range of environmental monitoring and geospatial analytics use cases.
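
As an illustration of the kind of tool step such an agent can generate, the sketch below pulls an NDVI time series over a region via the Google Earth Engine Python API. The collection, region, dates, and cloud threshold are assumptions, and prior ee.Authenticate() setup is assumed; this is not the authors' agent code.

    # Minimal sketch: NDVI time series over an AOI with the GEE Python API.
    import ee

    ee.Initialize()  # assumes ee.Authenticate() has been run beforehand

    region = ee.Geometry.Rectangle([75.0, 11.0, 75.5, 11.5])  # arbitrary AOI

    def add_ndvi(img):
        return img.addBands(img.normalizedDifference(["B8", "B4"]).rename("NDVI"))

    def with_mean(img):
        stat = img.select("NDVI").reduceRegion(ee.Reducer.mean(), region, 100)
        return img.set("mean_ndvi", stat.get("NDVI"))

    s2 = (ee.ImageCollection("COPERNICUS/S2_SR_HARMONIZED")
          .filterBounds(region)
          .filterDate("2023-01-01", "2023-12-31")
          .filter(ee.Filter.lt("CLOUDY_PIXEL_PERCENTAGE", 20))
          .map(add_ndvi)
          .map(with_mean))

    dates = s2.aggregate_array("system:time_start").getInfo()
    ndvi = s2.aggregate_array("mean_ndvi").getInfo()  # per-image AOI means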

Figure 1: Workflow of the Agentic AI system

How to cite: Jain, A. and Sabir, A.: Development of a Context-Aware AI Agent for Forest Applications Using Multi-Sensor Data, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-10390, https://doi.org/10.5194/egusphere-egu26-10390, 2026.

Multi-agent GIS systems are increasingly emerging as a general paradigm for complex geospatial tasks. However, many existing approaches rely on text-only large language models (LLMs) as the primary reasoning substrate. In the absence of explicit geometric constraints and verifiable evidence, spatial relations are often represented indirectly through linguistic statistical correlations. This makes LLMs prone to inconsistency when interpreting and inferring topological, directional, and distance relations in geospatial data, and leads to error accumulation across multi-step tool invocations and long-horizon decision-making, ultimately degrading the accuracy and efficiency of task reasoning and execution. In this work, we propose VisCritic-GIS, a multi-agent framework for geospatial task reasoning and execution driven by visualized evidence review. VisCritic-GIS introduces a Visualization Generation Agent and a Visualization Critic Agent into conventional multi-agent GIS pipelines. The generation agent renders key spatial data and intermediate results into 2D maps, explicitly externalizing spatial relations in visual form. The critic agent leverages multimodal LLMs to read and critically review this map-based evidence, producing textual feedback on spatial relations, anomalous results, and reasoning deviations, which constrains and drives iterative refinement of other agents’ reasoning trajectories and toolchain configurations. We build evaluation protocols over representative remote sensing and geospatial tasks, and systematically demonstrate that VisCritic-GIS improves task accuracy, execution efficiency, and interpretability. Overall, our framework provides a mechanism for shifting geospatial reasoning from “text-only probabilistic completion” toward “visually grounded, verifiable inference,” thereby strengthening the robustness of spatial relation understanding in multi-agent GIS systems.
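
To make the evidence-review step concrete, a generic sketch follows: intermediate geospatial results are rendered to a map image, which a multimodal model then reviews. geopandas and matplotlib are one possible tooling choice; critique_image() is a hypothetical placeholder for any vision-language model call, and none of this is the VisCritic-GIS implementation.

    # Illustrative visualization-critic step, not VisCritic-GIS code.
    import geopandas as gpd
    import matplotlib.pyplot as plt

    def render_map(gdf: gpd.GeoDataFrame, column: str, path: str = "evidence.png") -> str:
        """Render an intermediate result to a 2D map image (the 'evidence')."""
        ax = gdf.plot(column=column, legend=True, figsize=(6, 6))
        ax.set_title(f"Intermediate result: {column}")
        plt.savefig(path, dpi=150, bbox_inches="tight")
        plt.close()
        return path

    def review_step(gdf, column, critique_image):
        """Render evidence and return the critic's textual feedback."""
        image_path = render_map(gdf, column)
        prompt = ("Check topological, directional, and distance relations in this "
                  "map; flag anomalies or deviations from the task specification.")
        return critique_image(image_path, prompt)  # hypothetical multimodal LLM call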

How to cite: Lan, Q., Hu, L., Wu, S., and Du, Z.: VisCritic-GIS: A Visualization-Critic–Empowered Framework for Multi-Agent Geospatial Task Reasoning and Execution, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-10473, https://doi.org/10.5194/egusphere-egu26-10473, 2026.

EGU26-13454 | Orals | ITS1.6/ESSI1.6

Agentic AI for Earth-Observation-Driven Maritime Monitoring - the SeaScope Project 

Christos Sekas, Kostas Philippopoulos, Ilias Agathangelidis, Constantinos Cartalis, Stelios Neophytides, and Michalis Mavrovouniotis

We present SeaScope, an explainable AI agent that accelerates interaction with complex Earth Observation (EO) workflows. Users express analytical questions in natural language, which are transformed into transparent, executable EO analyses. By combining generative AI, vision–language models, and Retrieval-Augmented Generation (RAG), SeaScope links scientific literature, satellite data descriptions, and validated analysis methods to automatically generate, execute, and explain EO workflows. For example, a query such as “Detect vessel activity and possible oil spills in May 2025” triggers dataset selection, code generation, cloud execution, and map outputs with traceable reasoning.


SeaScope is designed as a geoscience-specific AI agent that supports both rapid decision-making and accelerated research. Non-technical users can obtain EO-based insights in time-critical situations without continuous involvement of expert programmers, while researchers benefit from faster hypothesis testing, automated pipeline generation, and reproducible workflows. Human expertise remains central: users inspect retrieved sources, review generated code, and validate analytical steps, ensuring scientific control and accountability. This setup combines domain knowledge with AI-driven scalability, addressing challenges such as sensor-specific scripts and fragmented tools.


As a pilot use case, SeaScope is applied to maritime EO in the Mediterranean region, supporting environmental monitoring and marine activity analysis using satellite data. Beyond the application, the project delivers research insights on generative and vision-based AI for EO, including lessons learned from benchmarking LLMs for code generation, evaluating vision-language models for image understanding, and comparing different RAG and knowledge ingestion strategies. The findings highlight practical trade-offs in accuracy, robustness, explainability, and user validation in real-world workflows.

How to cite: Sekas, C., Philippopoulos, K., Agathangelidis, I., Cartalis, C., Neophytides, S., and Mavrovouniotis, M.: Agentic AI for Earth-Observation-Driven Maritime Monitoring - the SeaScope Project, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-13454, https://doi.org/10.5194/egusphere-egu26-13454, 2026.

EGU26-13612 | Posters on site | ITS1.6/ESSI1.6

Integrating Large Language Models into Climate and Geoscientific Data Workflows 

Ivan Kuznetsov, Dmitrii Pantiukhin, Jacopo Grassi, Boris Shapkin, Thomas Jung, and Nikolay Koldunov

Large Language Models (LLMs) have emerged as powerful tools for text and data processing, with potential extending far beyond conversational interfaces. We demonstrate that integrating LLMs into agentic workflows enables automated climate and oceanographic data analysis while minimizing hallucinations through strict reliance on real data sources.

ClimSight combines LLMs with climate model data to deliver localized climate insights for decision-making. Specialized agents consult external databases, extract variables from climate models, generate Python scripts for post-processing, and validate outputs through visual analysis. The workflow iteratively corrects errors until reliable results are achieved.
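
The iterative correct-until-valid loop described above can be pictured generically as follows. This is not ClimSight's actual code: generate_script() and diagnose() are hypothetical placeholders for LLM calls, and only the control flow is the point.

    # Generic sketch of an LLM-driven generate-execute-repair loop.
    import subprocess
    import tempfile

    def run_until_valid(task, generate_script, diagnose, max_iters=5):
        feedback = ""
        for _ in range(max_iters):
            code = generate_script(task, feedback)        # hypothetical: LLM writes Python
            with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
                f.write(code)
                script = f.name
            result = subprocess.run(["python", script], capture_output=True,
                                    text=True, timeout=300)
            if result.returncode == 0:
                return result.stdout                      # reliable result reached
            feedback = diagnose(code, result.stderr)      # hypothetical: LLM explains error
        raise RuntimeError("no valid result within the iteration budget")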

PANGAEA GPT enhances accessibility to the PANGAEA data repository through a supervisor agent that interprets queries, delegates tasks to domain-specific subagents, and coordinates data extraction, statistical analysis, and visualization of oceanographic and atmospheric datasets.

Both systems leverage automatic Python execution and image analysis for quality control. By constraining outputs to verifiable data sources and implementing multi-agent verification, we demonstrate that LLMs can play a significant role in geoscientific data pipelines and automated research workflows.


How to cite: Kuznetsov, I., Pantiukhin, D., Grassi, J., Shapkin, B., Jung, T., and Koldunov, N.: Integrating Large Language Models into Climate and Geoscientific Data Workflows, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-13612, https://doi.org/10.5194/egusphere-egu26-13612, 2026.

Generative AI is now woven into the daily study practices of geoscience students, often more deeply than educators acknowledge. This study examines how bachelor students in Earth Sciences (GEOL1008, NTNU) and master students in Engineering Geology (TGB4200, NTNU) use AI tools to understand literature, analyse data, synthesise research findings, and prepare written and oral assignments. The analysis draws on two structured surveys designed to map the extent and character of AI use in both cohorts.

Preliminary results indicate that AI has become the default support tool. Students turn to it to decode complex concepts, troubleshoot coding tasks, analyse data, structure reports, and polish presentations. Many see little distinction between traditional digital tools and generative AI, and the boundary between personal work and AI-augmented work is increasingly blurred. At the same time, students express uncertainty and worry about ethical expectations, disclosure practices, and the legitimacy of relying heavily on AI in academic work.

These trends have immediate consequences for assessment. Home exams, reports, and pre-prepared presentations no longer reliably reveal individual understanding, since nearly all students now use AI during preparation. Emerging evidence from portfolio-based courses suggests grade inflation and reduced differentiation between students, not because learning outcomes have improved, but because AI elevates the baseline quality of submitted work. In practice, written in-person exams and oral examinations remain among the few ways to assess unassisted reasoning.

The findings underscore a need to rethink teaching and assessment in geoscience education. AI is not a future challenge but a present reality, and universities must adapt if they aim to evaluate what students actually know rather than what their tools can produce.

How to cite: Fredin, O.: Geoscience Education in the Age of Generative AI: What Do Students Actually Learn?, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-16861, https://doi.org/10.5194/egusphere-egu26-16861, 2026.

EGU26-17176 | ECS | Posters on site | ITS1.6/ESSI1.6

Bridging the gap between scientists and large language models 

István Bozsó, András Horváth, and Lukács Kuslits

Large Language Models (LLMs), a class of contemporary artificial intelligence systems, are increasingly used in scientific practice to support research workflows, accelerate discovery, and automate routine administrative tasks. This contribution identifies and analyzes three underexplored aspects of LLM adoption in scientific research. The first aspect concerns the uneven adoption of LLMs among scientists and the inconsistent application of established best practices. The second examines how LLMs can be employed to improve the robustness and reproducibility of scientific practices. The third addresses institutional strategies by which large scientific organizations—such as universities and research networks—can reduce dependence on commercial technology providers while increasing trust in LLM-based systems.

The findings presented in this contribution partly summarize István Bozsó’s experiences serving as an “AI ambassador” at the Institute of Earth Physics and Space Science (EPSS) of the Hungarian Research Network (HUN-REN).

In our experience, many scientists remain skeptical of using LLMs in any capacity or lack the time to invest in learning these technologies. These barriers are primarily sociotechnical rather than purely technical, and addressing them requires, on the one hand, materials that teach best practices and show motivating examples of LLM use, and on the other, services provided by research organisations.

Recent advances in open-weight LLMs enable self-hosting within institutional computing infrastructures: research institutes can run these models on their own hardware, ensuring that sensitive data and research materials remain within the organization’s controlled digital environment. This also keeps LLM usage independent of large technology corporations and builds trust with colleagues.
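
As one concrete pattern, a minimal sketch follows, assuming a locally hosted open-weight model served through an OpenAI-compatible endpoint (e.g. via Ollama or vLLM); the URL, port, model name, and prompt are all assumptions that depend on the local deployment.

    # Minimal sketch: querying a self-hosted open-weight LLM through an
    # OpenAI-compatible API. Endpoint, key, and model name are placeholders.
    from openai import OpenAI

    client = OpenAI(base_url="http://localhost:11434/v1",  # local server, not a cloud API
                    api_key="not-needed-locally")

    response = client.chat.completions.create(
        model="llama3.1:8b",  # whichever open-weight model the institute hosts
        messages=[{"role": "user",
                   "content": "Suggest unit tests for this data-processing function..."}],
    )
    print(response.choices[0].message.content)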

Regarding motivating examples, we focus on two areas that can be addressed with the help of LLMs. The first is scientific communication. LLMs can easily generate materials (primarily text, audio, and video) that inform the wider public about new scientific discoveries and push back against misinformation and disinformation campaigns. The involvement of scientists in reviewing and finalizing such materials is paramount to ensure they convey accurate scientific information.

The other area is scientific programming. Many scientists are not trained as professional software engineers and often lack the time and background to apply software development best practices. In many cases, this results in software artifacts that are fragile, difficult to reproduce, challenging to maintain, and that often run only on the machine of the researcher who developed them. LLMs can help in these situations by suggesting, and even implementing, best practices and by giving programming advice during the development of scientific code.

The common theme in these examples is that the LLM is not meant to replace the scientist but enhance their capabilities with the goal of increasing the robustness, transparency, and sustainability of the scientific research process.

How to cite: Bozsó, I., Horváth, A., and Kuslits, L.: Bridging the gap between scientists and large language models, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-17176, https://doi.org/10.5194/egusphere-egu26-17176, 2026.

Urban public spaces are highly dynamic systems where traffic patterns, pedestrian flows, and human activities vary strongly across temporal scales. Capturing these dynamics at high temporal resolution remains challenging, particularly using low-cost and reproducible observation methods. In this study, we present an automated workflow for continuous urban activity monitoring based on publicly available webcam imagery and deep learning–based object detection.

A public webcam overlooking Augustusplatz, a central urban square in Leipzig (Germany), is continuously accessed, and still frames are extracted from the video stream at one-minute intervals. Each frame is processed using the YOLO11 object detection model to identify and count relevant object classes, including passenger vehicles and pedestrians. The detection results are converted into structured JSON records and enriched with metadata such as timestamp and geographic location. All data are stored in an InfluxDB time-series database and visualized and statistically analyzed using Grafana.
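
A condensed sketch of such a pipeline is given below. It is illustrative only: the stream URL, token, organisation, and bucket names are placeholders, and the ultralytics and influxdb-client packages are one tooling choice consistent with the components named above.

    # Condensed sketch: grab a frame, detect objects with YOLO11, write
    # class counts to InfluxDB. Credentials and names are placeholders.
    import cv2
    from collections import Counter
    from ultralytics import YOLO
    from influxdb_client import InfluxDBClient, Point
    from influxdb_client.client.write_api import SYNCHRONOUS

    model = YOLO("yolo11n.pt")
    cap = cv2.VideoCapture("https://example.org/augustusplatz/stream.m3u8")  # placeholder

    ok, frame = cap.read()
    if ok:
        result = model(frame)[0]
        names = [result.names[int(c)] for c in result.boxes.cls]
        counts = Counter(n for n in names if n in {"person", "car", "bus", "bicycle"})

        with InfluxDBClient(url="http://localhost:8086", token="TOKEN", org="org") as db:
            write = db.write_api(write_options=SYNCHRONOUS)
            point = Point("urban_activity").tag("location", "Augustusplatz")
            for cls, n in counts.items():
                point = point.field(cls, n)
            write.write(bucket="webcam", record=point)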

This setup enables near-real-time and long-term analysis of urban activity patterns across multiple temporal scales. Distinct signatures of recurring and episodic events can be identified, including daily commuting cycles, evening rush hours, road closures, public celebrations, and large seasonal events such as Christmas markets. The minute-scale resolution allows for detailed investigation of short-term dynamics, while continuous operation over longer periods enables comparative and trend analyses.

The presented approach demonstrates how publicly available visual data and open-source tools can be combined into a scalable and transferable framework for urban monitoring. Potential applications include event detection, urban mobility analysis, validation of traffic models, assessment of public space usage, and integration with other environmental or socio-economic datasets. The method provides a cost-efficient complement to traditional urban sensing infrastructures and offers new opportunities for data-driven urban and environmental research.

How to cite: Oesen, B., Wagner, R., and Goblirsch, T.: High-Temporal-Resolution Urban Activity Monitoring Using Public Webcams and Deep Learning: A Case Study from Leipzig, Germany, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-17514, https://doi.org/10.5194/egusphere-egu26-17514, 2026.

EGU26-18439 | Orals | ITS1.6/ESSI1.6

Integrating Machine Learning and Large Language Models for Next-Generation Water & Environmental Intelligence 

Gerald A Corzo P, Emmanouil Varouchakis, Anna Kamińska-Chuchmała, Rozalia Agioutanti, and Valentina Dominguez

Climate change, land-use change, and increasing socio-economic pressures are reshaping water and environmental systems, while the volume and heterogeneity of available data—from in situ observations to reanalysis products, remote sensing, and citizen-generated sources—continue to grow. Machine learning (ML) has become an important component of hydro-environmental modelling for forecasting, classification, and pattern discovery. However, in practice, many ML applications remain highly case-specific and dependent on implicit expert decisions related to problem formulation, predictor selection, validation design, and interpretation, which are rarely made explicit or transferable across regions and users.

This contribution presents a human-in-the-loop hybrid intelligence framework that integrates ML workflows with Large Language Models (LLMs) to support structured reasoning during environmental model development and evaluation. Rather than using LLMs for automated optimisation or model selection, the framework positions them as a guidance and scaffolding layer that helps make modelling assumptions, choices, and limitations explicit and traceable, while retaining expert control over all final decisions.

Methodologically, the framework combines (i) hands-on ML pipelines, ranging from baseline statistical models to more advanced learning algorithms for forecasting and classification, and (ii) an LLM-based guidance layer that structures expert reasoning through prompts, checklists, and decision logs. This guidance supports key stages of the modelling process, including the definition of modelling objectives, assessment of data quality, selection of environmentally meaningful predictors, and the design of validation strategies. Particular emphasis is placed on encouraging validation schemes that account for temporal dependence and spatial heterogeneity, such as blocked or spatial cross-validation, rather than default random data splits.
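
For instance, a spatially blocked validation scheme of the kind encouraged here can be set up with standard tooling. The sketch below uses k-means clustering of station coordinates as one common blocking choice; the data arrays, cluster and fold counts, and the model are illustrative placeholders, not the framework's prescribed setup.

    # Minimal sketch of spatial (blocked) cross-validation: stations are
    # grouped into spatial clusters and whole clusters are held out, so
    # test locations are not immediate neighbours of training locations.
    import numpy as np
    from sklearn.cluster import KMeans
    from sklearn.ensemble import RandomForestRegressor
    from sklearn.model_selection import GroupKFold, cross_val_score

    rng = np.random.default_rng(0)
    coords = rng.uniform(0, 100, size=(300, 2))   # station x, y (placeholder)
    X = rng.normal(size=(300, 5))                 # predictors (placeholder)
    y = rng.normal(size=300)                      # target, e.g. groundwater level

    blocks = KMeans(n_clusters=10, n_init=10, random_state=0).fit_predict(coords)

    scores = cross_val_score(RandomForestRegressor(random_state=0), X, y,
                             groups=blocks, cv=GroupKFold(n_splits=5),
                             scoring="neg_root_mean_squared_error")
    print(scores.mean())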

The framework is currently being developed and iteratively evaluated through expert-led case studies using real hydro-environmental datasets, rather than through formal classroom deployment. Initial applications focus on groundwater level analysis and hydro-environmental forecasting problems in Greece, including collaborative work in Crete, where the framework has been used to structure modelling choices and interpret model behaviour under non-stationary conditions. Additional exploratory applications using existing datasets have been used to stress-test the transferability of the workflow across contrasting environmental settings. Ongoing extensions include the application of the framework within coastal erosion modelling activities currently being developed in Colombia.

The LLM layer supports explicit reasoning about why a model performs well or poorly under specific conditions, how assumptions propagate into uncertainty, and where data-driven learning diverges from physical expectations. This reflective use of hybrid intelligence helps expose failure modes and modelling sensitivities that are often hidden in automated pipelines.

Results from the expert-led evaluations indicate that the proposed framework improves the transparency and reproducibility of modelling decisions, facilitates comparison across case studies, and supports more consistent interpretation of ML results across regions and scales. At the same time, the approach lowers the entry barrier for non-specialists without removing expert oversight or domain judgement.

The framework is being developed within the context of the Erasmus+ AI-LEARN project (Project reference: 2025-1-NL01-KA220-HED-000355215), where it serves as a methodological backbone for future training and capacity-building activities in water and environmental intelligence.

How to cite: Corzo P, G. A., Varouchakis, E., Kamińska-Chuchmała, A., Agioutanti, R., and Dominguez, V.: Integrating Machine Learning and Large Language Models for Next-Generation Water & Environmental Intelligence, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-18439, https://doi.org/10.5194/egusphere-egu26-18439, 2026.

EGU26-19954 | Posters on site | ITS1.6/ESSI1.6

Research on Key Technologies for High-Precision Land Cover Change Monitoring Using Satellite Data 

Shucheng You, Lei Du, Yun He, and Fanghong Ye

The increasing availability of satellite remote sensing data has made automatic land cover change detection a persistent research focus. However, real-world applications show that single AI models struggle to cope with the combined challenges of spatial-temporal complexity, feature diversity, and evolving engineering requirements. Consequently, the accuracy of automatically extracted land cover changes is often compromised, making the results insufficient for direct engineering application. Guided by practical application needs, this paper examines how satellite remote sensing data, diverse knowledge sources, and AI technologies can be combined to improve the accuracy and efficiency of automatic land cover extraction, covering the key technologies of the complete land cover monitoring process. Central to this study is a progressive intelligent change detection technology for satellite remote sensing, characterized by an “identify all, discriminate precisely, refine extraction” workflow. Specifically, the “identify all” step extracts all potential change patches using models such as generic binary change detection. Building on these results, the “discriminate precisely” step filters out patches that are not of current interest. Finally, the “refine extraction” step employs models like semantic segmentation to further screen the results and enhance overall accuracy. An application demonstration in Shanxi Province, China, covering new PV facilities, buildings, and roads, achieved a recall rate of 89.3% for automatic extraction. The high-quality outputs confirm the practical applicability of the results and affirm the technology as a valuable and transferable solution for land cover monitoring.

How to cite: You, S., Du, L., He, Y., and Ye, F.: Research on Key Technologies for High-Precision Land Cover Change Monitoring Using Satellite Data, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-19954, https://doi.org/10.5194/egusphere-egu26-19954, 2026.

EGU26-20206 | ECS | Posters on site | ITS1.6/ESSI1.6

Building Connected Earth Observation Ecosystems with Agentic AI using EVE 

Eva Gmelich Meijling, Riccardo D'Ercole, Anca Anghelea, Chiara Maria Cocchiara, and Nicolas Longepe

This study explores the integration of EVE (Earth Virtual Expert), a Large Language Model specialized in Earth Observation (EO) and Earth Sciences, developed under ESA’s Φ-lab #AI4EO initiative in collaboration with Pi School. The primary objective is to enable EVE to connect ESA’s EO platforms and data clusters, creating an integrated ecosystem for the community. This approach leverages agentic capabilities, allowing EVE to dynamically interact with EO tools, databases, and APIs to reason and act autonomously.
To demonstrate this concept, we present a use case where EVE operates within an agentic framework to interact with the EO Dashboard, a joint initiative by ESA, NASA, and JAXA that provides global indicators and narratives derived from multi-mission EO data. Using the MCP protocol, this work enables dynamic connectivity between EVE and the Dashboard, allowing the model to interpret and summarize narratives, extend insights with additional context, and facilitate advanced information retrieval across datasets and stories. In addition, the study considers potential directions for agentic behaviors, assessing early-stage possibilities and limitations for features such as autonomous task chaining. These capabilities enable EVE to perform multi-step reasoning, for example, by interpreting quantitative trends in dashboard indicators such as air quality changes, greenhouse gas concentrations, or land cover dynamics. This links EVE to underlying datasets and enables the generation of scientifically grounded responses. This proof-of-concept demonstrates EVE’s potential to foster interoperability and accelerate Earth system science by improving knowledge accessibility and enabling more effective use of EO data resources.

How to cite: Gmelich Meijling, E., D'Ercole, R., Anghelea, A., Cocchiara, C. M., and Longepe, N.: Building Connected Earth Observation Ecosystems with Agentic AI using EVE, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-20206, https://doi.org/10.5194/egusphere-egu26-20206, 2026.

EGU26-20992 | ECS | Posters on site | ITS1.6/ESSI1.6

Using copilot for the rapid generation of a visualisation platform to aid geospatial analyses 

Sebastian Lehner and Matthias Schlögl

The scale and heterogeneity of modern geospatial datasets, coupled with expanding suites of statistical and dynamical models, produce analysis outputs that are increasingly difficult to navigate and synthesise. We present a practical case study on using a large language model (LLM)-assisted coding tool (GitHub Copilot with GPT-5 mini within Visual Studio Code) to accelerate the development of a lightweight, HTML-based platform that visualises results from pre-calculated climate indicators.

Our starting point was a dataset comprising more than 130 climate indicators derived from gridded observations spanning over 60 years. These indicators originate from multiple meteorological variable groups (e.g., temperature, precipitation) and are aggregated at several temporal resolutions (e.g., annual, seasonal). Downstream analyses include spatiotemporal statistics, extreme value analyses, and statistical significance testing, yielding hundreds of figures that are difficult to navigate and analyse. To make these outputs tractable, we prompted Copilot to generate a simple web application for visualisation and analysis. The pre-generated plots from the climate indicator workflow are displayed there in an organised way, allowing quick filtering across all indicators and temporal resolutions, side-by-side comparison of plots, and a subpage that concisely displays aggregated group plots.
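
A stripped-down sketch of how such a static gallery can be generated is shown below. It is illustrative only, not the Copilot-produced application: the directory layout and the <indicator>_<resolution>.png filename convention are assumptions, and the real platform adds filtering, comparison views, and a group subpage.

    # Stripped-down sketch: build a static HTML index over pre-generated
    # indicator plots. Filename convention is an assumption.
    import pathlib
    import html

    def build_index(plot_dir="plots", out="index.html"):
        rows = []
        for png in sorted(pathlib.Path(plot_dir).glob("*.png")):
            indicator, _, resolution = png.stem.rpartition("_")
            rows.append(f'<figure data-indicator="{html.escape(indicator)}" '
                        f'data-resolution="{html.escape(resolution)}">'
                        f'<img src="{png.as_posix()}" width="400">'
                        f'<figcaption>{html.escape(png.stem)}</figcaption></figure>')
        page = ("<!doctype html><meta charset='utf-8'>"
                "<title>Climate indicators</title>" + "\n".join(rows))
        pathlib.Path(out).write_text(page, encoding="utf-8")

    build_index()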

The platform is embedded and deployed via a GitLab CI pipeline, ensuring reproducible updates and immediate web accessibility for collaborators and users, thereby enabling rapid and easy access to vast amounts of output results. Our process of prompting an LLM to generate a visualisation platform offers a convenient and transferable workflow to aid geospatial data analysis.

How to cite: Lehner, S. and Schlögl, M.: Using copilot for the rapid generation of a visualisation platform to aid geospatial analyses, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-20992, https://doi.org/10.5194/egusphere-egu26-20992, 2026.

EGU26-2950 | ECS | PICO | ESSI1.7

Predicting Cumulative Land Subsidence and Its Spatiotemporal Relationship Using Machine Learning 

Yu-Yun Hsu, WeiCheng Lo, Jhe-Wei Lee, and Chih-Tsung Huang

Land subsidence has long been a critical environmental hazard along the southwestern coast of Taiwan, with Yunlin County being one of the most severely affected areas. In this study, Long Short-Term Memory (LSTM) neural networks are employed to develop predictive models for land subsidence. Cumulative land subsidence, groundwater-level variations, and lithological layering are considered as input features to investigate the predictive performance of the models from both temporal and spatial perspectives.

As long-term groundwater monitoring data often suffer from missing values, this study further introduces a Cue Wasserstein GAN with Gradient Penalty (CWGAIN-GP) to impute missing groundwater-level data, thereby improving the stability and completeness of the subsequent prediction models. Artificial masking experiments were conducted, including continuous missing periods ranging from one month to one year and random removal of 10%–50% of the data. The results show that the imputation model achieves an average Nash–Sutcliffe efficiency (NSE) of 0.897.

For temporal prediction, the land subsidence model is trained using different training lengths (one year and seven years) and variable combinations to forecast cumulative land subsidence over the following one to two years. The most recent six months of observations are used as input to predict the monthly land subsidence increment. The results indicate that longer training periods and more comprehensive input variables lead to improved model performance. The coefficient of determination (R²) for the first prediction year reaches 0.945, while for the second year—under conditions of three consecutive months of missing data—the R² remains as high as 0.923.
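
A schematic of this temporal setup is sketched below in Keras, under stated assumptions: six monthly input steps as described above, but an invented feature count, layer size, and training configuration, with random placeholder data. It is not the authors' architecture.

    # Schematic LSTM: predict the next monthly subsidence increment from
    # the most recent six months of observations. Sizes are illustrative.
    import numpy as np
    from tensorflow import keras

    n_features = 4   # e.g. cumulative subsidence, groundwater level, lithology codes
    window = 6       # six most recent monthly observations (as in the abstract)

    model = keras.Sequential([
        keras.layers.Input(shape=(window, n_features)),
        keras.layers.LSTM(64),
        keras.layers.Dense(1),   # next month's subsidence increment
    ])
    model.compile(optimizer="adam", loss="mse")

    # Placeholder training data: (samples, 6 months, features) -> increment.
    X = np.random.rand(1000, window, n_features).astype("float32")
    y = np.random.rand(1000, 1).astype("float32")
    model.fit(X, y, epochs=5, batch_size=32, verbose=0)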

For spatial prediction, a multi-station training and single-station validation strategy is adopted. When predicting a target station, the three nearest neighboring stations are selected, and their observations from the most recent three months are used as inputs to predict the monthly land subsidence increment at the target station. This increment is then combined with the known cumulative subsidence from the previous month to estimate the current cumulative subsidence. The results show that the average R² for single-month predictions reaches 0.966. Even when cumulative subsidence is estimated iteratively by adding predicted monthly increments over six consecutive months, the average R² remains around 0.90, demonstrating strong spatial generalization capability of the proposed model.

Figure 1: Monthly vertical profiles of cumulative land subsidence at different depths for the Huwei (MW_HWES) station in 2021.

Overall, this study demonstrates that cumulative land subsidence can be effectively predicted by integrating temporally and spatially informed LSTM models with vertically stratified hydrogeological information. Although cumulative subsidence is used as the primary prediction target, the inclusion of groundwater-level variations and lithological layering enables the model to capture the vertical characteristics of aquifer systems and their influence on subsidence processes. The results highlight the importance of incorporating stratified subsurface information when modeling land subsidence and provide a robust framework for spatiotemporal subsidence prediction under realistic data availability constraints.

How to cite: Hsu, Y.-Y., Lo, W., Lee, J.-W., and Huang, C.-T.: Predicting Cumulative Land Subsidence and Its Spatiotemporal Relationship Using Machine Learning, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-2950, https://doi.org/10.5194/egusphere-egu26-2950, 2026.

Soil moisture downscaling is a challenging geospatial regression task that requires accurately capturing complex spatiotemporal relationships across scales. In this study, we conduct a preliminary applicability assessment of denoising diffusion probabilistic models (DDPMs) for continuous-value geospatial regression, exploring the potential of generative modeling frameworks for soil moisture downscaling. The model learns the relationships between coarse-resolution soil moisture observations and multi-source auxiliary features, enabling the generation of high-resolution soil moisture estimates.

During training, the model uses 36 km resolution satellite soil moisture data and conditions on auxiliary variables, including normalized difference vegetation index (NDVI), land surface temperature, surface albedo, precipitation, and digital elevation model (DEM). A conditional embedding strategy is introduced to incorporate temporal information, spatial location information, and in-situ statistics into the diffusion network via feature-wise linear modulation (FiLM), enhancing the model’s ability to capture complex spatiotemporal structures while maintaining stability. During inference, a two-stage “generation–correction” pipeline is employed: high-resolution (1 km) auxiliary features are first used to generate initial predictions through the diffusion model, which are subsequently bias-corrected using in-situ station data.
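
For readers unfamiliar with FiLM, the mechanism is compact: a conditioning vector is mapped to per-channel scale and shift parameters applied to feature maps. The PyTorch sketch below is generic, with illustrative dimensions; it is not the authors' network.

    # Generic FiLM (feature-wise linear modulation) block: a conditioning
    # vector (here standing in for time, location, and in-situ statistics
    # embeddings) yields per-channel scale (gamma) and shift (beta).
    import torch
    import torch.nn as nn

    class FiLM(nn.Module):
        def __init__(self, cond_dim: int, n_channels: int):
            super().__init__()
            self.to_scale_shift = nn.Linear(cond_dim, 2 * n_channels)

        def forward(self, h: torch.Tensor, cond: torch.Tensor) -> torch.Tensor:
            # h: (batch, channels, H, W); cond: (batch, cond_dim)
            gamma, beta = self.to_scale_shift(cond).chunk(2, dim=-1)
            return gamma[:, :, None, None] * h + beta[:, :, None, None]

    film = FiLM(cond_dim=32, n_channels=64)
    out = film(torch.randn(8, 64, 16, 16), torch.randn(8, 32))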

The applicability assessment combines quantitative and qualitative evaluation. Quantitative metrics include unbiased mean squared error (UMSE), root mean square error (RMSE), mean absolute error (MAE), and R², while qualitative evaluation focuses on spatial pattern consistency and temporal trend representation. Experimental results indicate that the diffusion-based generative model produces reasonable, spatially coherent, high-resolution soil moisture results and successfully captures major temporal variations. These findings demonstrate the applicability of generative frameworks for geospatial regression and their potential as a geospatial regression modeling paradigm, providing a foundation for further refinement and evaluation.

How to cite: Yu, X., Hu, L., Su, C., Yan, Y., Wu, S., and Du, Z.: Long-term Soil Moisture Downscaling Based on Diffusion Models: Applicability Assessment of Generative Models for Geospatial Regression Tasks, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-5255, https://doi.org/10.5194/egusphere-egu26-5255, 2026.

EGU26-8056 | ECS | PICO | ESSI1.7

Global Sensitivity Analysis of Spatial Interpolation for Sparse, Clustered, and Censored Data: A Case Study of Groundwater Sulfate in the Paris Basin 

Corinna Perchtold, Jeremy Rohmer, Augustin Thomas, Julie Lions, and Martin Wieskotten

This study presents a comprehensive sensitivity analysis framework to disentangle the drivers of predictive uncertainty in spatial interpolation and how they ultimately affect spatial predictions. Developed within a Global Sensitivity Analysis context, the proposed approach is model-independent and generic, allowing for broad application across diverse spatial interpolation workflows.

The framework is demonstrated using groundwater sulfate concentrations in the Paris Basin, a dataset characterised by sparse and highly clustered sampling across six distinct aquifers according to the French "BD LISA" hydrogeological system (https://bdlisa.eaufrance.fr/). We represent the underlying spatial process as a Gaussian Random Field, leveraging Integrated Nested Laplace Approximations for computationally efficient Bayesian inference. This allows for a probabilistic treatment of uncertainty even within complex spatial structures.

We systematically evaluate the impact of several key uncertainty factors related to both data and model configuration: (1) the number of monitoring stations and their spatial distribution; (2) the selection of environmental covariates and the functional form of their effects (linear vs. non-linear); (3) the treatment of censored data (values below detection limits); and (4) structural assumptions regarding the spatial covariance function, specifically the estimation of variogram hyperparameters such as range, sill, and nugget effects, and their prior specification. By propagating these uncertainty sources through our framework, we derive domain-wide aggregated sensitivity measures. These metrics quantify how specific data topologies, including sampling density, clustering effects, and censoring rates, govern the stability and accuracy of the resulting spatial interpolations.
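
In the spirit of such domain-wide sensitivity measures, a generic variance-based GSA sketch using SALib follows. The factors, bounds, and the toy interpolation wrapper are invented for the illustration and do not reflect the study's actual setup.

    # Generic Sobol sensitivity sketch: propagate variogram hyperparameters
    # and a censoring-threshold factor through an interpolation routine and
    # attribute output variance. run_interpolation() is a toy placeholder.
    import numpy as np
    from SALib.sample import saltelli
    from SALib.analyze import sobol

    problem = {
        "num_vars": 3,
        "names": ["range_km", "nugget_ratio", "detection_limit"],
        "bounds": [[5.0, 100.0], [0.0, 0.8], [1.0, 20.0]],
    }

    X = saltelli.sample(problem, 256)   # N*(2D+2) parameter draws

    def run_interpolation(params):
        """Placeholder for one kriging/INLA run returning an aggregated
        prediction metric (e.g. domain-mean squared error)."""
        r, nug, dl = params
        return np.log(r) + 2.0 * nug + 0.1 * dl   # toy response

    Y = np.array([run_interpolation(p) for p in X])
    Si = sobol.analyze(problem, Y)
    print(dict(zip(problem["names"], Si["S1"])))  # first-order indices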

Finally, the results facilitate an in-depth discussion on the limitations of purely probabilistic methods in data-poor scenarios. We provide an outlook on the potential of extra-probabilistic approaches, such as imprecise or interval-based kriging, to more robustly address the wide range of epistemic uncertainties inherent in environmental monitoring.

We acknowledge financial support of the French National Research Agency within the HOUSES project (grant N°ANR-22-CE56-0006).

How to cite: Perchtold, C., Rohmer, J., Thomas, A., Lions, J., and Wieskotten, M.: Global Sensitivity Analysis of Spatial Interpolation for Sparse, Clustered, and Censored Data: A Case Study of Groundwater Sulfate in the Paris Basin, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-8056, https://doi.org/10.5194/egusphere-egu26-8056, 2026.

Predicting building attributes—such as functional classification, socioeconomic status, and energy efficiency—is a fundamental task in urban science. The current paradigm involves leveraging domain knowledge to extract attribute-specific morphological or topological features for supervised modeling. However, this heavy reliance on manual feature engineering often leads to task-specific models where features must be redefined for each attribute. Consequently, the field lacks a unified, generalizable framework capable of multi-attribute building prediction.

Inspired by recent advances in Regression Language Models (RLMs), which cast continuous prediction as a text-to-text task, we propose Buildings as Text (BaT). BaT serializes structured building representations (e.g., GeoJSON) into raw text and enables end-to-end text-to-text regression. To mitigate the spatial sensitivity of building data, we introduce a Topology-Preserved Coordinate (TPC) strategy that removes each building text’s absolute positional information. Specifically, TPC applies a global coordinate shift to the serialized geometry, suppressing absolute-location bias while preserving local shape and topology. By operating directly on raw text, BaT eliminates manual feature engineering and allows the model to learn a “spatial syntax” from the underlying geometric descriptions.
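
The TPC idea of shifting away absolute coordinates before serialization can be illustrated in a few lines with shapely. The bounding-box anchor, the rounding, and the text format are assumptions about one possible realisation, not the paper's exact scheme.

    # Sketch of a topology-preserving coordinate shift before serializing
    # a building footprint as text: subtract a local anchor so absolute
    # location is removed while shape and topology are intact.
    from shapely.geometry import Polygon
    from shapely.affinity import translate

    footprint = Polygon([(103.8512, 1.2903), (103.8519, 1.2903),
                         (103.8519, 1.2911), (103.8512, 1.2911)])

    minx, miny, _, _ = footprint.bounds
    local = translate(footprint, xoff=-minx, yoff=-miny)  # anchor at bbox corner

    serialized = " ".join(f"{x:.6f},{y:.6f}" for x, y in local.exterior.coords)
    print(serialized)  # position-free "building as text" token stream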

We validated the BaT framework through a case study on informal settlement (slum) classification. The results demonstrate that our model achieves superior performance and higher adaptability compared to traditional morphology-based methods. While validated on slum detection, this research offers a universal and scalable paradigm for urban building analysis, suggesting that Large Language Models can effectively "read" urban forms for diverse prediction tasks beyond specific domains.

How to cite: Wang, C.-C. and Luo, P.: Buildings as Text: A Universal Regression Paradigm for Building Attribute Prediction, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-8402, https://doi.org/10.5194/egusphere-egu26-8402, 2026.

EGU26-11193 | ECS | PICO | ESSI1.7

A preliminary study of the morphology and spatial distribution of funerary elements in Oman 

Ana Sofia Meneses Pineda, Marco Solinas, Marco Ramazzotti, Massimo Musacchio, and Maria Fabrizia Buongiorno

The archaeological landscapes of northern Oman host thousands of funerary monuments of different periods and morphologies, forming one of the densest and least explored burial regions of the Arabian Peninsula. Within the framework of LAA&AAS (Laboratorio di Archeologia Analitica e Sistemi Artificiali Adattivi) and MASPAG (Missione Archeologica della Sapienza nella Penisola Arabica e nel Golfo), a multidisciplinary project supported by Sapienza University of Rome and the Italian Ministry of Foreign Affairs, we developed a reproducible geo-AI workflow to classify and analyse funerary structures based on remote-sensing and spatial-context information.

The first dataset, encompassing 185 tombs mapped in the Southwestern Cemetery near the village of Muslimat, in the region of Wadi al-Maʿawil (ca. 70 Km southwest of Muscat) was used to test a machine-learning pipeline designed to discriminate between morphological classes (“tombs” vs “non-tombs”, and within-type subclasses) from high-resolution satellite imagery and derived spatial metrics. Two Random Forest models were compared: a geometry-only baseline using shape descriptors (area, compactness, circularity, elongation), and an extended model incorporating spatial-context features such as kernel density, nearest-neighbour distances, Moran’s I local autocorrelation and cluster membership. The integration of these contextual descriptors increased overall accuracy from 59 % to 76 %, improving model reliability and reducing false positives in morphologically ambiguous contexts. The workflow includes systematic feature importance analysis and confusion-matrix evaluation to assess interpretability and class-imbalance effects.
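
The two-model comparison can be reproduced schematically with scikit-learn, as below. The feature matrices and labels are random placeholders for the real shape and context descriptors, and hyperparameters are defaults rather than the study's tuned setup.

    # Schematic comparison of a geometry-only Random Forest baseline
    # against an extended model with spatial-context features.
    import numpy as np
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import cross_val_score

    rng = np.random.default_rng(42)
    n = 185                                  # tombs mapped in the first dataset
    geometry = rng.normal(size=(n, 4))       # area, compactness, circularity, elongation
    context = rng.normal(size=(n, 4))        # density, NN distance, local Moran's I, cluster
    labels = rng.integers(0, 2, size=n)      # tomb vs non-tomb (placeholder)

    baseline = cross_val_score(RandomForestClassifier(random_state=0),
                               geometry, labels, cv=5).mean()
    extended = cross_val_score(RandomForestClassifier(random_state=0),
                               np.hstack([geometry, context]), labels, cv=5).mean()
    print(f"geometry-only: {baseline:.2f}  geometry+context: {extended:.2f}")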

Beyond the single-site test case, this approach aims to address a broader spatiotemporal challenge: learning and transferring morphological–contextual patterns across different archaeological regions. During the 2025 field campaign (20 October – 20 December 2025), more than 500 new tombs were surveyed and georeferenced in the area of the Western Cemetery, expanding the available dataset and enabling large-scale testing of model scalability and transferability. This new phase will assess whether models trained in Wadi al-Maʿawil can generalize to nearby valleys with comparable geomorphological and cultural settings, supporting semi-automated mapping and predictive modelling of funerary features.

The presented pipeline, implemented in an open-source environment (Python, QGIS, and scikit-learn), is designed for reproducibility and transparent parameter tracking. All processing steps—from data preparation and feature extraction to model training and evaluation—are logged and versioned, facilitating cross-project reuse. The workflow thus bridges archaeological and geospatial domains, demonstrating how spatially aware machine learning can improve the detection, classification, and interpretation of complex cultural landscapes.

This contribution highlights the potential of AI and ML in managing spatiotemporal archaeological data and in advancing reproducible analytical frameworks. The methodological approach developed for the Omani funerary landscapes can be generalized to other MASPAG regions, supporting comparative analysis of desert landscapes and long-term dynamics of human–environment interaction across the Arabian Peninsula.

How to cite: Meneses Pineda, A. S., Solinas, M., Ramazzotti, M., Musacchio, M., and Buongiorno, M. F.: A preliminary study of the morphology and spatial distribution of funerary elements in Oman, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-11193, https://doi.org/10.5194/egusphere-egu26-11193, 2026.

EGU26-11342 | PICO | ESSI1.7

Using Moran's I for assessing residual spatial autocorrelation in machine learning models  

Jakub Nowosad, Hanna Meyer, and Jonas Schmidinger

Understanding the spatial dependence of residuals is important for interpreting and diagnosing spatial machine learning models. Spatial autocorrelation in the residuals suggests that the model has not fully captured the data's spatial structure. This may imply that the model is missing crucial spatial context or interactions, and that, in effect, it is spatially biased, leading to underestimation in some areas and overestimation in others.

Moran's I is a commonly used statistic for the diagnosis of spatial autocorrelation in spatial predictions, providing a single-value quantitative measure with a straightforward interpretation. This measure quantifies the degree of spatial autocorrelation, indicating whether similar values are clustered together or dispersed across space. The information provided by Moran's I has been used in various ways in studies applying machine learning: to evaluate model performance, interpret results, understand model limitations, and compare different modeling approaches.

Unlike standard model performance metrics, such as R² or RMSE, Moran's I depends not only on the values of residuals but also on the spatial context, especially the study area's extent, the sampling strategy used, and the specification of spatial weights. However, there is a lack of a comprehensive understanding of how these factors influence the results of Moran's I calculation in the context of spatial machine learning, and of how to best use this measure for model evaluation and comparison.

Using simulated data with controlled spatial properties, we investigated how testing set size, sampling strategy, and the specification of spatial weights influence Moran's I computed on model residuals. Our results show that Moran's I, calculated based on a k-nearest-neighbors approach, primarily reflects the spatial structure of values in the testing set rather than the residual autocorrelation across the full prediction domain, often underestimating fine-scale spatial patterns. These findings have various implications: weight-matrix definitions must be clearly reported, calculations on sparsely distributed or clustered samples should be avoided, Moran's I is generally not directly comparable across studies due to differences in spatial extents and sampling, and its values are inherently scale-dependent.
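
A minimal sketch of this kind of diagnostic, assuming the libpysal/esda stack and synthetic residuals; the loop over k illustrates the sensitivity to the weight specification discussed above:

    # Sketch: Moran's I of model residuals under k-nearest-neighbour weights.
    import numpy as np
    from libpysal.weights import KNN
    from esda.moran import Moran

    rng = np.random.default_rng(0)
    coords = rng.random((500, 2)) * 100.0   # test-set sample locations
    residuals = rng.normal(size=500)        # observed minus predicted

    for k in (4, 8, 16):                    # the weight specification matters
        w = KNN.from_array(coords, k=k)
        w.transform = "r"                   # row-standardise the weights
        mi = Moran(residuals, w, permutations=999)
        print(f"k={k:2d}  Moran's I={mi.I:+.3f}  pseudo p={mi.p_sim:.3f}")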

With this contribution, we aim to present the behavior of Moran's I calculated from residuals of spatial machine learning models under different conditions, outline best practices for selecting and reporting spatial weights, and discuss how to interpret Moran’s I.

How to cite: Nowosad, J., Meyer, H., and Schmidinger, J.: Using Moran's I for assessing residual spatial autocorrelation in machine learning models, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-11342, https://doi.org/10.5194/egusphere-egu26-11342, 2026.

EGU26-12275 | ECS | PICO | ESSI1.7

Area of Applicability for Deep Learning: Exploring Latent Space Geometry of Earth Observation Models 

Darius A. Görgen, Simon Heilig, Lara Meyn-Grünhagen, Asja Fischer, Johannes Lederer, and Hanna Meyer

Machine learning methods are used ubiquitously within the Earth Sciences to model spatio-temporal phenomena. These methods scale very well to big data sets and are used to model complex non-linear relationships between the predictor and outcome variables. Yet, most methods might silently fail when used in extrapolation scenarios, e.g. when combinations of predictor variables are encountered that have not been seen during training. This might be the case when the model is applied to new geographic areas that differ from the areas the model was trained on. For traditional machine learning models, estimating the area of applicability based on distances in the predictor space has been proposed. New inputs with distances above a certain threshold are rejected from prediction since our confidence in the model's output is low and we do not expect the estimated performance to hold.

Inspired by the success of deep architectures in the field of computer vision, the use of deep neural networks has been steadily increasing, especially in Earth Observation. Translating the concept of the area of applicability to deep architectures, however, remains an open research challenge. For the safe deployment of such models in the real world, it is necessary to flag inputs for which we expect the model to extrapolate and thus to operate outside the estimated performance measure.

In this work, we extend the concept of the area of applicability to deep neural network architectures. As an application rooted in current practices for Earth Observation, we use networks trained end-to-end for scene classification. We use these models as feature extractors to obtain representations of input samples in embedding space. We derive the area of applicability of the model within this space based on distances between training and calibration samples. For this purpose, we test different distance measures (Euclidean, Mahalanobis), leveraging the concept of kNN distances, which also takes local point densities into account, and we test whether principal components of the embeddings improve the delineation of the area of applicability.
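
A schematic of this procedure, under the assumption of a simple mean-kNN-distance rule with a quantile threshold calibrated on held-out samples (all arrays are synthetic placeholders):

    # Sketch: area-of-applicability check in embedding space. A new input is
    # flagged when its distance to the training embeddings exceeds a threshold
    # calibrated on held-out samples.
    import numpy as np
    from sklearn.neighbors import NearestNeighbors

    rng = np.random.default_rng(1)
    train_emb = rng.normal(size=(2000, 256))  # feature-extractor outputs
    calib_emb = rng.normal(size=(500, 256))
    new_emb = rng.normal(size=(100, 256))

    k = 10
    nn = NearestNeighbors(n_neighbors=k).fit(train_emb)  # Euclidean by default

    # Mean kNN distance accounts for local point density, as described above
    calib_d = nn.kneighbors(calib_emb)[0].mean(axis=1)
    threshold = np.quantile(calib_d, 0.95)  # assumed calibration rule

    new_d = nn.kneighbors(new_emb)[0].mean(axis=1)
    inside_aoa = new_d <= threshold
    print(f"{inside_aoa.mean():.0%} of new samples inside the AOA")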

Our results highlight practically relevant trade-offs between different distance metrics operating in high-dimensional embedding spaces to derive the area of applicability for deep neural networks. The methodology presented can serve as a baseline to ensure the reliability of deployed models in safety-critical applications.

How to cite: Görgen, D. A., Heilig, S., Meyn-Grünhagen, L., Fischer, A., Lederer, J., and Meyer, H.: Area of Applicability for Deep Learning: Exploring Latent Space Geometry of Earth Observation Models, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-12275, https://doi.org/10.5194/egusphere-egu26-12275, 2026.

Near-real-time forest monitoring is a critical component for managing climate risk and resource conservation in the Brazilian Legal Amazon (ALB). The DETER program, managed by the National Institute for Space Research (INPE), has played a pivotal role for over two decades by producing warnings of deforestation and forest degradation to support environmental enforcement by agencies such as IBAMA. However, the effectiveness of these warnings is highly dependent on temporal efficiency — the speed at which a disturbance is detected and published after the activity begins.

Recent objective evaluations of DETER’s performance regarding selective logging — a major driver of forest degradation — revealed a significant median delay of approximately 312 days between the start of logging activities and the corresponding warning publication during 2022-2023. This temporal gap highlights the challenge of applying traditional monitoring to complex spatiotemporal datasets, where factors like cloud cover and sensor resolution can hinder early detection.

To address these challenges, this research proposes a novel approach within the framework of AI and Machine Learning in Spatiotemporal Contexts. We leverage Foundation Models and Deep Learning architectures designed to process the complex temporal dynamics of tropical forests using Harmonized Landsat-Sentinel (HLS) time series. A key contribution of using foundation models in this pipeline is their ability to learn robust representations from large-scale data, significantly reducing the requirement for vast volumes of manually annotated samples — a known bottleneck for AI-based remote sensing monitoring systems. By applying these models to HLS data, we aim to improve spatiotemporal predictions and the reliability of the modeling pipeline, facilitating the production of more agile and efficient early warnings.

This work contributes to the development of the next generation of forest monitoring systems, focusing on interpretability and transferability across the Amazonian landscape. By reducing the detection lag of selective logging, this approach seeks to enhance technological sovereignty in environmental monitoring and provide more effective decision-making support for forest preservation.

How to cite: Taquary, E. and Aragão, L.: Enhancing Near-Real-Time Forest Monitoring: Foundation Models and Harmonized Landsat-Sentinel (HLS) Time Series for Selective Logging Detection, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-16001, https://doi.org/10.5194/egusphere-egu26-16001, 2026.

EGU26-19025 | ECS | PICO | ESSI1.7

A framework for assessing the quality of spatial data applied in supervised image classification of deprived urban areas 

Florencio Campomanes V, Monika Kuffer, Alfred Stein, Anne M. Dijkstra, Lorraine Trento Oliveira, and Mariana Belgiu

The integration of Earth Observation (EO) data with machine learning (ML) has transformed the mapping of Deprived Urban Areas (DUA). Despite these technical advances, a persistent disconnect remains between research outputs and their operational uptake by local stakeholders. In parallel, advances in ML and deep learning (DL), together with new satellite missions, have improved the extraction of building footprints and urban morphology. Nevertheless, DUA mapping studies, which largely depend on these physical indicators, often prioritize benchmark performance over the robustness, transparency, or usability required in real-world decision-making contexts. One of the main reasons for this gap is spatial data quality (SDQ), which fundamentally limits model performance and generalization. When data quality is poor, due to inaccuracies, incompleteness, or inadequate provenance, models become unreliable, regardless of architectural complexity. Furthermore, many studies rely on validation strategies that ignore spatial autocorrelation, thereby yielding overoptimistic accuracy estimates that mask poor generalization to new local contexts.

To address these challenges, this paper argues for a shift toward a systematic assessment of spatial data quality. We first conduct a scoping review of 50 state-of-the-art DUA mapping studies published between 2017 and 2025. Our analysis reveals a high dependence on very-high-resolution imagery (72%), a widespread lack of publicly accessible data and code (92%), and a critical deficiency in operationalizing semantic definitions of DUAs, with 90% of studies failing to provide mapping rules (for visual interpretation) or ground rules (for in-situ collection). Most studies also fail to assess user needs (90%) or do not consider the ethical implications of using DUA data (88%), which is highly sensitive due to risks such as forced evictions. Building on these findings and established international standards from ISO and the OGC, we propose a comprehensive Spatial Data Quality (SDQ) framework tailored to transparently document supervised image classification in DUA mapping. This framework integrates established practices such as adherence to the Findable, Accessible, Interoperable, Reusable (FAIR) principles and assessment of acquisition, measurement and spatial-temporal quality with novel dimensions addressing semantic consistency, sampling representativeness, human factors in annotation, learning shortcut risk, user needs validity, ethical considerations, and transparent reporting of the dataset’s potential failure modes or uncertainties. By operationalizing SDQ as a living, extensible framework, this work aims to better align advances in ML and DL with sustained societal impact, ensuring that DUA mapping products, or any relevant application domain, are fit for use by local communities and decision-makers.

How to cite: Campomanes V, F., Kuffer, M., Stein, A., Dijkstra, A. M., Trento Oliveira, L., and Belgiu, M.: A framework for assessing the quality of spatial data applied in supervised image classification of deprived urban areas, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-19025, https://doi.org/10.5194/egusphere-egu26-19025, 2026.

EGU26-19057 | ECS | PICO | ESSI1.7

Satellite imagery for greenhouse mapping in Morocco using U-net model 

Said El hachemy, Chaima Aglagal, Hamza Ait-Ichou, Ilham Elhaid, Jawad Zlaiga, Mohammed Hssaisoune, Lhoussaine Bouchaou, and Salwa Belaqziz

Greenhouse agriculture has become a crucial element of agricultural practices in Morocco, yet its spatial and temporal evolution remains insufficiently quantified. This study aims to map greenhouse structures at the Souss-Massa region scale in order to assess the progress of covered agriculture and examine its relationship with socio-economic development in Morocco. Using hand-annotated greenhouse data from the Chtouka region as ground truth, we develop a deep learning–based detection framework relying exclusively on open-source tools. Multispectral Sentinel-2 satellite imagery at 10 m spatial resolution is used as input to a U-Net convolutional neural network, which is trained, validated, and tested for greenhouse segmentation. The proposed model achieves an overall accuracy of up to 94%, demonstrating strong generalization capability. The resulting plug-and-play methodology enables scalable, cost-effective, and open-source greenhouse mapping, and provides valuable insights into the dynamics of covered agriculture and its role in Morocco’s agricultural and socio-economic development.
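
A compact sketch of this kind of network, assuming a two-level U-Net in PyTorch with 10-band input; the layer widths and patch size are illustrative, not the study's configuration:

    # Sketch: minimal two-level U-Net for binary greenhouse segmentation from
    # 10-band Sentinel-2 chips.
    import torch
    import torch.nn as nn

    def block(c_in, c_out):
        return nn.Sequential(
            nn.Conv2d(c_in, c_out, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(c_out, c_out, 3, padding=1), nn.ReLU(inplace=True))

    class TinyUNet(nn.Module):
        def __init__(self, bands=10):
            super().__init__()
            self.enc1 = block(bands, 32)
            self.enc2 = block(32, 64)
            self.pool = nn.MaxPool2d(2)
            self.up = nn.ConvTranspose2d(64, 32, 2, stride=2)
            self.dec1 = block(64, 32)           # 32 skip + 32 upsampled
            self.head = nn.Conv2d(32, 1, 1)     # greenhouse/background logit

        def forward(self, x):
            e1 = self.enc1(x)
            e2 = self.enc2(self.pool(e1))
            d1 = self.dec1(torch.cat([self.up(e2), e1], dim=1))
            return self.head(d1)

    model = TinyUNet()
    chips = torch.randn(4, 10, 128, 128)        # batch of Sentinel-2 patches
    logits = model(chips)                       # shape (4, 1, 128, 128)
    loss = nn.BCEWithLogitsLoss()(logits, torch.zeros_like(logits))
    print(logits.shape, float(loss))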

How to cite: El hachemy, S., Aglagal, C., Ait-Ichou, H., Elhaid, I., Zlaiga, J., Hssaisoune, M., Bouchaou, L., and Belaqziz, S.: Satellite imagery for greenhouse mapping in Morocco using U-net model, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-19057, https://doi.org/10.5194/egusphere-egu26-19057, 2026.

EGU26-19452 | ECS | PICO | ESSI1.7

Improving Dengue Forecasting with Spatiotemporal Data Augmentation and Machine Learning 

Negar Siabi, Rackhun Son, Maik Thomas, Christopher Irrgang, and Jan Saynisch Wagner

Accurate forecasting of vector-borne diseases such as dengue is often challenged by limited and noisy spatiotemporal data. This study evaluates the effectiveness of data augmentation techniques in enhancing the robustness and predictive accuracy of machine learning models. We assess multiple augmentation strategies applied to weekly dengue case data across countries in South and Central America (2014–2022). Results show that augmentation substantially improves short-term forecasting performance, particularly in regions with sparse or irregular observations, yielding higher R² values and lower relative errors compared to non-augmented baselines. These findings demonstrate that well‑designed augmentation can mitigate data scarcity and strengthen the generalization of graph‑based deep learning frameworks for epidemiological forecasting. Overall, the study highlights augmentation as a practical and scalable approach for improving spatiotemporal ML applications in disease surveillance.
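
Two simple augmentation strategies of the kind evaluated here can be sketched as follows; the jitter strength, window width, and stride are illustrative assumptions, not the study's settings:

    # Sketch: jittering and window slicing for weekly case-count series.
    import numpy as np

    rng = np.random.default_rng(7)
    cases = rng.poisson(lam=50, size=(1, 416)).astype(float)  # ~8 years weekly

    def jitter(x, sigma=0.05):
        """Multiplicative noise keeps counts non-negative."""
        return x * (1.0 + sigma * rng.normal(size=x.shape))

    def window_slices(x, width=104, stride=13):
        """Overlapping sub-series multiply the effective training set."""
        return np.stack([x[0, s:s + width]
                         for s in range(0, x.shape[1] - width + 1, stride)])

    augmented = jitter(cases)
    slices = window_slices(cases)
    print(augmented.shape, slices.shape)  # (1, 416) and (n_windows, 104)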

How to cite: Siabi, N., Son, R., Thomas, M., Irrgang, C., and Saynisch Wagner, J.: Improving Dengue Forecasting with Spatiotemporal Data Augmentation and Machine Learning, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-19452, https://doi.org/10.5194/egusphere-egu26-19452, 2026.

EGU26-19669 | ECS | PICO | ESSI1.7

A Two‑Stream Spatiotemporal Architecture with Foundation‑Model Features Applied to Crop Classification 

João Gabriel Vinholi, Rim Sleimi, Florian Werner, and Albert Abelló

At continental scale, crop classification needs models that capture phenology through temporal analysis without degrading field boundaries. We introduce a decoupled architecture that uses static foundation‑model features across multi‑sensor time series and fuses them with high‑resolution spatial features. The temporal stream ingests paired multispectral and SAR sequences plus a static DEM and metadata, extracts foundation model token features per timestep, and compresses them with a Perceiver‑style bottleneck that cross attends from a fixed latent bank to the full foundation model token volume. Such heavy compression collapses sequence length by orders of magnitude, which makes longer temporal windows and larger batches ingestible on consumer‑grade GPU memory constraints while preserving the temporal signatures needed to separate crops with similar single‑date appearance.

The spatial stream stays purely static: it selects a single high‑quality multispectral reference frame and passes it through a high‑resolution backbone to retain fine geometry and crisp boundaries. The two streams are joined in a query‑based decoder, where dynamic queries generated from the compressed temporal latents attend to multi‑scale spatial features, aligning phenological signatures with precise field edges. This fusion mechanism prevents coarse temporal features from blurring geometry and makes delineation robust to shifts in timing or crop management practice. Temporal queries encode crop‑specific growth signatures, the spatial stream supplies the pixel‑level evidence for boundary localization, and the decoder enforces instance‑aware segmentation through iterative cross‑attention and masked refinement.

We evaluate on EuroCrops crop‑class labels, achieving a Micro Recall of 84.1% and a Segmentation Quality of 84.2%. Transferability is tested with a spatial holdout protocol using geographically disjoint train/test regions; reliability is summarized by aggregate metrics on these strict splits, and uncertainty is communicated through per‑class performance variability and label‑noise sensitivity analyses that bound achievable scores.
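
A hedged PyTorch sketch of such a latent-bank bottleneck; the dimensions, latent count, and single attention layer are illustrative assumptions rather than the authors' architecture:

    # Sketch: a small bank of learned latents cross-attends to the full token
    # volume, collapsing sequence length by orders of magnitude.
    import torch
    import torch.nn as nn

    class LatentBottleneck(nn.Module):
        def __init__(self, dim=256, n_latents=64, n_heads=8):
            super().__init__()
            self.latents = nn.Parameter(torch.randn(n_latents, dim) * 0.02)
            self.attn = nn.MultiheadAttention(dim, n_heads, batch_first=True)
            self.norm = nn.LayerNorm(dim)

        def forward(self, tokens):                      # (B, T*N, dim)
            q = self.latents.unsqueeze(0).expand(tokens.size(0), -1, -1)
            out, _ = self.attn(q, tokens, tokens)       # latents attend to tokens
            return self.norm(out + q)                   # (B, n_latents, dim)

    # 30 timesteps x 1024 foundation-model tokens -> 64 latent vectors
    tokens = torch.randn(2, 30 * 1024, 256)
    print(LatentBottleneck()(tokens).shape)             # torch.Size([2, 64, 256])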

How to cite: Vinholi, J. G., Sleimi, R., Werner, F., and Abelló, A.: A Two‑Stream Spatiotemporal Architecture with Foundation‑Model Features Applied to Crop Classification, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-19669, https://doi.org/10.5194/egusphere-egu26-19669, 2026.

EGU26-20221 | ECS | PICO | ESSI1.7

Using Large Language Models to Enhance Spatial Data Discovery in Spatial Data Infrastructures 

James Okemwa Ondieki, Matthes Rieke, and Simon Jirka

Spatial Data Infrastructures (SDIs) contain large volumes of spatial data from various organizations and data producers. Metadata is intended to enable the discovery of the data, yet finding the relevant data can be challenging. The challenges include rigid keyword search, complex search interfaces in geoportals, map-based search that requires some geographic knowledge, and language differences between user queries and the metadata.

The development of Large Language Models (LLMs) offers new opportunities to improve spatial data discovery. LLMs demonstrate strong language understanding and generation capabilities and have been used in information retrieval tasks. They can overcome semantic differences and language barriers between user queries and the needed information. However, their internal knowledge is limited and they are prone to hallucinations. Unless the datasets in SDIs, or the web pages describing them, are indexed by search engines, LLMs with internet search tools cannot find them.

Retrieval-Augmented Generation (RAG) offers a solution for the knowledge limitations by connecting an LLM with an external and up-to-date knowledge base. However, RAG mainly works in the textual domain and excels at retrieving external information that is semantically relevant to a user query. Queries for geographic data have a spatial aspect, yet the spatial reasoning capabilities of LLMs are limited. For a query like “forest data for Vienna”, RAG can identify the relevant forest data from a pool of metadata, regardless of the language or words used to describe the data. However, identifying datasets that meet the spatial intent is a problem. DCAT metadata, the most popular metadata standard, defines the spatial extent of spatial datasets using bounding box coordinates or as links to gazetteers. Naive RAG is based on semantic similarity approaches. An LLM can identify “Vienna” as a location, but would struggle to identify datasets relevant to the location, as there is little semantic similarity between the location name and coordinate digits or gazetteer links. There is thus a need to incorporate spatial indexing techniques for improved spatial reasoning.
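
A toy sketch of such combined retrieval, assuming a mocked gazetteer lookup and a stand-in embedding function (neither reflects the authors' system):

    # Sketch: semantic retrieval plus an explicit bounding-box filter, so that
    # "forest data for Vienna" only returns datasets whose spatial extent
    # intersects the resolved location.
    import numpy as np
    from shapely.geometry import box

    GAZETTEER = {"Vienna": box(16.18, 48.12, 16.58, 48.32)}  # lon/lat bbox

    datasets = [
        {"title": "Forest inventory Austria", "bbox": box(9.5, 46.4, 17.2, 49.0)},
        {"title": "Forest cover Norway",      "bbox": box(4.6, 57.9, 31.1, 71.2)},
    ]

    def embed(text):                     # deterministic stand-in embedding
        rng = np.random.default_rng(sum(map(ord, text)))
        v = rng.normal(size=64)
        return v / np.linalg.norm(v)

    def search(query, location):
        q = embed(query)
        hits = []
        for d in datasets:
            if not d["bbox"].intersects(GAZETTEER[location]):
                continue                 # the spatial index prunes mismatches
            hits.append((float(q @ embed(d["title"])), d["title"]))
        return sorted(hits, reverse=True)

    print(search("forest data", "Vienna"))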

With our contribution we present an approach that combines LLMs, RAG, and spatial indexing techniques to overcome existing challenges in discovering spatial data in SDIs, and improve spatial data discovery through natural language queries.

How to cite: Ondieki, J. O., Rieke, M., and Jirka, S.: Using Large Language Models to Enhance Spatial Data Discovery in Spatial Data Infrastructures, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-20221, https://doi.org/10.5194/egusphere-egu26-20221, 2026.

The recent rise of foundation models in Earth Observation (EO) has reshaped how remote sensing tasks are approached, particularly by allowing strong downstream performance with comparatively limited labeled data. These models have achieved impressive results in applications such as land cover classification and semantic segmentation. However, performance gains alone do not resolve a central concern: whether the resulting predictions can be trusted. In practical EO scenarios—including disaster response and environmental monitoring—miscalibrated confidence estimates may lead to incorrect decisions even when overall accuracy appears high.

Motivated by this gap between accuracy and reliability, this study focuses on the uncertainty calibration behaviour of fine-tuned EO foundation models. Using TorchGeo for consistent data handling and the Lightning-UQ-Box framework for uncertainty quantification, we construct an evaluation pipeline that contrasts Vision Transformer–based pretrained models with conventional convolutional neural networks trained from scratch. Experiments are conducted across both image classification tasks (e.g., EuroSAT) and dense prediction settings such as semantic segmentation.

Rather than assuming superior representations automatically yield better-calibrated predictions, we explicitly examine how calibration properties change after fine-tuning large pretrained models. In addition, we evaluate a spectrum of uncertainty quantification approaches, from lightweight post-hoc methods like temperature scaling to more computationally demanding techniques, including Monte Carlo Dropout, deep ensembles, and Laplace approximation. Calibration quality is assessed using expected calibration error and reliability diagrams, alongside predictive accuracy.
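
The two lightest ingredients named above, expected calibration error and temperature scaling, can be sketched on synthetic logits as follows (the bin count and optimizer bounds are arbitrary choices, not the study's):

    # Sketch: ECE and post-hoc temperature scaling on synthetic logits.
    import numpy as np
    from scipy.optimize import minimize_scalar
    from scipy.special import softmax

    rng = np.random.default_rng(3)
    logits = rng.normal(size=(1000, 10)) * 3.0      # an over-confident model
    labels = rng.integers(0, 10, 1000)

    def ece(probs, labels, n_bins=15):
        conf = probs.max(axis=1)
        pred = probs.argmax(axis=1)
        bins = np.linspace(0.0, 1.0, n_bins + 1)
        total = 0.0
        for lo, hi in zip(bins[:-1], bins[1:]):
            m = (conf > lo) & (conf <= hi)
            if m.any():
                total += m.mean() * abs((pred[m] == labels[m]).mean() - conf[m].mean())
        return total

    def nll(T):  # negative log-likelihood of temperature-scaled probabilities
        p = softmax(logits / T, axis=1)
        return -np.log(p[np.arange(len(labels)), labels] + 1e-12).mean()

    T = minimize_scalar(nll, bounds=(0.05, 10.0), method="bounded").x
    print("ECE before:", ece(softmax(logits, axis=1), labels))
    print(f"ECE after T={T:.2f}:", ece(softmax(logits / T, axis=1), labels))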

By analysing the trade-offs between computational cost, accuracy, and calibration, this work provides practical insight into which UQ strategies are most effective for EO foundation models. Our findings aim to support the deployment of remote sensing systems in operational settings where reliable uncertainty estimates are as critical as raw predictive performance.

How to cite: Wei, Y.: Uncertainty Quantification for Earth Observation Foundation Models, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-20813, https://doi.org/10.5194/egusphere-egu26-20813, 2026.

Localized rapid urban expansion and global climate change have contributed to dynamic modifications of land use and land cover (LULC), which are further linked to changes in land surface temperature (LST). This study proposes an integrated machine learning (ML) approach for assessing decadal LULC changes and predicting future ones in a city in the Mekong region. To achieve accurate LULC maps, object-based classification strategies were implemented using various ML techniques across the observed years, with four main land cover categories: built-up areas, water bodies, paddy fields/shrubs, and orchards, together with LST extraction. The findings reveal that the Random Forest classifier outperforms the other classifiers, achieving the best overall accuracy of 81%. There have been substantial land-use changes, with the percentage of developed areas rising from 8% in 2014 to approximately 12% in 2024. Urbanization is correlated with rising temperatures, while vegetation helps alleviate this heat by providing shade and cooling. With an overall accuracy of 85% in the patch-generating land use simulation (PLUS) model, by 2030, under the impacts of both natural and socio-economic drivers, an apparent increase in the proportion of built-up areas to 15% and a slight variation in the other categories could be seen, in line with planning objectives. Urban expansion is clearest in the highly dense districts, with an increase to 42% by 2030 from merely 27% in 2014. The primary forecast LULC conversions were vegetated lands transforming into construction areas for urbanization, while agricultural practices are maintained for food security. The integrated approach has proven its suitability for evaluating and optimizing intricate land-use patterns.

How to cite: Nguyen, L. and Daou, D.: Harnessing an Integrated Machine Learning based Approach in Monitoring and Predicting Dynamic Spatiotemporal Land Use and Land Cover Changes. A case study in a Mekong city, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-21002, https://doi.org/10.5194/egusphere-egu26-21002, 2026.

EGU26-21265 | PICO | ESSI1.7

Spatially-explicit uncertainty assessment of ecosystem extent mapping 

Polina Tregubova, Sylvie Clappe, Ida Marielle Mienna, Bruno Smets, Marcel Buchhorn, Ruben Remelgado, and Carsten Meyer

Ecosystems are a key component of biodiversity, providing vital services to humans and the economy. Anthropogenic pressures driving environmental change result in widespread ecosystem degradation and loss. The area and spatial distribution of ecosystem types, referred to as ecosystem extent, provide a critical entry point for assessing ecosystem condition, functioning, and associated services, and therefore require detailed and spatially explicit monitoring.

Despite advances in geospatial analysis, consistent mapping and delineation of ecosystem extent remain challenging. Map products on ecosystem extent should, therefore, be supported by uncertainty assessments, ideally in a spatially explicit manner. According to best practices in related fields, the minimum requirement for uncertainty quantification for thematic maps is the aggregated estimation of per-class accuracy and per-class area uncertainty, following a validation procedure based on independent reference data. However, the standard practice remains spatially implicit. To date, there is no established practice for spatially explicit uncertainty quantification procedures.

This study presents a spatial solution for estimating the uncertainty of maps produced using machine-learning algorithms. The approach builds on the standard map-validation procedure and extends it to pixel-wise assessments using conformal prediction. While conformal prediction can be applied to any machine learning algorithm, ecosystem extent mapping poses domain-specific challenges, including a high-dimensional multi-class setting and hierarchical class structures. This study, therefore, focuses on developing solutions to ensure robust class-specific coverage, exploring different conformal prediction implementation variants, and adapting them from flat to hierarchical mapping scenarios.
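
One route to such robust class-specific coverage is class-conditional (Mondrian) split conformal prediction; a minimal sketch on synthetic classifier outputs follows (this is not the study's implementation):

    # Sketch: class-conditional split conformal prediction sets.
    import numpy as np

    rng = np.random.default_rng(5)
    n_cal, n_classes, alpha = 2000, 8, 0.1
    cal_probs = rng.dirichlet(np.ones(n_classes), n_cal)  # stand-in classifier
    cal_labels = rng.integers(0, n_classes, n_cal)

    # Nonconformity score: 1 - probability of the true class
    scores = 1.0 - cal_probs[np.arange(n_cal), cal_labels]

    # One quantile per class targets coverage within each class
    q = np.empty(n_classes)
    for c in range(n_classes):
        s = np.sort(scores[cal_labels == c])
        k = int(np.ceil((len(s) + 1) * (1 - alpha))) - 1
        q[c] = s[min(k, len(s) - 1)]

    def prediction_set(probs):
        """All classes whose score falls below that class's own threshold."""
        return np.where(1.0 - probs <= q)[0]

    print(prediction_set(rng.dirichlet(np.ones(n_classes))))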

To assess the feasibility and applicability of our approach, we tested it on the Oslo-Viken municipality in Norway. In this case study, we developed an ecosystem extent map for 2024 and quantified and mapped its uncertainty at pixel-level. This analysis helped to evaluate the practical application and performance of the approach on real-world cases.

How to cite: Tregubova, P., Clappe, S., Marielle Mienna, I., Smets, B., Buchhorn, M., Remelgado, R., and Meyer, C.: Spatially-explicit uncertainty assessment of ecosystem extent mapping, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-21265, https://doi.org/10.5194/egusphere-egu26-21265, 2026.

EGU26-1994 | ECS | Posters on site | ESSI1.8

DT-Agro, a digital twin of the Greek AgroHydroSystem, development and testing 

Stergia Palli Gravani, Konstantinos Soulis, Xenofon Soulis, and Dionissios Kalivas

Developing operational Digital Twins for large-scale agro-hydrological systems presents significant challenges regarding data heterogeneity, computational efficiency, and the integration of Earth Observation (EO) with process-based modeling. This study presents the development and initial testing of DT-Agro, a spatially explicit Digital Twin of the Greek agro-hydro-system, designed to support sustainable water management and agricultural planning at the national scale.

DT-Agro integrates a high-resolution spatial database, a hybrid meteorological forcing scheme, and a distributed agro-hydrological model (AgroHydroLogos) recoded in C++ and Python for enhanced performance. A key innovation in the process simulation is the development of a novel, impervious-aware SCS-CN formulation. Unlike traditional lumped approaches, this method explicitly decomposes each grid cell into pervious and impervious fractions using high-resolution Copernicus Imperviousness Density data. This allows for a physically consistent representation of runoff generation in mixed landscapes, capturing the hydraulic response of small impervious patches that are often lost in standard gridded models.
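
A minimal sketch of such an impervious-aware SCS-CN computation, with illustrative curve numbers and a synthetic imperviousness field (the actual parameterization belongs to the model):

    # Sketch: each cell's runoff is the area-weighted sum of a pervious and an
    # impervious SCS-CN response.
    import numpy as np

    def scs_runoff(P, CN):
        """Event runoff depth (mm) for rainfall P (mm) and curve number CN."""
        S = 25400.0 / CN - 254.0        # potential retention (mm)
        Ia = 0.2 * S                    # initial abstraction
        return np.where(P > Ia, (P - Ia) ** 2 / (P - Ia + S), 0.0)

    P = np.full((4, 4), 60.0)                        # a 60 mm rainfall event
    f_imp = np.random.default_rng(2).random((4, 4))  # Copernicus-style fraction
    CN_perv, CN_imp = 70.0, 98.0                     # assumed curve numbers

    Q_cell = f_imp * scs_runoff(P, CN_imp) + (1 - f_imp) * scs_runoff(P, CN_perv)
    print(Q_cell.round(1))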

Furthermore, to address the chronic fragmentation of ground-based monitoring networks, the system introduces a "virtual station" meteorological framework. Recognizing that raw global reanalysis products (e.g., AgERA5) often exhibit significant biases in Greece’s complex terrain, we developed a hybrid correction workflow. AgERA5 time series are sampled at the locations of historical stations and bias-corrected using station-specific regressions. This creates a network of "virtual stations" that provide continuous, homogenized daily records, filling temporal gaps while preserving local climatological characteristics. These records drive a dynamic spatial interpolation scheme that accounts for temperature and precipitation gradients, ensuring physically consistent meteorological forcing across the national domain.
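
The virtual-station idea can be sketched as follows, assuming a simple station-specific linear regression on the overlap period (the series are synthetic; the operational workflow is more elaborate):

    # Sketch: sample the reanalysis at a station location, fit a regression
    # against the overlapping observed record, apply it to the full series.
    import numpy as np
    from sklearn.linear_model import LinearRegression

    rng = np.random.default_rng(4)
    agera5 = rng.normal(15, 8, 4000)                   # daily series at the cell
    obs_overlap = 0.9 * agera5[:1500] + 1.5 + rng.normal(0, 1.0, 1500)

    reg = LinearRegression().fit(agera5[:1500, None], obs_overlap)
    virtual_station = reg.predict(agera5[:, None])     # continuous, gap-free

    print(f"bias before: {np.mean(agera5[:1500] - obs_overlap):+.2f}")
    print(f"bias after:  {np.mean(virtual_station[:1500] - obs_overlap):+.2f}")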

We present results from the initial national-scale application of the system. The testing phase focused on quantifying irrigation water abstractions and their spatial-temporal drivers. Initial simulations estimate the long-term average national irrigation abstraction at approximately 6,600 hm³/year, with significant inter-annual variability (6,000–7,800 hm³) driven by climatic conditions. Validation against theoretical net irrigation requirements for major crops (maize, cotton, alfalfa) yielded consistent depths (380–420 mm), confirming the biophysical realism of the model core. These results demonstrate DT-Agro’s capability to provide a robust, evolving representation of the Greek agro-hydro-system for climate adaptation planning.

How to cite: Palli Gravani, S., Soulis, K., Soulis, X., and Kalivas, D.: DT-Agro, a digital twin of the Greek AgroHydroSystem, development and testing, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-1994, https://doi.org/10.5194/egusphere-egu26-1994, 2026.

EGU26-5404 | ECS | Posters on site | ESSI1.8

OBSALL: DestinE climate simulations in observation space 

Clément Bouvier, Lauri Tuppi, Jouni Räisänen, and Heikki Järvinen

Destination Earth has given rise to a new generation of global climate models capable of resolving kilometre-scale processes. These storm-resolving models are deployed operationally to produce multidecadal climate simulations within the Climate Adaptation Digital Twin. OBSALL conceptualises and implements an Earth observation-based system for monitoring the quality of the climate simulations. Technically, it is a one-pass algorithm in which observation models operate online on the state vector of the simulation model and generate a full-resolution trace in the observation space. This enables real-time monitoring of the simulation and posterior evaluation right after its completion, thus enhancing the overall resilience of the Climate DT workflow.

OBSALL has the projection and monitoring pipeline implemented for three observation modalities: surface variables with SYNOP weather stations, vertical profiles with TEMP radiosoundings, and satellite products with AMSU-A. OBSALL consumes climate model states in the form of a stream of Generic State Vectors, projects the model states into the observation space spanned by the activated observation modalities, stores the projected states in Observation DataBase files, and intermittently produces a suite of monitoring plots comparing the climate simulation to the climatology of observations. All modalities share the same general workflow structure, facilitating the implementation of new modalities or features. Finally, OBSALL has been designed to allow deployment on a range of environments, from personal computers to HPC infrastructures.
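
In the spirit of the SYNOP modality, a minimal observation-operator sketch that bilinearly interpolates a gridded 2-m temperature state to station coordinates (grid, field, and stations are synthetic placeholders):

    # Sketch: project a gridded model state into station observation space.
    import numpy as np
    from scipy.interpolate import RegularGridInterpolator

    lats = np.linspace(35.0, 70.0, 141)
    lons = np.linspace(-10.0, 40.0, 201)
    t2m = 15.0 + 10.0 * np.random.default_rng(6).random((141, 201))

    h = RegularGridInterpolator((lats, lons), t2m)     # the "observation model"

    stations = np.array([[60.2, 24.9],                 # Helsinki
                         [48.2, 16.4]])                # Vienna
    print(h(stations))                                 # model state in obs space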

This functionality will be part of the EU’s Climate Adaptation Information Service and it is intended to benefit Users of the Service, although it also holds strong research appeal. From the User perspective, presenting the simulation in observation space is advantageous. Earth observations, especially the in-situ components, are intuitive and easy to interpret. Therefore, observation-space projections provide a concrete handle on adaptation information and enhance the User relevance of the DestinE Climate DT. These projections tend to lower the threshold for users to engage with the Information Service. Therefore, we see future development of user-oriented tools using the projection data as input as a promising strategy to attract new users. Extending the projection to other user-relevant observation types, such as long-term wind mast measurements, would be beneficial.

Here, we showcase the current capabilities of the climate model projection and monitoring software OBSALL from the viewpoints of runtime simulation monitoring through different observation types, and observation-based posterior validation, which offers a versatile way to validate process-level features in storm-resolving climate models.

How to cite: Bouvier, C., Tuppi, L., Räisänen, J., and Järvinen, H.: OBSALL: DestinE climate simulations in observation space, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-5404, https://doi.org/10.5194/egusphere-egu26-5404, 2026.

This contribution presents the system architecture, data pipelines, and modelling logic at a conceptual level for a Digital Twin prototype named JÄÄTwin. The aim of JÄÄTwin is to integrate near-real-time observations from multiple heterogeneous data sources with numerical models and computational infrastructure into a live “river ice condition” digital twin. While large-scale initiatives focus on continental and global domains, digital twin implementations at the river scale remain limited, specifically for cryo-hydrological systems. JÄÄTwin integrates in-situ monitoring data, remote sensing products, and meteorological forecast inputs through a modular backend that supports data ingestion, preprocessing, model orchestration, and visualization. The emphasis of this contribution is on system architecture, data integration logic, and operational workflow design.

The Kiiminkijoki River in northern Finland is used as a pilot to demonstrate how a river-scale digital twin can be implemented using existing monitoring infrastructure. The presentation discusses design choices, integration challenges, and transferability considerations relevant to future Earth system digital twin developments and to emerging European digital twin initiatives.

How to cite: Jalali Shahrood, A.: River-Scale Digital Twin for Cryo-Hydrological Systems: Architecture and Integration Principles from the JÄÄTwin Framework, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-6018, https://doi.org/10.5194/egusphere-egu26-6018, 2026.

EGU26-6871 | Posters on site | ESSI1.8

Simulating the human dimension in Destination Earth. An EO-Informed digital twin application for climate-adaptive policy planning 

Charalampos Paraskevas, Georgios Gousios, Theano Mamouka, Paraskevi Vourlioti, Dimitrios Kasampalis, Stylianos Kotsopoulos, and Claudia Vitolo

As Destination Earth (DestinE) matures, the capability to simulate not just natural phenomena but also the "related human activities" becomes critical for delivering actionable insights on sustainable development. This work presents TRANSITION, an operational Digital Twin application designed to model the complex socio-environmental dynamics of land-use change, renewable energy integration, and agricultural sustainability within the DestinE ecosystem.

While traditional Earth system digital twins excel at forecasting physical variables (e.g., crop yields or solar irradiance), they often lack the behavioral fidelity to predict how human actors will respond to these changes. TRANSITION bridges this gap by integrating Earth Observation (EO) data with a Multi-Level Agent-Based Modelling (ML-ABM) system driven by Reinforcement Learning (RL). In this framework, autonomous agents—representing farmers, landowners, and policymakers—make spatially explicit decisions based on environmental suitability, economic incentives, and social factors (PECS framework).
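
A toy sketch of a reinforcement-learning land-use agent of this kind, reduced to an epsilon-greedy bandit choosing between crops and photovoltaics (the payoffs are invented; the TRANSITION agents are far richer):

    # Sketch: an epsilon-greedy farmer agent learning action values from
    # stochastic returns that stand in for suitability and subsidies.
    import numpy as np

    rng = np.random.default_rng(8)
    ACTIONS = ["crops", "pv"]
    payoff = {"crops": (1.0, 0.4), "pv": (1.2, 0.2)}   # assumed mean, std
    Q, counts = np.zeros(2), np.zeros(2)

    for step in range(5000):
        a = rng.integers(2) if rng.random() < 0.1 else int(Q.argmax())
        mu, sd = payoff[ACTIONS[a]]
        r = rng.normal(mu, sd)                         # stochastic return
        counts[a] += 1
        Q[a] += (r - Q[a]) / counts[a]                 # incremental mean update

    print(dict(zip(ACTIONS, Q.round(2))))              # learned action values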

We demonstrate the application of this digital twin through three core stakeholder-co-designed use cases:

  • Climate Change Adaptation Strategies: Simulating long-term land-use shifts under various CMIP climate scenarios to identify regions at risk of agricultural abandonment or suitable for crop diversification.
  • Green Credit & Policy Simulation: Allowing policymakers to "stress-test" interventions—such as subsidies for photovoltaics (PV) or green credits—in a risk-free virtual environment to assess adoption rates and potential conflicts between food and energy production.
  • Renewable Energy Optimization: Utilizing Sentinel-derived analytics and high-resolution Digital Elevation Models (DEMs) to identify optimal deployment zones for renewable infrastructure while accounting for socio-economic acceptance.

To ensure these insights are scalable and actionable, the application is architected to eventually run on DestinE’s Platform, with the potential to utilize High-Performance Computing (HPC) for heavy agent training and the DestinE Data Lake for seamless access to Sentinel and ERA5 datasets. By coupling high-precision physical modeling with realistic human behavior, TRANSITION offers a robust decision-support tool for evidence-based policymaking, directly contributing to the European Green Deal’s vision of a resilient and adaptive society.

How to cite: Paraskevas, C., Gousios, G., Mamouka, T., Vourlioti, P., Kasampalis, D., Kotsopoulos, S., and Vitolo, C.: Simulating the human dimension in Destination Earth. An EO-Informed digital twin application for climate-adaptive policy planning, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-6871, https://doi.org/10.5194/egusphere-egu26-6871, 2026.

EGU26-8395 | Orals | ESSI1.8

Lessons learned from the pilot phase of the Digital Twin Component for Glaciers project (DTC Glaciers) 

Fabien Maussion, Julia Bizon, Nicolas Gampierakis, Noel Gourmelen, Livia Jakob, Carolyn Michael, Thomas Nagler, Samuel Nussbaumer, Patrick Schmitt, Gabriele Schwaizer, and Michael Zemp

Mountain glaciers are critical elements of the Earth’s hydrological and climate systems. The rapid changes in glaciers due to climate change hinder our ability to monitor and address associated risks effectively. To address these challenges, we present the Digital Twin Component for Glaciers (DTC Glaciers), part of ESA’s Digital Twin Earth (DTE) programme. In this presentation, we will demonstrate the early prototype of the DTC Glaciers system, developed through close co-design with our stakeholders in the hydropower and water sectors. The demonstration will highlight current capabilities, including regional glacier mass-balance assessment, runoff estimation, and user-informed scenarios. We will also share key lessons learned from Phase 1, focussing on data assimilation of heterogeneous datasets, the practicalities of building adaptive architectures, and the challenges of meeting diverse user needs within a unified framework. Despite these challenges, DTC Glaciers offers a well-defined test case to assess the transformative potential of digital twins in climate risk assessments.

How to cite: Maussion, F., Bizon, J., Gampierakis, N., Gourmelen, N., Jakob, L., Michael, C., Nagler, T., Nussbaumer, S., Schmitt, P., Schwaizer, G., and Zemp, M.: Lessons learned from the pilot phase of the Digital Twin Component for Glaciers project (DTC Glaciers), EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-8395, https://doi.org/10.5194/egusphere-egu26-8395, 2026.

EGU26-10081 | Orals | ESSI1.8

AI4Clouds: Enhancing Short-Term Cloud Fields in DestinE for the Solar-Energy Sector 

Fernando Iglesias-Suarez, Markel García, Ignacio Heredia, Antonio Perez, Sergio Portilla, Judith Sáinz-Pardo, and Daniel San-Martin Segura

The AI4Clouds project develops deep-learning-enhanced short-term (up to 12 h) cloud fields for the Destination Earth (DestinE) Weather-Induced Extremes Digital Twin (Extremes DT). By fusing high-resolution Extremes DT simulations with EUMETSAT satellite observations (SEVIRI / FCI), the system learns to correct systematic model biases and provide enhanced cloud-related fields, such as cloud cover, optical depth, and cloud-top height, which are key variables for renewable-energy and weather applications.

AI4Clouds follows a multi-stage training strategy: it first pre-trains on ERA5 reanalysis data collocated with satellite datasets to capture large-scale dynamics, then fine-tunes on Extremes DT forecasts, also collocated with satellite datasets. It employs stretched-grid Graph Neural Network–Transformer architectures implemented within ECMWF’s Anemoi framework. Probabilistic forecasts are produced via an ensemble approach that quantifies aleatory and epistemic uncertainty. All data retrieval, preprocessing, training, and serving workflows are deployed on the DestinE Data Lake using its HDA, ISLET, and Stack services, ensuring reproducibility and operational integration through MLOps pipelines.

Validation relies on the open-source AQUA framework, extended with cloud-forecast and deep-learning diagnostics (e.g., RMSE, bias, CRPS, spectral metrics). An industrial partner from the solar-energy sector provides user-driven evaluation across several Iberian sites.
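
For reference, the ensemble CRPS named among these diagnostics can be sketched with the standard estimator on synthetic members (this is not AQUA's implementation): CRPS = mean_i |x_i - y| - (1 / (2 m^2)) * sum_ij |x_i - x_j|.

    # Sketch: ensemble CRPS for a single forecast-observation pair.
    import numpy as np

    def ensemble_crps(ens, y):
        ens = np.asarray(ens, dtype=float)
        spread = np.abs(ens[:, None] - ens[None, :]).mean()
        return np.abs(ens - y).mean() - 0.5 * spread

    rng = np.random.default_rng(9)
    members = rng.normal(0.55, 0.1, size=50)   # forecast cloud-cover fractions
    print(f"CRPS = {ensemble_crps(members, 0.62):.4f}")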

By integrating Earth-observation data, high-resolution numerical forecasts, and deep learning within DestinE’s infrastructure—a cloud-native environment—AI4Clouds demonstrates a scalable path toward building actionable applications for weather-sensitive sectors.

How to cite: Iglesias-Suarez, F., García, M., Heredia, I., Perez, A., Portilla, S., Sáinz-Pardo, J., and San-Martin Segura, D.: AI4Clouds: Enhancing Short-Term Cloud Fields in DestinE for the Solar-Energy Sector, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-10081, https://doi.org/10.5194/egusphere-egu26-10081, 2026.

EGU26-13491 | ECS | Posters on site | ESSI1.8

Climate and Environmental Digital Twins for Human Health: Leveraging Earth Observation for Compound Climate and Air Quality Extremes Early Warning  

André Villa de Brito, Ana Oliveira, Bruno Marques, Caio Fonteles, Élio Pereira, Fabíola Silva, Inês Girão, Luís Figueiredo, Marcelo Lima, Rita Cunha, Maria Oliveira, Paulo Nogueira, Bruno Santos, Rosa Trancoso, and Vital Teresa

Climate resilience is a defining challenge of the 21st century, yet public health authorities continue to face difficulties in operationalising state-of-the-art geospatial and environmental science. In Portugal, as in Europe more broadly, extreme temperatures have already increased in frequency and severity, contributing to substantial excess mortality and morbidity. These impacts are often amplified by the simultaneous degradation of air quality. However, evidence has largely been event-specific, fragmented across case studies of individual heatwaves, cold waves, or air-quality episodes, limiting our ability to implement early-warning systems. The ESA-funded AIR4health project, developed under the Early Digital Twin Components initiative, addresses these gaps by designing innovative algorithms to predict human mortality and morbidity during compound extreme events. The project develops two Machine Learning (ML)–based AIR4health Risk Algorithms focusing on (1) Heat & Ozone and (2) Cold & Nitrogen Dioxide, using a two-decades-long, high-resolution healthcare database for mainland Portugal. These indicators will integrate EO data, in-situ air-quality records from the EEA, and CAMS/C3S model outputs. Satellite and model data are dynamically downscaled using approaches previously demonstrated for air-temperature modelling in Lisbon, enabling daily, spatially detailed (municipal-level) time series of compound extreme events. AIR4health advances beyond current country-level systems by implementing fully spatiotemporal exposure–response modelling. Its dynamic and continuous framework will deliver a prototype DTC capable of providing fine-scale early warning for combined climate and air-quality extremes. By benchmarking results against European-level datasets, AIR4health will support scalable pathways towards relevant practices in planetary health and climate-preparedness, while contributing to the broader European Digital Twin ecosystem. 

How to cite: Villa de Brito, A., Oliveira, A., Marques, B., Fonteles, C., Pereira, É., Silva, F., Girão, I., Figueiredo, L., Lima, M., Cunha, R., Oliveira, M., Nogueira, P., Santos, B., Trancoso, R., and Teresa, V.: Climate and Environmental Digital Twins for Human Health: Leveraging Earth Observation for Compound Climate and Air Quality Extremes Early Warning , EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-13491, https://doi.org/10.5194/egusphere-egu26-13491, 2026.

EGU26-13552 | ECS | Orals | ESSI1.8

Frost event detection and impact modelling within the Destination Earth On-Demand Extremes framework     

Michael Matějka, Lenka Hájková, Martin Možný, Adéla Musilová, and Vojtěch Vlach

Spring-time frost events pose a significant risk for the agricultural sector. Vineyards, apricots and other crops might be damaged by freezing temperatures after the onset of the vegetation season. Frost events commonly occur during cold outbreaks in April or May, after prolonged periods of relatively high air temperatures in late winter or early spring. The damage can be significantly reduced by suitable measures, provided a reliable forecast is available. As frost intensity is often highly spatially variable, its forecast can benefit from hectometric-scale numerical atmospheric modelling. We present results related to frost event detection and impact modelling within the Destination Earth On-Demand Extremes (DEODE) project. The detection scheme uses the ECMWF ensemble forecasts of 2-m air temperature and cloud cover to identify regions of potential high-impact frost events. The Global Digital Twin 2-m air temperature sums since the start of the year are used to delimit areas where vegetation is not yet active. These areas are masked from the risk assessment. Finally, the frost damage risk is expressed on a 0–5 scale and may serve as guidance for hectometric-scale simulation domains. After a hectometric simulation is completed, the output is ingested into a downstream frost impact model. The impact model estimates the current development phase of several crops and the corresponding temperature thresholds for frost damage. These thresholds are compared with the hectometric-scale forecasted temperatures to obtain the magnitude by which the critical temperature is exceeded for each crop. The impact model has been evaluated during several pilot frost events in the Czech Republic and Spain.
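
The downstream logic can be sketched as follows, with invented temperature sums, onset criterion, and damage thresholds standing in for the model's calibrated values:

    # Sketch: accumulate temperature sums to mask inactive vegetation, then
    # compare forecast minima with phenology-dependent damage thresholds.
    import numpy as np

    rng = np.random.default_rng(10)
    t2m_daily = rng.normal(6.0, 5.0, size=120)        # Jan 1 onward, one cell

    gdd = np.clip(t2m_daily - 5.0, 0.0, None).cumsum()  # base 5 degC sums
    vegetation_active = gdd[-1] > 150.0               # assumed onset criterion

    frost_threshold = {"vineyard": -1.5, "apricot": -2.5}  # degC, assumed stage
    forecast_tmin = -3.1

    if vegetation_active:                             # else the cell is masked
        for crop, thr in frost_threshold.items():
            excess = thr - forecast_tmin              # positive = damage expected
            print(f"{crop}: exceedance of critical temperature {excess:+.1f} K")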

How to cite: Matějka, M., Hájková, L., Možný, M., Musilová, A., and Vlach, V.: Frost event detection and impact modelling within the Destination Earth On-Demand Extremes framework    , EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-13552, https://doi.org/10.5194/egusphere-egu26-13552, 2026.

EGU26-14321 | Orals | ESSI1.8

Operational Sea Ice Information as a Digital Twin 

David Arthurs, Lasse Rabenstein, Tyna Dolezalova, Thomas Puestow, Till Rasmussen, and Anton Korosov

Operating in the polar regions is growing increasingly important due to expanding research needs, emerging economic opportunities and efforts to protect national sovereignty.

Ships operating in the Arctic and Antarctic face heightened risks and more severe consequences in the event of an accident due to sea ice, icebergs, harsh sea states, low visibility and extreme temperatures. These hazards are compounded by sparse infrastructure and remoteness, with immediate assistance not readily available during emergencies.

While climate change is quickly reducing the amount of sea ice, this does not necessarily translate to a reduction in risk in the short term. On the contrary, uncertainty due to changing sea ice conditions, and an increase in icebergs due to melting glaciers, can increase risk.

The operational sea ice community provides information to lessen risk to life, property and the environment and to improve operational efficiency. That information is used by ships to avoid or navigate through sea ice, by ship operators in planning polar voyages, and by policy makers to assess the impact of climate change on future decisions regarding polar operations. Beyond safety, these services also increase operational efficiency by enabling optimized routing, reduced fuel consumption, and shorter transit times.

The information provided by the operational sea ice community comes from in-situ measurements, satellite earth observation data, and sophisticated models of the atmosphere, oceans, sea ice, and icebergs. These data streams have historically been assembled and interpreted by highly trained human analysts. Given the rapid increase in the amount of data that needs to be analyzed and the constraints on workforces due to government fiscal reductions, their work is increasingly being assisted by artificial intelligence. Furthermore, because sea ice is highly dynamic and can change within a matter of hours, automation and AI are indispensable for providing real-time and forecast information.

The operational sea ice workflows have many of the attributes of an Earth System Digital Twin:

  • Utilization of detailed digital models of relevant earth systems (atmosphere, ocean, sea ice),
  • Continuous incorporation of near-real-time data from in-situ sensors and earth observation satellites,
  • Use of artificial intelligence and machine learning,
  • Generation of predictive models and forecasts, and
  • Provision of advanced analytics and decision support tools to allow end-users to optimize their choices.

In fact, much of the information both used and generated by the operational sea ice community is available in Destination Earth, the initiative of the European Commission to develop a digital twin of the Earth.

This presentation will examine the provision of operational sea ice information as an example of the application of digital twins to provide actionable insights on climate adaptation and disaster risk reduction. It will present current capabilities, AI-enabled workflows, and lessons learned from operational implementation, with a focus on supporting safe and sustainable polar shipping.

How to cite: Arthurs, D., Rabenstein, L., Dolezalova, T., Puestow, T., Rasmussen, T., and Korosov, A.: Operational Sea Ice Information as a Digital Twin, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-14321, https://doi.org/10.5194/egusphere-egu26-14321, 2026.

EGU26-17401 | Posters on site | ESSI1.8

DestinE Platform: Transforming Earth System Science into Actionable Services 

Martina Lo Iacono, Calogera Tona, Matteo Cortese, and Barbara Scarda

Destination Earth (DestinE) is Europe’s flagship initiative for developing high-precision digital twins of the Earth system, enabling the simulation, monitoring, and prediction of natural phenomena and human activities. Coordinated by ECMWF, ESA, and EUMETSAT, DestinE brings together advanced Earth system models, comprehensive Earth observation data, Artificial Intelligence, and Europe’s leading supercomputing infrastructure to deliver actionable, decision-ready information for a wide range of users. Its overarching goal is to support climate adaptation, disaster risk reduction, and sustainable development by translating complex scientific outputs into practical tools. 

DestinE is built around modular Digital Twin components representing the atmosphere, oceans, land, and human activities. These components provide high-resolution simulations, near-real-time monitoring, and predictive analytics that can be combined and tailored to specific user needs. AI-driven methods enhance forecast skill, detect emerging risks, and reveal cascading impacts across environmental and socio-economic systems, enabling users to better anticipate and manage complex challenges. 

A core pillar of the DestinE Platform is its onboarding process, designed to ensure accessibility, usability, and long-term engagement. Onboarding supports users from first contact through to operational use, offering guided access to the platform, clear documentation, asynchronous support channels and videos to provide a flexible introduction to the onboarding integration process. User needs and levels of expertise are assessed early in the process, allowing stakeholders to be directed toward the most relevant Digital Twin components, data services, and interfaces. This structured approach enables users to progressively build capacity, moving from exploration and testing to confident use of DestinE services in real-world decision-making contexts. 

The DestinE Platform provides a suite of integrated, user-oriented services, including: 

  • Real-time environmental monitoring, delivering continuous updates to support situational awareness and rapid hazard assessment. 
  • Scenario-based simulation services, allowing users to explore short- and long-term developments such as extreme weather events, floods, climate adaptation pathways, or the impacts of human activities. 
  • Predictive analytics and early warning tools, using AI to anticipate risks and support timely, informed responses. 
  • Interactive decision-support interfaces, enabling users to explore data, customize simulations, and test policy or management options in a virtual environment. 
  • Co-design and customisation services, closely linked to onboarding, which allow users to adapt DestinE capabilities to regional, sectoral, or organizational contexts, supported by ongoing expert guidance. 

DestinE builds on and connects with complementary European initiatives, including ESA’s Digital Twin Earth programme, Horizon Europe projects, and national efforts, ensuring scientific robustness and operational relevance. 

This abstract invites contributions that showcase user experiences with DestinE, including onboarding pathways, co-design approaches, and practical applications such as emergency planning, urban resilience, or marine and hydrological management. Emphasis is placed on how user engagement and onboarding enable the effective translation of advanced Earth system science into actionable insights. 

By focusing on users, services, and onboarding, DestinE demonstrates how digital twins can empower stakeholders, support evidence-based decisions, and strengthen societal resilience in the face of environmental and climatic challenges. 

How to cite: Lo Iacono, M., Tona, C., Cortese, M., and Scarda, B.: DestinE Platform: Transforming Earth System Science into Actionable Services, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-17401, https://doi.org/10.5194/egusphere-egu26-17401, 2026.

EGU26-17698 | Orals | ESSI1.8

Advancing the EO-based Geohazard Digital Twin from Prototype to Pre-operational Platform 

Gaetana Ganci and Salvatore Stramondo and the GET-IT team

The GET-it project (Geohazards Early Digital Twin Component) is devoted to building a Digital Twin for geohazards in the framework of the ESA Early Digital Twin Component (DTC) programme. The Geohazard Digital Twin is transitioning from a demonstration prototype to a shareable, pre-operational Earth Observation (EO)-based service platform. This transition has been initiated through the integration of the GET-it scenario modules within the Geohazards TEP (https://geohazards-tep.eu), a long-standing operational platform developed under ESA, and will continue through their future integration and federation with the Destination Earth (DestinE) framework. The project explores how EO data can effectively drive DTCs for volcanic and seismic geohazards within the Destination Earth framework.

The Geohazard DTC developed in GET-it is designed as a modular and customizable environment, capable of integrating multi-sensor EO data, primarily from the Copernicus programme, with established physical models and EO-driven analysis tools. The adoption of standardized input and output formats ensures interoperability and facilitates the uptake of DTC products by diverse user communities with different operational needs. These characteristics enable repeatable, timely EO-driven simulations and facilitate the integration of the Geohazard DTC into downstream pre-operational workflows.

The current prototype includes several scenario modules operating at increasing levels of EO data exploitation: GEOMOD, for modelling EO-derived geodetic signals; FALL3D, for volcanic ash and SO₂ dispersion constrained by EO observations; GPUFLOW, for lava flow modelling based on EO-derived effusion rates and topography; and DAMSAT, for EO-based change and damage detection. Together, these modules act as building blocks for pre-operational services, empowering stakeholders to explore realistic emergency scenarios and assess potential mitigation and adaptation strategies. 

A central aspect of GET-it is the systematic integration of stakeholder requirements into the DTC design. A structured engagement process, based on questionnaires and direct interactions, involved a broad range of public and private stakeholders, including civil protection authorities, aviation and transport stakeholders, infrastructure managers, insurance companies, energy providers, and decision-makers. All the scenario models were considered relevant, with FALL3D emerging as the most requested service. The collected feedback directly informed the definition of the scenario modules and the organization of a dedicated Demonstration Day, focused on the validation of EO-driven what-if scenarios.

The Geohazard DTC has been demonstrated through representative multi-hazard use cases, including the 2018 Mount Etna eruption, the 2021 La Palma eruption, and the 2016 Central Italy earthquake sequence. Based on these activities, a medium- to long-term roadmap is being defined, focusing on enhanced EO-driven simulations, advanced decision-support tools, interoperable visualization interfaces, scalable services, and extension to additional geohazards, in alignment with other DTC initiatives and community-building actions. This roadmap aims to consolidate the Geohazard DTC as a sustainable pre-operational platform, ready for future operational uptake.

How to cite: Ganci, G. and Stramondo, S. and the GET-IT team: Advancing the EO-based Geohazard Digital Twin from Prototype to Pre-operational Platform, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-17698, https://doi.org/10.5194/egusphere-egu26-17698, 2026.

EGU26-18330 | Posters on site | ESSI1.8

AIDE: Trusted Deep-Learning Super-Resolution for MTG FCI within Destination Earth 

Adrian Fessel and Daro Krummrich

Meteorological imagery occupies a special role in Earth observation: its high temporal frequency and broad spectral coverage are indispensable for weather forecasting and climate modelling, while its spatial resolution remains limited. Technological advances, however, are driving a trend toward higher spatial detail and open new application domains beyond traditional meteorology.

The Flexible Combined Imager (FCI) aboard Meteosat-12 represents the latest generation of geostationary weather sensors and images the entire Earth disk at 10-minute intervals. Its 16 spectral channels span the visible to longwave infrared and offer native spatial resolutions ranging from 2000 m down to 500 m — a configuration well suited to super-resolution techniques.

The AIDE project develops a methodology to increase the spatial resolution of all FCI channels to up to 500 m while preserving the radiometric integrity of the data. The approach employs a purpose-built deep-learning model that is augmented with an estimate of its own predictive uncertainty, thereby enabling safe downstream use of the enhanced products in demanding applications and quantitative analyses. The method is implemented as a demonstrator within the Destination Earth (DestinE) Data Lake, demonstrating that appropriately designed machine-learning approaches can be deployed reliably in critical operational contexts.
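
As an illustration of pairing enhancement with a self-estimate of uncertainty, the following minimal Python sketch (assuming PyTorch) predicts both a super-resolved field and a per-pixel log-variance trained with a heteroscedastic Gaussian loss. The class name, layer sizes, and loss are placeholders, not the AIDE architecture.

    import torch
    import torch.nn as nn

    class SRWithUncertainty(nn.Module):
        """Toy super-resolution head that also predicts per-pixel uncertainty."""
        def __init__(self, channels=16, scale=4):
            super().__init__()
            self.body = nn.Sequential(
                nn.Conv2d(channels, 64, 3, padding=1), nn.ReLU(),
                nn.Upsample(scale_factor=scale, mode="bilinear"),
                nn.Conv2d(64, 64, 3, padding=1), nn.ReLU(),
            )
            self.mu = nn.Conv2d(64, channels, 3, padding=1)      # enhanced radiances
            self.logvar = nn.Conv2d(64, channels, 3, padding=1)  # predictive log-variance

        def forward(self, x):
            h = self.body(x)
            return self.mu(h), self.logvar(h)

    def gaussian_nll(mu, logvar, y):
        # Heteroscedastic loss: large errors are tolerated only where the model
        # also predicts large variance, which keeps the uncertainty honest.
        return (0.5 * (logvar + (y - mu) ** 2 / logvar.exp())).mean()

    model = SRWithUncertainty()
    x = torch.randn(1, 16, 32, 32)   # 16 coarse FCI-like channels
    mu, logvar = model(x)            # 4x upsampled field plus its uncertainty

The predicted variance is what enables "safe downstream use": quantitative applications can mask or down-weight pixels where the model declares itself unsure.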

This contribution presents the concept and development of the method, summarizes promising project results, and discusses the limitations and potential of super-resolution approaches in Earth observation.

How to cite: Fessel, A. and Krummrich, D.: AIDE: Trusted Deep-Learning Super-Resolution for MTG FCI within Destination Earth, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-18330, https://doi.org/10.5194/egusphere-egu26-18330, 2026.

EGU26-19009 | Orals | ESSI1.8

Urban Heat Health Forecasting with Destination Earth: Leveraging Digital Twins and Machine Learning for Scalable Urban Climate Services 

Inês Girão, João Paixão, Maria Castro, Vítor Miranda, Fabíola Souza Silva, Stephan Siemen, Claudia Di Napoli, and Ana Patrícia Oliveira

Extreme heat is increasingly recognized as one of the most severe climate-related risks affecting urban populations, with disproportionate impacts on public health, energy systems, and vulnerable communities. As heatwaves intensify under climate change, cities require near-real-time, high-resolution, and actionable information to support early warning systems, preparedness, and long-term adaptation. Addressing this challenge at the urban-local scale demands not only methodological innovation, but also robust digital infrastructures capable of delivering consistent and interoperable climate intelligence across regions.

Destination Earth (DestinE), a strategic initiative of the European Union, represents a transformative step in this direction by providing global, high-resolution climate and weather simulations, through Digital Twins of the Earth System. By coupling advanced numerical models, Earth Observation (EO) data, and high-performance computing, DestinE establishes a common backbone for next-generation climate services. However, translating these powerful datasets into locally relevant, operational products for cities remains a critical challenge.

The DE_395 Urban Heat Health Forecasting (UHHF) project addresses this gap by demonstrating how DestinE Extremes Digital Twin outputs can be transformed into urban-scale, user-oriented heat-health indicators through the operational use of Machine Learning (ML). The project applies ML-based downscaling techniques to near-surface air temperature (T2m) and relative humidity (RH) forecasts, enhancing spatial resolution from kilometre scale to approximately 200 m. These downscaled fields are subsequently used to derive human-biometeorological indicators such as the Universal Thermal Climate Index (UTCI) and Thermal Stress Duration (TSD), supporting health-oriented risk assessment.
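
As a schematic illustration of this chain (not the UHHF code: the predictors, coefficients, and the 32 °C threshold below are assumptions), a regression model trained against station observations can map coarse forecasts plus static urban predictors to fine-scale temperature, from which a TSD-style indicator follows directly:

    import numpy as np
    from sklearn.ensemble import GradientBoostingRegressor

    rng = np.random.default_rng(0)
    n = 5000
    X = np.column_stack([
        rng.normal(28, 3, n),    # coarse T2m forecast (degC)
        rng.uniform(0, 1, n),    # impervious fraction
        rng.uniform(0, 1, n),    # vegetation fraction (e.g. NDVI)
        rng.uniform(0, 300, n),  # elevation (m)
    ])
    # Synthetic "station" target standing in for quality-controlled observations.
    y = X[:, 0] + 2.0 * X[:, 1] - 1.5 * X[:, 2] - 0.006 * X[:, 3] + rng.normal(0, 0.5, n)

    model = GradientBoostingRegressor().fit(X, y)
    t2m_fine = model.predict(X)   # in practice evaluated on the ~200 m grid

    # Thermal-stress duration: hours above an assumed heat-stress threshold.
    tsd_hours = int((t2m_fine > 32.0).sum())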

The UHHF framework integrates DestinE atmospheric drivers with EO-derived and geospatial predictors describing urban form, land cover, vegetation, and topography, including Local Climate Zones. Quality-controlled crowdsourced observations from citizen weather stations are combined with WMO reference data to constrain and validate the ML models, ensuring robustness under both average and extreme conditions. The approach is being implemented across four climatically and socio-environmentally diverse Functional Urban Areas, namely Naples, Chicago, Santiago, and Cape Town, enabling a systematic evaluation of the models across continents.

By building directly on DestinE and complementary European programmes led by ECMWF, ESA, and Copernicus, drawing on both their data assets and operational services, UHHF aims to illustrate how these can be leveraged to develop affordable, scalable, and reproducible urban-scale climate information and services. The project highlights the strategic importance of climate data platforms in bridging the gap between global simulations and local decision-making, contributing to the development of interoperable urban climate and health services aligned with European and international resilience frameworks.

How to cite: Girão, I., Paixão, J., Castro, M., Miranda, V., Souza Silva, F., Siemen, S., Di Napoli, C., and Oliveira, A. P.: Urban Heat Health Forecasting with Destination Earth: Leveraging Digital Twins and Machine Learning for Scalable Urban Climate Services, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-19009, https://doi.org/10.5194/egusphere-egu26-19009, 2026.

EGU26-19040 | Orals | ESSI1.8

Pangeo-Fish and the Global Fish Tracking System: Scaling Biologging Analytics with Earth System Digital Twins for evidence-based policy support 

Tina Erica Odaka, Etienne Cap, Quentin Mazouni, Corentin Hue, Jean-Marc Delouis, Mathieu Woillez, Anne Fouilloux, Benjamin Ragan-Kelley, and Daniel Wiesmann

The Global Fish Tracking System (GFTS) and Pangeo-Fish integrate biologging data with high-resolution environmental data in a digital-twin framework to address key challenges in marine conservation and fisheries management. Linking fish movement models with climate projections from Europe’s Destination Earth (DestinE) Climate Change Adaptation digital twin yields an evidence-based tool for decision support in habitat conservation and fisheries management under climate change. The implementation is built on the open-source Pangeo ecosystem and deployed on the DestinE platform.

Pangeo-Fish is an open-source package that ingests multiple biologging data types—including archival tags, pop-up satellite archival tags (PSATs), and acoustic telemetry detections. It processes time series observed by fish (e.g., depth, temperature, and light) together with geolocation constraints derived from external sources such as acoustic receiver networks and tag-based positioning. These heterogeneous observations are harmonised in a cloud-native workflow to support scalable track reconstruction and downstream habitat-relevant products.

Tracks are reconstructed using a Hidden Markov Model (HMM) geolocation approach that combines tag-recorded time series (e.g., depth, temperature, and light) with external geolocation constraints (e.g., acoustic detections), together with priors such as bathymetry and release/recapture information. Processing leverages cloud-native tools (Jupyter, Dask, Xarray) and chunked cloud-optimised storage (Zarr) for scalable analysis. A key design choice is the use of HEALPix as the base spatial grid and indexing scheme from ingestion to visualisation, enabling efficient path-likelihood evaluation on an equal-area, iso-latitude grid while avoiding distortive resampling. Environmental reference fields are primarily sourced from the Copernicus Marine Service, but the workflow can also ingest user-defined datasets. In addition, in situ observations (e.g., Argo float temperature) can be incorporated to represent uncertainty in the ocean physics fields used by the geolocation model and better account for model–data discrepancies.
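
The following toy forward filter illustrates the HMM logic on a HEALPix grid, using healpy for neighbour lookup. The emission model, diffusion weights, and observations are placeholder assumptions, not the Pangeo-Fish implementation:

    import numpy as np
    import healpy as hp

    nside = 16
    npix = hp.nside2npix(nside)
    theta, _ = hp.pix2ang(nside, np.arange(npix))
    ref_temp = 15 + 10 * np.cos(theta)          # toy reference temperature field
    tag_temp = np.array([18.0, 18.4, 19.1])     # daily tag-recorded observations

    p = np.full(npix, 1.0 / npix)               # prior over the release area
    neigh = hp.get_all_neighbours(nside, np.arange(npix))  # shape (8, npix)

    for obs in tag_temp:
        like = np.exp(-0.5 * ((ref_temp - obs) / 1.0) ** 2)   # emission model
        p = p * like
        # Diffusive transition: mix each pixel with its valid HEALPix neighbours.
        spread = np.where(neigh >= 0, p[neigh], 0.0).sum(axis=0)
        counts = (neigh >= 0).sum(axis=0)
        p = 0.5 * p + 0.5 * spread / np.maximum(counts, 1)
        p /= p.sum()
    # p now approximates a daily presence-probability map on the equal-area grid.

Because HEALPix cells are equal-area and iso-latitude, the neighbour-based transition step needs no reprojection, which is the distortion-avoidance argument made above.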

The Pangeo-Fish workflow yields most-probable tracks and daily presence-probability maps, while GFTS supports aggregation of distributions for management-relevant analyses. GFTS intersects these outputs with climate projections from the DestinE Climate Change Adaptation Digital Twin to assess potential habitat exposure under climate change. GFTS has so far been demonstrated for Atlantic applications, while Pangeo-Fish has been extended to the Pacific Ocean, enabled by portable cloud-native processing and the availability of cloud-accessible datasets. As an early DestinE platform use case, this work illustrates how Earth system digital twins can be operationalised for reproducible, scalable biologging analytics to inform marine conservation and sustainable fisheries management.

How to cite: Odaka, T. E., Cap, E., Mazouni, Q., Hue, C., Delouis, J.-M., Woillez, M., Fouilloux, A., Ragan-Kelley, B., and Wiesmann, D.: Pangeo-Fish and the Global Fish Tracking System: Scaling Biologging Analytics with Earth System Digital Twins for evidence-based policy support, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-19040, https://doi.org/10.5194/egusphere-egu26-19040, 2026.

EGU26-19782 | Orals | ESSI1.8

Integrating Flood-Induced Mobility Disruption Modelling into CityNexus as a New Advanced Application Service on Destination Earth 

Alessandra Feliciotti, Alejandra Lizama, Ludovico Lemma, Anika Ruess, Mattia Marconcini, Andreas Altenkirch, Josselin Stark, and Simone Fratini

Urban mobility and air quality are tightly coupled in cities: travel demand, network performance, and the spatial distribution of activities shape transport-related emissions and accessibility, and therefore underpin policy instruments such as low-emission zones, speed regulation, network reconfiguration, and land-use adjustments. Urban transport systems, however, operate within a broader set of constraints that extend beyond traffic and emissions management. Flooding and other water-related disruptions are a recurrent and increasingly relevant challenge for cities, with the potential to severely affect transport network functionality. These challenges are often addressed through separate analytical tools, reflecting isolated approaches to mobility, environmental quality, and climate resilience. This fragmentation limits the ability of decision-support systems to effectively evaluate interventions across interrelated domains, motivating the need for modular and extensible services capable of integrating multiple processes within a single analytical framework.

CityNexus Pro, an Advanced Application Service (AAS) onboarded on Destination Earth, addresses this need through an operational and modular architecture designed to support integrated, cross-cutting scenario analysis. The service builds on CityNexus, initially developed to support scenario-based assessment of mobility patterns, transport-related emissions, and air quality, and extends this baseline by incorporating an interoperable module for modelling mobility disruptions caused by flooding.

Within CityNexus Pro, urban mobility dynamics are represented by coupling data-driven origin–destination estimation with a traffic simulation engine, generating high-resolution spatio-temporal traffic flows. These flows are translated into emission estimates and linked to air quality models to quantitatively assess pollutant concentrations under alternative urban scenarios. Model assumptions, data sources, and parameterisations are explicitly documented to ensure transparency and reproducibility. Flood modelling outputs derived from hydrodynamic simulations based on the SFINCS model are mapped onto road network elements and incorporated into the mobility simulation chain, enabling dynamic modification of traffic conditions through configurable speed-reduction and road-closure thresholds.
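
As a minimal illustration of the configurable thresholds (the values below are assumptions, not CityNexus defaults), a simulated flood depth mapped onto a road link modifies its usable speed as follows:

    # Depth thresholds controlling how flooding modifies the road network.
    CLOSURE_DEPTH_M = 0.30    # above this water depth a link is closed
    SLOWDOWN_DEPTH_M = 0.10   # above this depth speed is reduced
    SPEED_FACTOR = 0.4        # fraction of free-flow speed under slowdown

    def adjust_link(free_flow_speed_kmh: float, flood_depth_m: float) -> float:
        """Return the usable speed on a road link given the simulated flood depth."""
        if flood_depth_m >= CLOSURE_DEPTH_M:
            return 0.0                               # road closure
        if flood_depth_m >= SLOWDOWN_DEPTH_M:
            return free_flow_speed_kmh * SPEED_FACTOR
        return free_flow_speed_kmh

    # Example: a 50 km/h street under 15 cm of water drops to 20 km/h.
    assert adjust_link(50.0, 0.15) == 20.0

The modified speeds then feed back into the traffic simulation engine, which is what makes flood hazards and mobility policies jointly evaluable.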

The service, which already supports a range of policy-relevant scenarios—including low-emission zone implementation, speed limit changes, partial or full road inaccessibility, land-use reallocations, and shifts in mobility demand—is further complemented by compound scenario analysis in which flood hazards and mobility-related policy interventions are evaluated jointly. Users can configure flood scenarios by adjusting parameters such as precipitation timing and duration, river and sea discharge, and the presence of flood defences. In addition, CityNexus Pro is designed to integrate forecast products from the DestinE Digital Twin for Weather-Induced Extremes, enabling the impact assessment of extreme rainfall and flooding on mobility patterns and accessibility. 

CityNexus Pro is operational in Copenhagen, Seville, Bologna, and Aarhus, and is currently being deployed in Bucharest and Vitoria-Gasteiz. The service demonstrates how modular urban analytics on Destination Earth can be incrementally enhanced to address compound climate risks and support non-siloed, policy-relevant scenario analysis for urban resilience.

How to cite: Feliciotti, A., Lizama, A., Lemma, L., Ruess, A., Marconcini, M., Altenkirch, A., Stark, J., and Fratini, S.: Integrating Flood-Induced Mobility Disruption Modelling into CityNexus as a New Advanced Application Service on Destination Earth, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-19782, https://doi.org/10.5194/egusphere-egu26-19782, 2026.

EGU26-22126 | Posters on site | ESSI1.8

Satellite-based monitoring solution for the potato industry 

Josef Pichler, David Kolitzus, and Peter Santbergen

The satellite-based monitoring solution operated by GEO4A B.V. and powered by GeoVille GmbH is commercially called HARVIC. This abbreviation stands for “Harvest in control”. HARVIC offers comprehensive in-season crop growth monitoring and evaluation at the field level, tailored specifically for potato crops in central Europe, with the potential for global expansion. This service provides a detailed, continuous assessment of potato growth dynamics throughout the growing season, using high-resolution satellite (Sentinel-1 & Sentinel-2) and meteorological data as input to the crop growth module.

The main benefit of the HARVIC service is an EO-data-supported transition of stakeholders in the potato industry towards digitalisation: critical decisions can be made based on objective and timely data. HARVIC therefore identifies and evaluates critical growth stages, including emergence, vegetative growth, maturity, and senescence, ensuring that users receive timely updates and insights into potatoes' unique development cycle. The service also provides yield forecasting and quality information for each growing stage to improve logistics and reduce costs as well as unnecessary greenhouse gas emissions.

The HARVIC service has been selected as a high-priority agricultural service for onboarding and implementation within the European Commission’s Destination Earth (DestinE) initiative. The basic service is published, and registered users of the DestinE community are eligible to take advantage of this unique solution for the potato industry. Furthermore, the consortium of GeoVille GmbH and GEO4A B.V. is about to include monitoring services to track and trace regenerative agricultural practices, such as cover cropping and tillage. In addition, the latest climate data (ECMWF) will be used to enhance HARVIC, potentially delineating future growing regions for potato cultivation in the coming decades.

How to cite: Pichler, J., Kolitzus, D., and Santbergen, P.: Satellite-based monitoring solution for the potato industry, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-22126, https://doi.org/10.5194/egusphere-egu26-22126, 2026.

EGU26-22287 | Orals | ESSI1.8

The Desert Locust Monitoring Service: a new Destination Earth service for Environmental Pest prediction 

Sabrina Outmani, Alessandro Grassi, Wassim Azami, Kimani Bellotto, and Maximilien Houël

Desert locusts are known as the world's most destructive migratory pest. A single swarm can contain 80 million locusts, travel up to 150 km daily, and consume the same amount of food as 35,000 people per day. Desert locusts cause major mid-to-long-term impacts on the economy, quality of life, and the environment. Climate change is amplifying the occurrence of such pests: the increase of extreme events such as cyclones creates ideal conditions for locust breeding.

The Desert Locust Monitoring Service (DLMS) is the new system for preventing upsurges of desert locusts across Eastern Africa, Southwest Asia, and northwest India. It leverages the power of satellite observations, model outputs, in-situ measurements and advanced AI techniques to monitor the locust threat, help mitigate crop damage, and safeguard essential food supplies.

Onboarded on the Destination Earth (DestinE) platform, the service consists of two layers: the Early-Stage Locust Appearance and the Locust Swarm Migration. The first layer relies on a customized Maxent model, a statistical approach widely used in species distribution modeling (SDM): it identifies areas with favorable environmental conditions for desert locust breeding. There are two available modes: the Safe Mode makes use of 50 days of climate data from ERA5-Land (soil water content, precipitation, and temperature) and the normalized difference vegetation index (NDVI) from Sentinel-3 to generate daily probability forecasts; the Experimental Mode extends forecasts to two days thanks to the advanced projections of DestinE's Weather Extremes Digital Twin, using skin temperature, total precipitation, and runoff as input variables.

The second layer forecasts adult locust swarm migration under biological and climatic conditions, accounting for their rapid and unpredictable long-distance movements. It takes as primary input the Early-Stage layer output, which provides initial location predictions under specific environmental conditions. Environmental variables, including the Leaf Area Index (LAI) from Copernicus, as well as wind velocity components and temperature from DestinE's Climate Change Adaptation Digital Twin, inform the model of conditions that may trigger migration events. Swarm behavior is then represented using a stochastic model, which simulates an environment-biased random movement on a 2D lattice, generating batches of diverse potential scenarios. In this framework, locusts move based on environmental cues, including climate conditions and the availability of resources such as vegetation. Finally, the model performs a statistical analysis across all generated scenarios to produce output maps estimating future locations of adult locusts and the size of their swarms.

Both the Maxent and stochastic models were trained using a presence-only dataset provided by FAO's Locust Watch. Prediction results have been validated with FAO data on desert locust activity and independent data provided by the International Centre of Insect Physiology and Ecology (ICIPE).
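
The stochastic migration layer can be illustrated with a minimal environment-biased random walk on a 2-D lattice; the grid size, vegetation field, and bias weights below are placeholders rather than the operational DLMS model:

    import numpy as np

    rng = np.random.default_rng(42)
    veg = rng.random((100, 100))                 # stand-in for an LAI/vegetation cue
    moves = [(-1, 0), (1, 0), (0, -1), (0, 1)]   # N, S, W, E lattice steps
    wind_bias = np.array([0.1, 0.4, 0.2, 0.3])   # assumed wind-driven preference

    def step(r, c):
        # Combine the wind preference with the attractiveness of neighbouring cells.
        weights = []
        for (dr, dc), w in zip(moves, wind_bias):
            rr, cc = (r + dr) % 100, (c + dc) % 100
            weights.append(w * (0.1 + veg[rr, cc]))
        weights = np.array(weights) / np.sum(weights)
        dr, dc = moves[rng.choice(4, p=weights)]
        return (r + dr) % 100, (c + dc) % 100

    # A batch of scenarios: many independent walkers, aggregated into a density map.
    density = np.zeros((100, 100))
    for _ in range(500):
        r, c = 50, 50                            # initial location from the first layer
        for _ in range(200):
            r, c = step(r, c)
        density[r, c] += 1

Aggregating many such walks is what yields the statistical maps of future swarm locations and sizes described above.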

Enabled by Destination Earth’s cutting-edge modeling and data infrastructure, the DLMS offers high-quality insights to anticipate risks, mitigate impacts and support the protection of crops, communities, and ecosystems in the most affected regions. Forecasting maps are available to all DestinE registered users, with some features reserved to users with upgraded access.

How to cite: Outmani, S., Grassi, A., Azami, W., Bellotto, K., and Houël, M.: The Desert Locust Monitoring Service: a new Destination Earth service for Environmental Pest prediction, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-22287, https://doi.org/10.5194/egusphere-egu26-22287, 2026.

EGU26-22325 | Orals | ESSI1.8

Using Earth Observation Informed Agent-Based Models to build a Scenario Planning Digital Twin for Local Energy Policies 

David J. Wagg, Nicolas Malleson, Alejandro Beltran, Matthew Tipuric, and Daniel Arribas-Bel

Policy makers at local, national, and international levels are increasingly being required to make decisions that mitigate the effects of climate change on society and the economy. Earth Observations (EO) are already an important source of data to support such decisions, but this data represents only one aspect of the broader socio-technical systems that decision makers seek to influence. Policy effectiveness depends not only on environmental conditions, but also on household behaviour, technology adoption decisions, economic constraints, and feedbacks across scales. Capturing these dynamics requires modelling approaches that explicitly represent human decision-making alongside EO-derived inputs. This paper will present the results from the development of a scenario-planning digital twin (SPDT) designed to support decision-making processes related to local energy policies. The new SPDT will demonstrate how EO datasets can be integrated with multilevel agent-based models (MABMs) to enable specific scenarios to be used to support policy decisions.

The use case in this work enables policy makers to (i) model residential heating demand, (ii) test policy levers that might best encourage the uptake of low-carbon heating, and (iii) assess the implications for energy use and fuel poverty. Specifically, the MABM simulates hourly residential energy demand for space heating at the household level, accounting for building characteristics, policy levers, occupancy patterns, retail energy prices, and external ambient temperature. The MABM supports baseline demand estimation at fine spatial granularity (individual households), the assessment of new technologies (such as retrofit measures or heating controls/meters) and energy price variations, and counterfactual analysis (e.g., setpoint shifts, tariff changes, warm/cold snaps). Earth observation data is used to inform medium to long-term climate trends for the use case region. The project results presented in this paper have been developed to serve the city of Newcastle upon Tyne in the UK. We discuss results for Newcastle developed so far, and possible new future scenarios that could be developed using this type of methodology.
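
A toy household agent conveys the demand logic; the heat-loss coefficient, setpoint, and occupancy pattern below are illustrative assumptions, not calibrated Newcastle values:

    from dataclasses import dataclass

    @dataclass
    class Household:
        heat_loss_w_per_k: float   # building-fabric characteristic
        setpoint_c: float          # thermostat setting (a policy lever)
        occupied: list             # 24 booleans, hourly occupancy pattern

        def hourly_demand_kwh(self, ambient_c: list) -> list:
            demand = []
            for hour, t_out in enumerate(ambient_c):
                if self.occupied[hour] and t_out < self.setpoint_c:
                    watts = self.heat_loss_w_per_k * (self.setpoint_c - t_out)
                    demand.append(watts / 1000.0)   # one-hour step -> kWh
                else:
                    demand.append(0.0)
            return demand

    # Counterfactual: a 1 degC setpoint reduction across the same cold day.
    home = Household(250.0, 20.0, [h >= 6 for h in range(24)])
    base = sum(home.hourly_demand_kwh([5.0] * 24))
    home.setpoint_c = 19.0
    saved = base - sum(home.hourly_demand_kwh([5.0] * 24))

Summing such agents over a city, with EO-informed ambient temperature trends as forcing, gives the baseline and counterfactual estimates the SPDT exposes to policy makers.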

How to cite: Wagg, D. J., Malleson, N., Beltran, A., Tipuric, M., and Arribas-Bel, D.: Using Earth Observation Informed Agent-Based Models to build a Scenario Planning Digital Twin for Local Energy Policies, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-22325, https://doi.org/10.5194/egusphere-egu26-22325, 2026.

EGU26-22710 | Posters on site | ESSI1.8

High-Resolution Thermal Mapping and Simulation Scenarios for Land Cover Intervention Planning 

Hugo Poupard, Franco Fernandez, Guillermo Gonzalez Fradejas, and Fabien Castel

The UrbanSquare service within Destination Earth aims to deliver an operational digital twin of cities, enabling urban planners to explore and assess environmental processes and intervention scenarios. One key component of this digital twin is the representation of the urban heat island (UHI) effect. UrbanSquare relies on land surface temperature (LST) observations derived from thermal satellite imagery, primarily Landsat data resampled and distributed at 30 m spatial resolution.

However, actionable urban digital twins require both finer spatial detail and the ability to simulate “what-if” scenarios driven by land-cover change. In particular, UHI mitigation planning calls for high-resolution thermal information (well below 30 m) and dynamic coupling between land-cover configurations and surface temperature responses.

We present a three-stage framework for generating scenario-ready LST maps at 5-meter resolution.

In Stage 1, a Random Forest model upscales Landsat LST from 30 m to 5 m using Sentinel-2 spectral bands and indices (B2, B3, B4, B8, B11, B12, NDVI, albedo), solar geometry variables, and ERA5 meteorological predictors. Sentinel-2 data are harmonized to 30 m for model training, then applied at super-resolution (5 m) for inference. Cross-validation assesses predictive performance in the absence of in situ measurements.
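
A compact sketch of the Stage 1 logic follows (synthetic data; the predictor list is abbreviated from the bands and indices above, and this is not the operational code):

    import numpy as np
    from sklearn.ensemble import RandomForestRegressor
    from sklearn.model_selection import cross_val_score

    rng = np.random.default_rng(1)
    # Rows = 30 m pixels; columns = e.g. [B4, B8, NDVI, albedo, solar elev., ERA5 T2m]
    X30 = rng.random((2000, 6))
    lst30 = 300 + 15 * X30[:, 2] - 10 * X30[:, 3] + rng.normal(0, 0.5, 2000)

    rf = RandomForestRegressor(n_estimators=200, random_state=0)
    # Cross-validation stands in for validation against in situ measurements.
    print(cross_val_score(rf, X30, lst30, cv=5, scoring="r2").mean())

    rf.fit(X30, lst30)
    X5 = rng.random((10000, 6))   # the same predictors sampled on the 5 m grid
    lst5 = rf.predict(X5)         # super-resolved LST field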

Stage 2 applies pixel-wise linear regression between the super-resolved LST time series and synchronous ERA5 air temperature across a summer period. This normalization removes temporal and meteorological variability and enables LST generation for user-defined air-temperature scenarios, ensuring consistent thermal comparisons.
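
Stage 2 reduces, per pixel, to fitting a line between the LST time series and ERA5 air temperature; a vectorised toy version (synthetic scenes, illustrative scenario value):

    import numpy as np

    rng = np.random.default_rng(2)
    t_air = np.array([22.0, 25.0, 28.0, 31.0, 34.0])   # ERA5 T2m at scene times
    lst = 2.0 + 1.1 * t_air[:, None, None] + rng.normal(0, 1, (5, 50, 50))

    flat = lst.reshape(t_air.size, -1)
    slope, intercept = np.polyfit(t_air, flat, deg=1)  # one linear fit per pixel

    scenario_t = 36.0                                  # user-defined "what-if" heatwave
    lst_scenario = (intercept + slope * scenario_t).reshape(50, 50)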

Stage 3 constructs a lookup table of thermal signatures for each land cover class. When users modify land cover, pixels are reassigned the corresponding thermal signature. A diffusion process accounts for lateral heat dispersion, producing delta-temperature maps, uncertainty layers, and decomposed contributions from different factors (vegetation, albedo increase, etc.).
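
Stage 3 can be pictured as a class-to-signature substitution followed by smoothing for lateral heat dispersion; in this toy version (assumed signatures and diffusion width) a tree patch is inserted into a built-up area:

    import numpy as np
    from scipy.ndimage import gaussian_filter

    signatures = {0: 310.0, 1: 301.0, 2: 296.0}   # built-up, grass, trees (K), assumed
    lc_before = np.zeros((50, 50), dtype=int)     # all built-up
    lc_after = lc_before.copy()
    lc_after[20:30, 20:30] = 2                    # the user plants a tree patch

    def lst_from_landcover(lc):
        lst = np.vectorize(signatures.get)(lc).astype(float)
        return gaussian_filter(lst, sigma=2.0)    # lateral heat dispersion

    delta_t = lst_from_landcover(lc_after) - lst_from_landcover(lc_before)
    # delta_t is the delta-temperature map presented to the urban planner.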

Soon to be integrated into the UrbanSquare digital twin, this framework enables exploration, comparison, and quantification of UHI mitigation strategies, supporting evidence-based urban planning and climate adaptation decisions.

How to cite: Poupard, H., Fernandez, F., Gonzalez Fradejas, G., and Castel, F.: High-Resolution Thermal Mapping and Simulation Scenarios for Land Cover Intervention Planning, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-22710, https://doi.org/10.5194/egusphere-egu26-22710, 2026.

EGU26-22715 * | Posters on site | ESSI1.8 | Highlight

CALIFE, a service to discover the quality of life in your neighbourhood 

Fabien Castel, Cyprien Lavigne, Reda Semlal, Matuta Cauneau, and Laure Pialot

CALIFE (Certification of LIfestyle and Environment) is an operational service designed to make Earth observation–derived environmental information directly accessible to citizens. The service delivers hyper-local “quality of life” reports for any address, synthesising complex Earth observation (EO), in-situ and model data into a clear, visual and actionable format. CALIFE indicators cover key dimensions of everyday living conditions, including air quality, climate and weather comfort, water and vegetation resources, biodiversity and access to green spaces, mobility and accessibility, and exposure to selected climate hazards. Results are presented through intuitive scores, maps and short recommendations, allowing users to understand how their local environment influences well-being.

A core ambition of CALIFE is to address the general public rather than professional or expert users. The service targets non-specialists with no background in EO, climate science or geospatial data. CALIFE explores how highly technical datasets—such as Copernicus services and Destination Earth digital twins—can be transformed into information that is meaningful at the scale of daily life. By lowering technical barriers and focusing on usability, CALIFE represents an attempt to put Earth observation “into the hands of citizens”, fostering environmental awareness, transparency and informed decision-making at neighbourhood level.

Beyond technical and scientific challenges, CALIFE also serves as a laboratory for experimenting with sustainable business models for citizen-oriented EO services. The long-term viability of such services remains a key open question. CALIFE initially explored a micro-payment model for individual users, but has since evolved towards a hybrid approach: free access for the general public with controlled usage (limited number of reports per user), combined with paid bulk report generation for public authorities and private actors requiring large-scale analyses. This contribution will discuss both the service design choices and the lessons learned regarding sustainability, scalability and citizen engagement when deploying EO-based services for non-expert audiences.

How to cite: Castel, F., Lavigne, C., Semlal, R., Cauneau, M., and Pialot, L.: CALIFE, a service to discover the quality of life in your neighbourhood, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-22715, https://doi.org/10.5194/egusphere-egu26-22715, 2026.

EGU26-1492 | Orals | ITS1.11/ESSI1.10

A multi-modal semi-supervised model for ocean sediment lithology 

John M. Aiken, Dunyu Liu, William Gilpin, and Thorsten Becker

Earth science data are typically highly heterogeneous, which leads to mixed-determined inverse problems and poses challenges for extracting process-level information. For example, ocean sediment cores from the International Ocean Discovery Program (IODP) contain hundreds of millions of measurements across multiple geophysical properties, but usable datasets are only 5-10% complete due to missing data. We present a semi-supervised variational autoencoder with masked encoding that simultaneously imputes missing measurements and predicts lithology, enabling more complete utilization of legacy IODP archives. We train a masked variational autoencoder on the LILY database (89 km of core, 34 million observations, 42 IODP missions) to learn joint distributions across bulk density, magnetic susceptibility, RGB reflectance, and natural gamma ray attenuation. The model uses selective masking during training to learn imputation strategies for missing modalities. Crucially, the learned latent representations are constrained to recover lithological labels from unseen cores without retraining. We demonstrate that the model captures the nonlinearities contained in the training data, reconstructs the test data (R²_avg = 0.86), and predicts lithology (AUC_avg = 0.9), while also providing descriptive embedding vectors (ARI = 0.2). Additionally, the underlying data contain strong non-linear relationships that are not captured on reconstruction by simpler models (e.g., a typical LASSO-based regression reaches only R² = 0.24). Our work represents a step towards scalable cross-modal assimilation and representation of existing Earth datasets.
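
A minimal masked, semi-supervised VAE in PyTorch conveys the idea; the dimensions, layer sizes, and loss weighting are illustrative assumptions, not the authors' architecture:

    import torch
    import torch.nn as nn

    class MaskedSSVAE(nn.Module):
        def __init__(self, d_in=5, d_lat=8, n_lith=10):
            super().__init__()
            # Missing values are zero-filled; the mask is concatenated to the input.
            self.enc = nn.Sequential(nn.Linear(2 * d_in, 64), nn.ReLU())
            self.mu, self.logvar = nn.Linear(64, d_lat), nn.Linear(64, d_lat)
            self.dec = nn.Sequential(nn.Linear(d_lat, 64), nn.ReLU(), nn.Linear(64, d_in))
            self.lith = nn.Linear(d_lat, n_lith)   # lithology head on the latent code

        def forward(self, x, mask):
            h = self.enc(torch.cat([x * mask, mask], dim=-1))
            mu, logvar = self.mu(h), self.logvar(h)
            z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()
            return self.dec(z), self.lith(z), mu, logvar

    def loss_fn(x, mask, x_hat, logits, mu, logvar, labels):
        rec = (mask * (x - x_hat) ** 2).sum() / mask.sum()        # observed cells only
        kl = -0.5 * (1 + logvar - mu ** 2 - logvar.exp()).mean()
        ce = nn.functional.cross_entropy(logits, labels)           # supervised subset
        return rec + kl + ce

    model = MaskedSSVAE()
    x = torch.randn(32, 5)                     # 5 geophysical properties per depth
    mask = (torch.rand(32, 5) > 0.9).float()   # ~90% missing, mimicking sparse cores
    labels = torch.randint(0, 10, (32,))
    x_hat, logits, mu, logvar = model(x, mask)
    loss = loss_fn(x, mask, x_hat, logits, mu, logvar, labels)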

How to cite: Aiken, J. M., Liu, D., Gilpin, W., and Becker, T.: A multi-modal semi-supervised model for ocean sediment lithology, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-1492, https://doi.org/10.5194/egusphere-egu26-1492, 2026.

EGU26-1580 | ITS1.11/ESSI1.10

Enhancement Digital Twin Application for Disaster Management by UNOOSA/UN-SPIDER 

J. Takami

This presentation details the "Enhancement Digital Twin Application" initiative led by the United Nations Office for Outer Space Affairs (UNOOSA) and UN-SPIDER. Designed to support the United Nations' "Early Warnings for All" (EW4All) agenda, this project leverages cutting-edge space technologies to bolster disaster resilience in Small Island Developing States (SIDS).

A primary challenge in the Pacific region is the significant "data gap" - specifically the lack of building footprints that include height information, which is critical for accurate disaster modeling. While global datasets exist, they often lack vertical data, and regional initiatives like PCRAFI have limited coverage. To bridge this gap, the project utilizes 30 cm high-resolution satellite imagery combined with deep-learning AI models to construct cost-effective 3D Digital Twins. The methodology employs advanced techniques, including NeRF and Gaussian Splatting, to generate models ranging from LOD1 (for GIS analysis) to LOD3 (for high-fidelity visualization).

The core of the presentation focuses on the "Tonga Disaster Preparedness Platform," a pilot project implemented in 2024. This platform integrates the 3D geospatial models with real-time environmental data from IoT rain gauges and water-level sensors installed on the ground. This fusion enables precise, real-time simulations of sea-level rise and flood scenarios. A key innovation is the system's ability to optimize evacuation routes dynamically; by analyzing real-time flood depth data, the digital twin can identify safe passage corridors and update evacuation directions instantly, a capability that static hazard maps cannot provide.

Finally, the presentation outlines the roadmap for expanding these capabilities to the Cook Islands and the Republic of Palau. It demonstrates how satellite-derived digital twins can revolutionize the entire Disaster Risk Management (DRM) cycle - spanning prevention, mitigation, response, and recovery - providing a scalable, data-driven framework for climate adaptation in vulnerable "big ocean" states.
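
The dynamic rerouting capability can be pictured as a shortest-path search on a network from which flooded links are removed; a toy sketch using networkx, with an assumed passability threshold and invented depths (not the UNOOSA system):

    import networkx as nx

    G = nx.Graph()
    edges = [  # (from, to, length_m, flood_depth_m from sensors/simulation)
        ("home", "a", 200, 0.05), ("a", "shelter", 300, 0.60),
        ("a", "b", 150, 0.00), ("b", "shelter", 250, 0.10),
    ]
    MAX_PASSABLE_DEPTH = 0.30
    for u, v, length, depth in edges:
        if depth <= MAX_PASSABLE_DEPTH:
            G.add_edge(u, v, weight=length)   # keep only passable road segments

    route = nx.shortest_path(G, "home", "shelter", weight="weight")
    # -> ['home', 'a', 'b', 'shelter']: the direct link is avoided because it is
    #    flooded above the threshold; re-running this as depths update gives the
    #    instant rerouting that static hazard maps cannot provide.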

How to cite: Takami, J.: Enhancement Digital Twin Application for Disaster Management by UNOOSA/UN-SPIDER, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-1580, https://doi.org/10.5194/egusphere-egu26-1580, 2026.

EGU26-1624 | ECS | Posters on site | ITS1.11/ESSI1.10

Towards Digital Twins: Uncertainty and Sensitivity Analysis for Safety-Case Modelling 

Alexandra Duckstein, Solveig Pospiech, Vinzenz Brendler, Frank Bok, Raimon Tolosana-Delgado, Elmar Plischke, and Mostafa Abdelhafiz

Deep geological repositories rely on robust, transparent, and scientifically based safety concepts to ensure the long-term safety of radioactive waste. As safety cases become increasingly data-rich and computationally integrated, Digital Twins are emerging as a powerful tool to represent, test, and communicate the behavior of complex geosystems over geological timescales. A core requirement for such Digital Twins is the explicit quantification of parameter uncertainties and sensitivities, ensuring that the model is both reliable and efficient in reproducing key safety functions.

In this contribution, we introduce a workflow designed to assess uncertainties and sensitivities associated with radionuclide retention in geological host formations. Our approach combines geostatistical and geochemical simulation with global sensitivity analysis. Mineralogical heterogeneity is represented using geostatistical realizations generated through custom Python implementations of Markov-chain methods and truncated Gaussian random field simulations, producing spatially realistic mineral distributions. These mineralogical scenarios are then propagated through a geochemical modelling step using the Geochemist's Workbench, in which the distribution coefficient (Kd) is computed for each realization to quantify the effect of mineralogical and geochemical variability on uranium retention.

To identify the key indicators of variability, the workflow incorporates variance-based sensitivity analysis (SA) based on a custom Python toolbox. The SA reveals both first- and second-order effects, highlighting the influence of individual parameters on the resulting Kd values as well as pairwise parameter interactions. In almost all cases, the identified sensitivities and interactions can be explained by underlying chemical and physical processes. Additionally, this approach enables targeted dimensionality reduction, a critical step for constructing Digital Twins that maintain scientific robustness while remaining computationally tractable.
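
As an illustration of variance-based SA, the open SALib package can stand in for the custom toolbox described above; the parameter names, bounds, and toy Kd response below are assumptions:

    import numpy as np
    from SALib.sample import saltelli
    from SALib.analyze import sobol

    problem = {
        "num_vars": 3,
        "names": ["pH", "ionic_strength", "smectite_fraction"],
        "bounds": [[6.0, 9.0], [0.01, 0.5], [0.0, 0.2]],
    }
    X = saltelli.sample(problem, 1024)   # Saltelli design for Sobol indices

    def toy_kd(x):
        # Stand-in for the geochemical Kd computation of the real workflow.
        ph, ionic, smec = x
        return 10 ** (0.8 * ph - 2.0 * ionic + 5.0 * smec - 4.0)

    Y = np.apply_along_axis(toy_kd, 1, X)
    Si = sobol.analyze(problem, Y)       # first-, second- and total-order indices
    print(Si["S1"], Si["S2"], Si["ST"])  # basis for dimensionality reduction

Parameters with negligible total-order indices are the candidates for elimination, which is how the targeted dimensionality reduction mentioned above is achieved.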

The workflow is presented for crystalline host rocks, where we focus on uranium retention within granitic systems governed by solid–liquid interactions: sorption, aqueous speciation, precipitation, and dissolution. A key advantage of our workflow is its modular structure. Each component (geostatistical simulation, geochemical modelling, and sensitivity analysis) can be independently adapted, extended, or replaced. This makes the framework readily transferable to other host rocks such as salt or clay, which exhibit fundamentally different retention mechanisms, as well as to other radionuclides with distinct sorption, solubility, or redox characteristics.

Our results highlight (i) the magnitude of uncertainty introduced by mineralogical heterogeneity, (ii) the non-linear sensitivity of uranium retention to coupled mineral–solution systems, and (iii) the potential to substantially reduce model complexity by focusing on a small subset of high-impact parameters. Overall, the workflow provides a structured and scalable method for quantifying uncertainties and identifying the parameters most relevant to long-term safety. In this way, it provides the essential, uncertainty-aware input data required for the generation of reliable and computationally efficient Digital Twins in geological disposal scenarios.

How to cite: Duckstein, A., Pospiech, S., Brendler, V., Bok, F., Tolosana-Delgado, R., Plischke, E., and Abdelhafiz, M.: Towards Digital Twins: Uncertainty and Sensitivity Analysis for Safety-Case Modelling, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-1624, https://doi.org/10.5194/egusphere-egu26-1624, 2026.

EGU26-1780 | Orals | ITS1.11/ESSI1.10

Developing a new Digital Twin for Destination Earth: Technical Progress of TerraDT in its First Year 

Narayanappa Devaraju, Jenni Kontkanen, Jenni Poutanen, Juha Tonttila, Hendryk Bockelmann, Hauke Schmidt, Nikolay Koldunov, Daniel Klocke, Etienne Tourigny, Maria Giuffrida, Harri Kokkola, Thomas Zwinger, Mario Acosta, Anton Laakso, and Sara Garavelli

High-resolution, kilometer-scale information on regional climate impacts is critical for effective adaptation and mitigation strategies. The European Commission’s Destination Earth (DestinE) Climate Adaptation Digital Twin (Climate DT) aims to address this need; however, actionable impact assessments remain limited by incomplete representation of key Earth system components and their interactions. The Horizon Europe-funded TerraDT project tackles these limitations by developing a state-of-the-art Digital Twin focused on the cryosphere, land surface, aerosols, and their coupled processes, fully interoperable within the DestinE ecosystem.

TerraDT pursues three objectives: (1) build and deploy new Digital Twin Components (DTCs) to strengthen process realism and enable impact assessments; (2) deliver a modular, scalable, interoperable platform integrating advanced software, high-performance computing, and data workflows that can host physical models and Artificial Intelligence (AI)/Machine Learning (ML) emulators; and (3) foster user uptake through early engagement and a user-centric interface (UI).

In its first year, TerraDT achieved several milestones:

  • Cryosphere: A prototype Land-Ice DTC was established by coupling Elmer/Ice with the ICON climate model via the YAC coupler, supported by curated glacier dynamics datasets. Development of the Sea-Ice DTC (FESIM) began in mid-2025, including YAC-mediated coupling and an AI sea-ice emulator capable of ~100-day to multi-year rollouts, producing smoother fields than physical models. 
  • Land Surface: A prototype time-varying land-use dataset was generated for the ECland and ICON land surface models. 
  • Aerosols: A simplified Aerosol DTC was tested, with integration into the (open) Integrated Forecasting System (IFS). ML components were prototyped in HAM-LITE to capture advanced aerosol physics (e.g., hygroscopicity) at reduced computational cost.

Impact modelling advanced across multiple domains:

  • Sea-ice: Assessments of ice-season duration and of the probability of severe conditions.
  • Forest: Integration of the 3PG and Prebasso models, calibration across European ecosystems, ML emulation of Prebasso, and characterization of old-growth forests.
  • Urban: A carbon-sequestration emulator validated in Helsinki, with planned extensions to Lisbon, Barcelona, Munich, Paris, and Zurich. The key required datasets are being prepared in combination with ML methods and will be applied to build advanced urban impact models for assessing climate extremes.

Infrastructure and interoperability were strengthened through YAC-based coupling (ICON, the Energy Balance Firn Model, and Elmer/Ice on the LUMI and Levante supercomputers), and Sea-Ice DTC I/O plans were aligned with DestinE workflows. A map-based UI architecture was designed to expose high-resolution impact assessments for decision support.

By advancing new DTCs, AI/ML emulators, and a generic coupling interface, TerraDT is being developed for full integration into the DestinE framework, ensuring compatibility and enhancing the overall ecosystem’s capability to inform climate adaptation and mitigation strategies. This presentation will summarize first-year progress, outline objectives, and present the roadmap toward fully coupled simulations, validation, and dissemination of impact indicators through the TerraDT UI for policy and stakeholder communities.

How to cite: Devaraju, N., Kontkanen, J., Poutanen, J., Tonttila, J., Bockelmann, H., Schmidt, H., Koldunov, N., Klocke, D., Tourigny, E., Giuffrida, M., Kokkola, H., Zwinger, T., Acosta, M., Laakso, A., and Garavelli, S.: Developing a new Digital Twin for Destination Earth: Technical Progress of TerraDT in its First Year, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-1780, https://doi.org/10.5194/egusphere-egu26-1780, 2026.

EGU26-3381 | ECS | Posters on site | ITS1.11/ESSI1.10

An Initial Digital Twin Architecture for Long-Term Radionuclide Transport Modeling in Deep Geological Repositories 

Smruthi Ravichandran, Solveig Pospiech, Vinzenz Brendler, and Guido Juckeland

The long-term safety of Deep Geological Repositories (DGRs) requires rigorous assessments capable of predicting radionuclide transport over million-year timescales. While the Digital Twin (DT) concept offers a robust framework for such assessments, the traditional requirement for bidirectional, real-time communication is currently unfeasible due to the absence of active physical repositories. We propose a modular DT prototype application framework designed to evolve from a high-fidelity simulation environment into a fully synchronized system as field data emerge.

At its core, this framework utilizes standardized data schemas to harmonize heterogeneous, site-specific field data from crystalline host rock, including mineral composition, pore water chemistry, and surface properties. These standardized datasets are integrated via a specialized API into a modular orchestration pipeline that connects 1D and 2D fracture simulations with reactive transport codes such as PHAST, OpenGeoSys, and PFLOTRAN. By containerizing these secondary physics models into Docker environments, the framework ensures high computational flexibility and reproducibility. This approach allows for the seamless integration of Machine Learning models and complex physics-based workflows while maintaining isolated execution environments.
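
A minimal orchestration step might look as follows; the container image, file layout, and CLI flags are hypothetical illustrations of the containerised pattern, not the project's actual interface:

    import json
    import subprocess
    from pathlib import Path

    workspace = Path("run_001")
    workspace.mkdir(exist_ok=True)
    # Standardized scenario description consumed by any containerized code.
    (workspace / "scenario.json").write_text(json.dumps({
        "host_rock": "crystalline",
        "mineralogy": {"quartz": 0.6, "feldspar": 0.3, "biotite": 0.1},
        "porewater_pH": 8.1,
    }))

    # Run one reactive-transport code in isolation, mounting the shared workspace.
    subprocess.run([
        "docker", "run", "--rm",
        "-v", f"{workspace.resolve()}:/data",
        "dgr-dt/pflotran:latest",          # hypothetical container image name
        "--input", "/data/scenario.json",  # hypothetical entrypoint flag
    ], check=True)

Because each code only sees the mounted workspace and the standardized schema, individual simulators can be swapped without touching the rest of the pipeline.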

Acknowledging the post-closure reality of a DGR, where sensors may fail or lose power supply, this framework prioritizes the characterization of source term evolution (radionuclide fluxes) through a "build-fill-close-abandon" logic. Current focus is on building features to establish resilient data formats and interface protocols that create a future-proof foundation for geological safety. We demonstrate how containerization and robust interface design can transform divergent research projects into a unified, reproducible DT framework, applicable to any domain where long-term predictive modeling is required despite limited real-time data.

How to cite: Ravichandran, S., Pospiech, S., Brendler, V., and Juckeland, G.: An Initial Digital Twin Architecture for Long-Term Radionuclide Transport Modeling in Deep Geological Repositories, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-3381, https://doi.org/10.5194/egusphere-egu26-3381, 2026.

EGU26-3775 | Posters on site | ITS1.11/ESSI1.10

Developing a Digital Twin to support the resilience of young trees to drought 

Steffi Urhausen, Deborah Hemming, Deanne Brettle, Emma Ferranti, and Sarah Greenham

The EU CARMINE project (https://carmine-project.eu/) aims to support urban and surrounding metropolitan communities to become more climate resilient. The project focuses on heat, wildfires, flooding, pollution and drought across eight case study areas in Europe. Birmingham, located within the West Midlands Combined Authority (WMCA), serves as the UK case study area. High-priority climate hazards for Birmingham are extreme heat, as well as pluvial flooding caused by extreme precipitation events. Increasing urban tree cover to alleviate these hazards could be a promising nature-based solution. However, many newly planted trees wilt or die due to drought stress.

To assist the Council and community volunteers in maintaining the young trees during drought events, we are developing a digital twin framework to identify when and where young trees across Birmingham need watering. Common indicators include daily plant-available water and the level of drought/wetness over the last few weeks. These indicators are based on soil moisture content, usually at different depths. Unfortunately, such measurements are sparse or absent in urban areas. We use the Joint UK Land Environment Simulator (JULES) model, forced by the UK weather forecasting model UKV at a spatial resolution of 1.5 km, to estimate soil moisture content. Using machine learning techniques, we emulate JULES outputs to provide soil moisture estimates in a faster, more efficient, and more flexible way. Platforms developed through the CARMINE project allow us to communicate the need for watering to interested communities. This approach is an important step in supporting communities and city authorities to improve the management of urban trees and the resilience of cities to climate hazards like heat waves and flooding.
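
Schematically, the emulator-plus-indicator chain looks like the following sketch; the synthetic data, features, and thresholds are assumptions rather than CARMINE values:

    import numpy as np
    from sklearn.ensemble import RandomForestRegressor

    rng = np.random.default_rng(3)
    # Features per grid cell and day: rain (mm), T2m (degC), radiation (W m-2).
    X = np.column_stack([rng.gamma(1, 2, 4000),
                         rng.normal(15, 6, 4000),
                         rng.uniform(50, 350, 4000)])
    # Synthetic stand-in for JULES soil moisture driven by the same forcing.
    sm_jules = 0.35 + 0.01 * X[:, 0] - 0.004 * X[:, 1] + rng.normal(0, 0.01, 4000)

    emulator = RandomForestRegressor(n_estimators=100).fit(X, sm_jules)

    def needs_watering(forcing_row, wilting_point=0.15, buffer=0.05):
        # Flag a location when emulated soil moisture nears the wilting point.
        sm = emulator.predict(np.asarray(forcing_row).reshape(1, -1))[0]
        return sm < wilting_point + buffer

Once trained against JULES output, the emulator can be re-evaluated in milliseconds per cell, which is what makes daily city-wide watering guidance practical.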

This approach explores how a digital twin, combined with an ML-based emulation of JULES soil moisture, could provide drought information for young trees more efficiently. It has the potential to scale beyond the Birmingham case study area, transferring the digital twin to other urban areas.

How to cite: Urhausen, S., Hemming, D., Brettle, D., Ferranti, E., and Greenham, S.: Developing a Digital Twin to support the resilience of young trees to drought, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-3775, https://doi.org/10.5194/egusphere-egu26-3775, 2026.

EGU26-5493 | ECS | Posters on site | ITS1.11/ESSI1.10

Geo-AI-Based Assessment of 3-D Ultrafine Particle Distribution and Population Exposure: A Digital Twin Approach in Taichung, Taiwan 

Chia-Wei Hsu, Jun-Jun Su, Rui-Zhen Yang, Candera Wijaya, Yu-Cheng Chen, Shih-Chun Candice Lung, Ta-Chih Hsiao, Chao-Hung Lin, and Chih-Da Wu

This study developed a Geospatial Artificial Intelligence (Geo-AI)–based framework to estimate and visualize the three-dimensional (3-D) distribution of ultrafine particles (PM₀.₁) and associated population exposure across Taichung City, Taiwan. An unmanned aerial vehicle (UAV) platform equipped with a P-Trak Ultrafine Particle Counter was deployed to collect high-resolution 3-D PM₀.₁ concentration data across varying altitudes and land-use types. These 3-D PM₀.₁ data were integrated with multi-source geospatial datasets, including 3-D building models, meteorological variables, and emission inventories. The SHapley Additive exPlanations (SHAP) method was then employed to identify key predictors for machine-learning modeling. The optimized model was applied to map the continuous 3-D pollution field and used to estimate and visualize population exposure for each floor level. The resulting Geo-AI model achieved strong predictive performance, with R² values of 0.95 for training and above 0.85 for validation, demonstrating high robustness and predictive capability. Visualizations reveal a nonlinear vertical structure of PM₀.₁ in 3-D space, characterized by near-ground peaks in industrial and traffic zones alongside persistent localized hotspots at mid-to-high elevations. Population exposure assessments highlighted that, despite lower concentrations at higher elevations, the total exposure burden remains significant in mid-to-high-rise residential buildings due to higher population density. This research presents an advanced framework for assessing 3-D air pollution exposure risks in dense urban environments, demonstrating the potential of Digital Twin technologies in supporting air quality management and public health decision-making.
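
The SHAP-based predictor screening step can be sketched as follows (illustrative predictors and synthetic data, not the study's inputs):

    import numpy as np
    import shap
    from sklearn.ensemble import RandomForestRegressor

    rng = np.random.default_rng(4)
    names = ["altitude_m", "road_density", "building_volume", "wind_speed"]
    X = rng.random((1000, 4))
    # Synthetic PM0.1 response: decays with altitude, rises with traffic.
    pm01 = 5000 * np.exp(-3 * X[:, 0]) + 2000 * X[:, 1] + rng.normal(0, 50, 1000)

    model = RandomForestRegressor(n_estimators=200).fit(X, pm01)
    shap_values = shap.TreeExplainer(model).shap_values(X)

    # Rank predictors by mean absolute SHAP value and keep the strongest ones.
    importance = np.abs(shap_values).mean(axis=0)
    ranked = sorted(zip(names, importance), key=lambda t: -t[1])

Ranking predictors this way is what allows the final 3-D model to retain only the variables that actually drive the vertical PM₀.₁ structure.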

How to cite: Hsu, C.-W., Su, J.-J., Yang, R.-Z., Wijaya, C., Chen, Y.-C., Lung, S.-C. C., Hsiao, T.-C., Lin, C.-H., and Wu, C.-D.: Geo-AI-Based Assessment of 3-D Ultrafine Particle Distribution and Population Exposure: A Digital Twin Approach in Taichung, Taiwan, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-5493, https://doi.org/10.5194/egusphere-egu26-5493, 2026.

EGU26-6409 | ECS | Posters on site | ITS1.11/ESSI1.10

stTwin: A digital twin framework for catchment-scale sediments transport 

Qi Zhou, Hui Tang, Jacob Hirschberg, and Fabian Walter

Sediment transport is a fundamental process shaping landscapes and posing significant hazards in mountainous regions. However, traditional field monitoring and simulation approaches, such as grain size sampling and numerical modeling, are often costly and time-consuming. Recent advances in physics-based models and machine learning have substantially improved spatial and temporal resolution. These achievements enable the development of digital twins to explore what-if scenarios and to better understand the dynamic processes involved.

In this work, we combine the probabilistic sediment cascade model (SedCas) with the machine-learning-based event detection model (Flow-Alert) to develop a digital twin of a catchment. The former relies solely on climate forcing to simulate sediment dynamics, whereas the latter uses seismic signals to identify extreme sediment transport events, such as debris flows. We address three key questions. First, how to design a digital twin framework that captures the physical components of sediment transport, including erosion on hillslopes, hillslope-to-channel transfer, and channel transport to the catchment outlet, at hourly and even sub-hourly temporal resolution. Second, how to fuse predictions from the physics-based model SedCas and the machine-learning-based model Flow-Alert to merge and balance the strengths of these two modeling approaches. Third, how to reduce uncertainty when translating insights from the virtual entity back to the physical entity. We demonstrate that the digital twin framework enables potential users, such as governmental agencies and local stakeholders, to explore what-if scenarios and better understand how climate change and human interventions influence sediment transport dynamics.
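
One simple fusion rule, shown purely as an assumption to make the second question concrete (the actual stTwin weighting is a subject of this work), is a reliability-weighted average of the two models' event probabilities:

    def fused_probability(p_sedcas: float, p_flowalert: float,
                          w_physics: float = 0.6) -> float:
        """Weighted fusion of physics-based and ML-based event probabilities.

        w_physics would in practice reflect each model's past reliability
        for the catchment and event type in question.
        """
        return w_physics * p_sedcas + (1.0 - w_physics) * p_flowalert

    # A seismic detection raises the fused estimate even under modest forcing.
    print(fused_probability(0.2, 0.9))   # -> 0.48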

How to cite: Zhou, Q., Tang, H., Hirschberg, J., and Walter, F.: stTwin: A digital twin framework for catchment-scale sediments transport, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-6409, https://doi.org/10.5194/egusphere-egu26-6409, 2026.

EGU26-7450 | ECS | Orals | ITS1.11/ESSI1.10

Digital Twin For Improvement of The Sustainability of Neighbourhoods Through Scenario Planning 

Rakibun Athid, Dr. Mila N. Koeva, and Dr. Pirouz Nourian

Digital twins as complex decision-making systems are increasingly used in climate adaptation and sustainability planning. However, most currently available applications remain largely sector-oriented, limiting their capacity to capture interactions between different domains with multiple interrelated indicator systems. This constraint is particularly acute at the neighbourhood scale, where planning interventions are implemented and trade-offs between competing objectives become most visible.  

This work introduces the prototype of a neighbourhood-scale digital twin system designed to support integrated, scenario-based analysis of urban ecology and energy systems. The digital twin, implemented in the post-war residential neighbourhood of Twekkelerveld, Enschede, the Netherlands, addresses major issues such as the ageing building stock, limited green infrastructure, and relatively high energy demand. The framework incorporates open and municipal datasets, including tree inventories, green spaces, urban heat potential, building geometry, energy-use intensity, energy estimation, solar electricity potential, and carbon footprint. Unlike existing tools, the system explicitly represents interactions among ecological and energy interventions at the neighbourhood level.

The digital twin is designed to facilitate interactive "what-if" exploration of typical urban interventions across multiple domains. Ecological scenarios, such as tree planting strategies and green facade deployment, enable users to assess the impacts on greenness, urban heat mitigation, carbon sequestration, and investment costs. Energy scenarios include building insulation improvements, rooftop solar deployment, heat pump transitions, and local energy sharing, measured by indicators at the neighbourhood and building levels. The interrelation module explicitly connects the ecological and energy measures, allowing comparison of their combined effects on cooling, energy demand, emissions, and overall performance.

Instead of making sustainability planning a one-sector endeavour, the prototype assists in the exploration of options: what changes, what gets better, and what gets worse when various measures are combined. Presenting baseline and scenario outcomes side by side makes trade-offs clearer across ecological, energy, and environmental indicators. The work shows how neighbourhood-scale digital twins can operationalise multi-domain data and scenario logic in a form that is usable by urban planners, municipalities, and local decision-makers. This complements Earth-system-scale digital twins by centring on the local level, where interventions are discussed and implemented.

How to cite: Athid, R., Koeva, M. N., and Nourian, P.: Digital Twin For Improvement of The Sustainability of Neighbourhoods Through Scenario Planning, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-7450, https://doi.org/10.5194/egusphere-egu26-7450, 2026.

EGU26-7482 | Orals | ITS1.11/ESSI1.10

A Digital Twin for Wildfire risk adaptation planning: DT-WILDFIRE 

Eleftheria Exarchou, Mirta Rodriguez Pinilla, Veronica Martin Gomez, Marc Benitez Benavides, Martin Senande Rivera, Diego Bueso, Foteini Baladima, Guillem Canaleta, Mariona Borràs, Eleni Toli, and Panagiota Koltsida

Wildfires pose a growing threat to populated areas of the Mediterranean basin. Rural abandonment has increased fuel loads, creating favourable conditions for large wildfires. The hot and dry conditions caused by climate change have exacerbated the risk, extent, and severity of wildfires. The rising number of homes in the wildland-urban interface (WUI) implies increasing impacts of wildfires on lives and property. The need for mitigation and adaptation measures against wildfire risk is thus becoming more urgent. The Barcelona Metropolitan Area, a large metropolis with an extended WUI (more than 20,000 inhabitants), is particularly vulnerable. Part of its population and infrastructure is located near the border of the Collserola Natural Park (8,000 hectares, with 6 million visitors yearly), an extended and heavily frequented forested area, and could potentially be threatened by large forest fires, which would in turn threaten the whole metropolitan area.

This study presents a Digital Twin (DT) framework for the Barcelona Metropolitan Area, designed to assess the risk of extreme wildfires, and how it is impacted by heatwaves and droughts under different future emission scenarios. The DT-WILDFIRE leverages high-resolution climate model projections, satellite data, local observations, and advanced machine learning (ML) techniques to provide a granular understanding of future climate risks and their cascading impacts on wildfires. 

To quantify the fire risk, we calculate the Fire Weather Index (FWI), a widely recognized metric used to assess the potential for wildfire occurrence and spread based on prevailing meteorological conditions. We calculate the FWI over Catalonia at a resolution of 1.5 km for the historical period, using the EMO1 database. Validation against ERA5-Land-derived FWI shows good agreement. This high-resolution FWI will then be used to downscale future FWI projections from climate models, thereby providing greater spatial detail in analyses of future climate change impacts on wildfires in the region.

Further assessment of wildfire risk is provided by the wildfire susceptibility prediction model, based on the machine learning algorithm XGBoost. The model is implemented over Catalonia and trained using diverse variables, including population density, electrical power infrastructure, terrain elevation, Normalized Difference Vegetation Index, land cover classifications, FWI, and historical burned area data. The model generates daily wildfire susceptibility maps at the regional scale. Model evaluation based on the quadratically weighted Kappa metric indicates moderate to good predictive skill over most of the domain, except in high-elevation areas. Further detailed investigation in these regions is ongoing.
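
A hedged sketch of this kind of susceptibility classifier with the xgboost Python package; the feature list follows the abstract, but the data, labels, and hyperparameters here are synthetic stand-ins:

```python
import numpy as np
import xgboost as xgb

features = ["pop_density", "power_infrastructure", "elevation",
            "ndvi", "land_cover", "fwi", "burned_area_history"]
rng = np.random.default_rng(42)
X = rng.random((5000, len(features)))                     # toy per-cell predictors
y = (X[:, 5] + 0.3 * rng.standard_normal(5000) > 0.8).astype(int)  # toy labels

model = xgb.XGBClassifier(
    n_estimators=300, max_depth=6, learning_rate=0.05,
    subsample=0.8, eval_metric="logloss")
model.fit(X, y)
susceptibility = model.predict_proba(X[:10])[:, 1]        # daily map values per cell
```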

Future climate risk related to wildfire drivers, such as droughts and heatwaves, is also assessed. To achieve the required resolution, we apply deep learning downscaling methodologies to produce future climate projections at very high resolution (0.8 km).

Finally, the DT aims at quantifying physical damage to residential and commercial real estate, including damage from smoke and business interruption. Ultimately, DT-Wildfire aims at helping authorities and society design participatory risk reduction measures, including nature-based solutions, according to the different climate scenarios.  

How to cite: Exarchou, E., Rodriguez Pinilla, M., Martin Gomez, V., Benitez Benavides, M., Senande Rivera, M., Bueso, D., Baladima, F., Canaleta, G., Borràs, M., Toli, E., and Koltsida, P.: A Digital Twin for Wildfire risk adaptation planning: DT-WILDFIRE, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-7482, https://doi.org/10.5194/egusphere-egu26-7482, 2026.

EGU26-9426 | Posters on site | ITS1.11/ESSI1.10

Depth super-resolution of rock CT images based on latent diffusion models by deep learning 

Kosei Tomami, Atsushi Okamoto, and Toshiaki Omori

As one of the applications of X-ray computed tomography (X-ray CT) to geomaterials, rock CT images have been widely used in the earth and environmental sciences. However, rock CT images suffer from low resolution in the depth direction due to multiple causes, such as the physical characteristics of rock core samples, geometric constraints of the imaging environment, and measurement limitations of X-ray CT scanners. In this study, we propose a data-driven super-resolution method based on generative modeling to improve the depth resolution of rock CT images. Our method treats the low-resolution problem as conditional generation with latent diffusion models, a class of generative models. Given three consecutive images at different depth levels, the second image (an unobservable rock CT image) is generated from the first and third images (observable rock CT images). We verify the effectiveness of the proposed method using actual rock CT images obtained in the Oman Drilling Project, an international scientific research project. The experimental results demonstrate that our method outperforms conventional interpolation methods in both qualitative and quantitative terms.
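
A hedged sketch of the conditioning idea as a DDPM-style training step in pixel space (the authors work with latent diffusion models, so the VAE encoder/decoder and a proper U-Net are omitted; all shapes and names are illustrative):

```python
import torch
import torch.nn as nn

class Denoiser(nn.Module):
    # Toy CNN standing in for the U-Net of a latent diffusion model.
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),  # noisy middle + 2 neighbours
            nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 1, 3, padding=1),             # predicted noise
        )

    def forward(self, noisy_mid, upper, lower):
        return self.net(torch.cat([noisy_mid, upper, lower], dim=1))

T = 1000
betas = torch.linspace(1e-4, 2e-2, T)
alpha_bar = torch.cumprod(1.0 - betas, dim=0)

def training_step(model, mid, upper, lower):
    # Sample a diffusion step, corrupt the middle slice, predict the noise.
    t = torch.randint(0, T, (mid.shape[0],))
    a = alpha_bar[t].view(-1, 1, 1, 1)
    eps = torch.randn_like(mid)
    noisy = a.sqrt() * mid + (1 - a).sqrt() * eps
    return nn.functional.mse_loss(model(noisy, upper, lower), eps)

model = Denoiser()
mid, up, low = (torch.randn(4, 1, 64, 64) for _ in range(3))
loss = training_step(model, mid, up, low)
loss.backward()
```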

How to cite: Tomami, K., Okamoto, A., and Omori, T.: Depth super-resolution of rock CT images based on latent diffusion models by deep learning, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-9426, https://doi.org/10.5194/egusphere-egu26-9426, 2026.

EGU26-9537 | ECS | Orals | ITS1.11/ESSI1.10

Perspective of Interpretable Physics-Based AI method for Digital Twins of Geosystems 

Denise Degen, Yulia Gruzdeva, Nicolas Hayek, Marthe Faber, Cristian Siegel, and Mauro Cacace

The development of digital twins for subsurface applications faces several challenges. In this contribution, we focus on the issue of providing near real-time predictions for numerical multi-physics applications described by partial differential equations. Even on state-of-the-art high-performance computing infrastructures, conventional multi-physics simulations are not real-time capable because of their huge computational demand. At the same time, they are subject to uncertainties from, for instance, the geometry, material properties, and boundary conditions.

To address the computational demand, we introduce surrogate models, which comprise data-driven and physics-based approaches. While data-driven techniques, such as neural networks, capture complex system responses well, they typically lack interpretability, limiting the reliability of the model outcomes. This, in turn, poses challenges for their integration into digital twins, especially in applications where risks need to be assessed. In contrast, physics-based approaches are fully interpretable but often limited to elliptic and parabolic partial differential equations; hence, they cannot capture the full complexity of the system dynamics. To overcome the limitations of both data-driven and physics-based techniques, we introduce a hybrid approach, namely the non-intrusive reduced basis method, from the class of projection-based model order reduction techniques.
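
A minimal sketch of such a non-intrusive reduced basis workflow: POD of full-order snapshots, then a regression from parameters to reduced coefficients (the regressor, tolerance, and toy "full-order model" are illustrative choices, not the authors'):

```python
import numpy as np
from sklearn.neighbors import KNeighborsRegressor

rng = np.random.default_rng(1)
params = rng.uniform(0, 1, size=(200, 3))                # training parameter samples
snapshots = np.sin(params @ rng.normal(size=(3, 500)))   # stand-in full-order fields

# Offline stage: POD via SVD, keep the dominant modes.
U, s, _ = np.linalg.svd(snapshots.T, full_matrices=False)
r = np.searchsorted(np.cumsum(s**2) / np.sum(s**2), 0.9999) + 1
basis = U[:, :r]                                         # dominant POD modes
coeffs = snapshots @ basis                               # reduced coordinates

surrogate = KNeighborsRegressor(n_neighbors=5).fit(params, coeffs)

def predict_field(p):
    # Online stage: regress reduced coefficients, lift back to full space.
    return surrogate.predict(p[None, :]) @ basis.T

field = predict_field(np.array([0.2, 0.5, 0.7]))
```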

In this contribution, we demonstrate for a geothermal case study how this interpretable physics-based AI method can reliably and efficiently accelerate high-fidelity numerical multi-physics simulations. Furthermore, we illustrate its integration into a Bayesian uncertainty quantification framework, including hierarchical approaches. Finally, we discuss possibilities to extend the aforementioned approaches to allow for a continuous integration of observational data.

How to cite: Degen, D., Gruzdeva, Y., Hayek, N., Faber, M., Siegel, C., and Cacace, M.: Perspective of Interpretable Physics-Based AI method for Digital Twins of Geosystems, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-9537, https://doi.org/10.5194/egusphere-egu26-9537, 2026.

EGU26-9595 | Orals | ITS1.11/ESSI1.10

FireAID - Real-time Wildfire Spread Modeling with Machine Learning 

Dominik Laux, Johanna Wahbe, Danica Rovó, Pranay Pratik, Veronika Pörtge, Lukas Liesenhoff, and Julia Gottfriedsen

Wildfires are a major type of disaster and a challenge for economic prosperity, public health, and safety around the globe. Decision Intelligence, particularly AI-based scenario analysis, can make a significant difference [1] in disaster mitigation efforts. Data-driven methods have shown promise in various downstream applications [2]. Still, reference data remains a significant bottleneck across domains such as fire behaviour modeling.

We develop three data-driven decision intelligence tools: a novel machine learning–based fire spread model, a fire break placement recommender, and triage decision support.
We make use of data from OroraTech’s global near-real-time fire monitoring network, which provides hotspot data from both public and proprietary satellites, in addition to burned area products.
We have created a novel dataset with thousands of fires from the US, Chile, and Europe between 2022 and 2025. We enriched the thermal hotspot-based fire perimeters with a variety of EO (land cover, soil moisture, elevation, previously burned area, vegetation index) and non-EO (wind, temperature, relative humidity, dew point, and precipitation) data.

With this dataset, we train fire spread prediction models based on leading DL architectures. Graph Neural Networks (GNNs) are particularly promising, since they have excelled in related domains such as weather forecasting [3] and have shown promising spatial generalization properties for fire spread [4]. To mitigate uneven satellite overpass intervals, we treat the time gap between input and target images as an additional learning signal (see the sketch below).
A major hurdle in the operational use of fire intelligence tools is a lack of user trust. Therefore, we incorporate explainability metrics in all three of our key contributions.
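
A minimal sketch of the time-gap conditioning with torch_geometric; the architecture and feature set are illustrative stand-ins for the models described above:

```python
import torch
from torch_geometric.nn import GCNConv

class SpreadGNN(torch.nn.Module):
    def __init__(self, n_feats):
        super().__init__()
        self.c1 = GCNConv(n_feats + 1, 64)   # +1 for the time-gap signal
        self.c2 = GCNConv(64, 1)             # per-node burn-probability logit

    def forward(self, x, edge_index, dt_hours):
        dt = dt_hours.expand(x.size(0), 1)   # broadcast the scalar gap to all nodes
        h = torch.relu(self.c1(torch.cat([x, dt], dim=1), edge_index))
        return self.c2(h, edge_index).squeeze(-1)

# Toy 4-node graph: fuel/weather features per cell, 6 h until the next overpass.
x = torch.randn(4, 8)
edge_index = torch.tensor([[0, 1, 2, 3], [1, 2, 3, 0]])
model = SpreadGNN(n_feats=8)
logits = model(x, edge_index, torch.tensor([6.0]))
```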

The use of fire breaks - creating "barriers" of non-burnable materials to prevent fires from spreading - is a significant tactic in wildfire management. Scenario analysis tools are essential to inform the placement of fire breaks. Despite recent progress, significant challenges remain in this domain, such as reliance on basic fire spread simulators and a complex action space for fire break placement [1]. We aim to close this gap by coupling our improved fire spread model with reinforcement learning, a promising approach pioneered in a recent case study [1] for fire break recommendations.

In conclusion, we present a novel fire dataset and operational tools for global, real-time fire spread modeling and firebreak placement, supporting wildfire management worldwide.

References

[1] Murray, L., Castillo, T., Carrasco, J., Weintraub, A., Weber, R., de Diego, I. M., ... & García Gonzalo, J. (2024). Advancing Forest Fire Prevention: Deep Reinforcement Learning for Effective Firebreak Placement. arXiv preprint arXiv:2404.08523.

[2] Bot, K., & Borges, J. G. (2022). A systematic review of applications of machine learning techniques for wildfire management decision support. Inventions, 7(1), 15.

[3] Lam, R., Sanchez-Gonzalez, A., Willson, M., Wirnsberger, P., Fortunato, M., Alet, F., ... & Battaglia, P. (2023). Learning skillful medium-range global weather forecasting. Science, 382(6677), 1416-1421.

[4] Rösch, M., Nolde, M., Ullmann, T., & Riedlinger, T. (2024). Data-Driven Wildfire Spread Modeling of European Wildfires Using a Spatiotemporal Graph Neural Network. Fire, 7(6), 207.

How to cite: Laux, D., Wahbe, J., Rovó, D., Pratik, P., Pörtge, V., Liesenhoff, L., and Gottfriedsen, J.: FireAID - Real-time Wildfire Spread Modeling with Machine Learning, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-9595, https://doi.org/10.5194/egusphere-egu26-9595, 2026.

EGU26-10068 | ECS | Orals | ITS1.11/ESSI1.10

Towards a digital twin for modelling geothermal reservoirs in channelised fluvial systems  

Guofeng Song, Denis Voskov, Hemmo A. Abels, Philip J. Vardon, and Sebastian Geiger

Geothermal energy plays a key role in the energy transition by offering a clean baseload alternative to fossil fuels for space heating. Long-term geothermal production is subject to inherent uncertainty due to the heterogeneity of the geological formations that host the geothermal resource and the limited data available to characterize and quantify these heterogeneities. Exploring and quantifying such uncertainty based on a single concept or interpretational scenario is insufficient. The TU Delft campus geothermal project has been initiated to provide a dedicated research environment, with the vision to scale up the deployment of geothermal energy as well as to provide and store heat for the TU Delft campus. Inspired by the reservoir that hosts the geothermal resource at TU Delft - a channelised fluvial system - we present a framework for an open-source digital twin of geothermal reservoirs that integrates geological scenario modelling, production simulation, uncertainty analysis, and data assimilation to mitigate operational risks, reduce maintenance costs, extend reservoir longevity, and enhance the overall sustainability of geothermal production.

We propose a scenario-based geological modelling approach using Rapid Reservoir Modelling (RRM), in which channelised fluvial layer templates are stacked and constrained by facies information along well trajectories. Multiple geological scenarios with distinct channel distributions are generated, and heterogeneous petrophysical properties are then assigned to the different facies in the reservoir models, capturing uncertainties in both reservoir architecture and petrophysical properties. Flow and thermal simulations are performed with the open-source Delft Advanced Research Terra Simulator (open-DARTS), and production uncertainty is quantified by evaluating the impact of reservoir architectures and petrophysical heterogeneities. The Ensemble Smoother with Multiple Data Assimilation (ESMDA) is then applied across these scenarios to constrain production and reservoir forecasts using well temperature and pressure observations, tracer tests, and related monitoring data. Scenarios that fail to reproduce the observations after data assimilation are falsified, while data-worth analysis is conducted on the remaining plausible scenarios to evaluate data acquisition strategies and identify the most cost-effective options for a reliable assessment of geothermal production.
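
For reference, one ES-MDA update step (after Emerick & Reynolds, 2013) in NumPy; matrix names are generic and the ensemble here is synthetic, not the authors' open-DARTS output:

```python
import numpy as np

def esmda_update(M, D, d_obs, C_d, alpha, rng):
    """M: (n_param, n_ens) parameters; D: (n_obs, n_ens) predicted data.
    Inflate the observation-error covariance by alpha, perturb the
    observations, and apply a Kalman-type update to the ensemble."""
    n_obs, n_ens = D.shape
    d_pert = d_obs[:, None] + np.sqrt(alpha) * rng.multivariate_normal(
        np.zeros(n_obs), C_d, size=n_ens).T
    Mc = M - M.mean(axis=1, keepdims=True)
    Dc = D - D.mean(axis=1, keepdims=True)
    C_md = Mc @ Dc.T / (n_ens - 1)
    C_dd = Dc @ Dc.T / (n_ens - 1)
    K = C_md @ np.linalg.inv(C_dd + alpha * C_d)
    return M + K @ (d_pert - D)

rng = np.random.default_rng(0)
M = rng.normal(size=(3, 50))          # e.g. facies/permeability parameters
D = rng.normal(size=(5, 50))          # e.g. well temperature/pressure predictions
M_new = esmda_update(M, D, d_obs=rng.normal(size=5),
                     C_d=np.eye(5) * 0.1, alpha=4.0, rng=rng)
```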

Our digital-twin framework allows us to explore a broader range of geological uncertainties and to constrain production uncertainties, thereby enabling a more reliable assessment of geothermal reservoir performance and production forecasts, both of which are essential for optimizing operational strategies and supporting informed decision-making for geothermal systems.

How to cite: Song, G., Voskov, D., Abels, H. A., Vardon, P. J., and Geiger, S.: Towards a digital twin for modelling geothermal reservoirs in channelised fluvial systems , EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-10068, https://doi.org/10.5194/egusphere-egu26-10068, 2026.

Digital Twins of the Earth are required to represent future scenario- and trajectory-based hazards that obey physical laws and realistic dynamics in an interpretable and actionable manner, understandable not only by experts but also by non-expert stakeholders and local authorities, to support efficient decision-making, adaptation planning, and emergency management. Machine learning has substantially advanced the generation of landslide susceptibility maps (LSMs). However, LSMs typically provide static, abstract, expert-oriented snapshots that are difficult for non-expert audiences to interpret and are poorly aligned with the interactive, immersive visualization needs of Digital Twin and Augmented Reality (AR)/Virtual Reality (VR) environments, thereby limiting their effectiveness for anticipatory risk communication and decision support.

We present a physics-aware generative framework that transforms predictive landslide modeling into photorealistic satellite imagery of future events, enabling intuitive “what-if” hazard exploration within Digital Twin architectures.

Our approach integrates Landslide Physics-Aware Neural Networks (LPANNs) with conditional Generative Adversarial Networks (GANs) to generate synthetic post-event satellite images. The GANs generate images conditioned on multi-attribute probability maps (physics-informed predictions) obtained by embedding geotechnical, hydrological, geomorphological, and geometric constraints, ensuring physical plausibility. Our conditional GAN is trained on pre- and post-event real images with annotated landslide areas. Different supervised and self-supervised deep learning models are used for large-scale landslide detection.
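
A minimal sketch of the conditioning mechanism, concatenating physics-informed probability maps with the pre-event image channels (a toy stand-in for the authors' conditional GAN generator; the discriminator and training loop are omitted):

```python
import torch
import torch.nn as nn

class CondGenerator(nn.Module):
    # Toy pix2pix-style generator: condition maps enter via channel concat.
    def __init__(self, cond_channels, out_channels=3):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(cond_channels + out_channels, 64, 3, padding=1), nn.ReLU(),
            nn.Conv2d(64, out_channels, 3, padding=1), nn.Tanh(),
        )

    def forward(self, pre_image, prob_maps):
        # Pre-event image plus multi-attribute susceptibility maps in,
        # synthetic post-event image out.
        return self.net(torch.cat([pre_image, prob_maps], dim=1))

g = CondGenerator(cond_channels=4)
post = g(torch.randn(1, 3, 128, 128), torch.rand(1, 4, 128, 128))
```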

By conditioning the generative part of the approach on physics-informed predictions, the proposed Digital Twin component mitigates the hallucinations typical of generative AI and synthetic imagery, yielding trustworthy hazard visualizations. The resulting synthetic imagery is scenario-consistent and bridges the gap between numerical susceptibility outputs and human-centered decision support, enhancing interpretability for policymakers, emergency managers, and non-expert stakeholders.

How to cite: Ghorbanzadeh, O. and Crivellari, A.: From Physics-Aware AI to Digital Twins: Generating Photorealistic Satellite Imagery of Future Landslides for Predictive Hazard Scenarios, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-10764, https://doi.org/10.5194/egusphere-egu26-10764, 2026.

EGU26-10926 | Orals | ITS1.11/ESSI1.10

WBGeo: An Automated Framework for Geosystem Modeling with Advanced Mesh Generation Capabilities 

Mauro Cacace, Marzieh Baes, Jan von Harten, Alexander Lüpges, Denise Degen, Jan Niederau, Tobias Rolf, Magdalena Scheck-Wenderoth, Florian Wellmann, Bernhard Rumpe, Nora Koltzer, and Simon Virgo

WBGeo (WorkBench for Digital Geosystems) aims at automating the workflow from geological data integration to structural modeling, mesh generation, numerical simulation, and visualization. The framework is designed as a collaborative project, enabling the systematic and reproducible development of geoscientific models while reducing manual intervention across the entire modeling pipeline.

One of the core components of WBGeo is the generation of computational meshes tailored to complex geoscientific workflows. The framework supports three mesh representations: implicit structured meshes, explicit structured meshes, and explicit unstructured meshes. This flexible design allows users to select an appropriate meshing strategy based on model complexity, data availability, and computational requirements.

Implicit structured meshes are generated from volumetric structural models in which lithological information is defined on a regular grid. The meshing procedure operates directly on the implicit representation of the structural geological model and produces a structured hexahedral mesh suitable for numerical simulations based on finite element or finite volume/difference methods.

For explicit structured meshes, vertices are extracted directly from the geological surfaces provided by the structural model. Each geological layer is first discretized using a uniform, user-defined number of interpolated points to ensure consistent lateral resolution across all layers. Subsequently, vertical refinement between adjacent layers is performed using a user-defined number of subdivisions, allowing controlled resolution along the depth direction. To preserve mesh quality and avoid numerical instabilities, the minimum vertical distance between corresponding points in adjacent layers is evaluated against a user-defined threshold. If this distance falls below the specified limit, one of the points is adjusted vertically by a predefined amount to enforce the minimum separation. Following this correction step, hexahedral elements are constructed, resulting in a structured mesh suitable for efficient numerical simulations.
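
A minimal NumPy sketch of the vertical-separation rule described above; array names, the threshold, and the shift are illustrative:

```python
import numpy as np

def enforce_min_separation(z_top, z_bottom, min_dz=1.0, shift=1.0):
    """z_top, z_bottom: elevations of corresponding points on adjacent
    layer surfaces (z_top lies above z_bottom). Where the gap is below
    min_dz, push the lower point down by a fixed amount."""
    too_close = (z_top - z_bottom) < min_dz
    z_bottom = z_bottom.copy()
    z_bottom[too_close] -= shift      # adjust one point to restore separation
    return z_bottom

z_top = np.array([100.0, 95.0, 90.0])
z_bottom = np.array([99.5, 80.0, 89.8])   # first and last violate a 1 m threshold
z_bottom_fixed = enforce_min_separation(z_top, z_bottom)
```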

For explicit unstructured meshes, vertices are obtained directly from the structural model geometry. The surfaces are then interpolated and discretized, and the resulting geometry is passed to the Gmsh Python API for mesh generation. After determining intersections between surfaces and performing geometric fragmentation, tetrahedral elements are generated. One of the main features of unstructured meshes in the workflow is the inclusion of fault planes and engineering objects such as wells, mining shafts, point sources, or additional internal planes, which are difficult to represent within a structured mesh framework.
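
A hedged sketch of this unstructured path with the Gmsh Python API, using two boxes as stand-ins for a layer volume and an internal engineering object:

```python
import gmsh

gmsh.initialize()
gmsh.model.add("wbgeo_demo")
layer = gmsh.model.occ.addBox(0, 0, 0, 10, 10, 5)
shaft = gmsh.model.occ.addBox(4, 4, -1, 2, 2, 8)   # e.g. a mining shaft
# Fragmentation resolves intersections so shared faces mesh conformally.
gmsh.model.occ.fragment([(3, layer)], [(3, shaft)])
gmsh.model.occ.synchronize()
gmsh.model.mesh.generate(3)                        # tetrahedral elements
gmsh.write("wbgeo_demo.msh")
gmsh.finalize()
```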

By supporting both structured and unstructured meshing strategies within a unified workflow, WBGeo enables users to balance computational efficiency and geometric complexity while maintaining reproducibility and consistency across geosystem modeling applications. The generated meshes can be exported to different formats, such as Exodus, Abaqus, and FEFLOW, for use by different commercial and open-source simulation packages.

How to cite: Cacace, M., Baes, M., von Harten, J., Lüpges, A., Degen, D., Niederau, J., Rolf, T., Scheck-Wenderoth, M., Wellmann, F., Rumpe, B., Koltzer, N., and Virgo, S.: WBGeo: An Automated Framework for Geosystem Modeling with Advanced Mesh Generation Capabilities, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-10926, https://doi.org/10.5194/egusphere-egu26-10926, 2026.

Accurate and efficient modelling of geothermal reservoirs is important for sustainable energy production and for the reliable assessment of operational risks. Predicting thermo-hydraulic (TH) system evolution under varying injection and production scenarios remains computationally challenging, particularly when physical knowledge of the subsurface system is incomplete and observational data are sparse. High-fidelity finite-element simulators are typically used to provide physics-based predictions of coupled flow and heat transport governed by complex partial-differential equations (PDEs). Such full-order simulations are, however, often prohibitively expensive for real-time forecasting, which is essential, for instance, in the context of digital twins.

Physics-based machine-learning (PBML) approaches, such as the non-intrusive reduced basis (NIRB) method, address this challenge by constructing physics-consistent surrogate models that project full-order simulation outputs onto a low-dimensional subspace learned from representative snapshots. By retaining only the dominant basis functions, the NIRB surrogate enables orders-of-magnitude speedups in parametric predictions while staying consistent with the physical transport mechanisms and structural assumptions on fracture networks encoded in the full-order model. Despite these advantages, classical NIRB surrogates are intrinsically limited to the physical regimes represented by the governing PDEs, and consequently by the training simulations. If the surrogate does not fully capture the observed system behaviour, it is important to detect and adapt to missing or misrepresented local physics revealed by observational data, such as unmodeled convective heat transport or flow channelling arising from fracture activation.

To address this need, we propose a complementary residual-learning framework that augments a baseline NIRB surrogate with parameter-to-state maps of residual temperature and pressure fields learned by Kolmogorov-Arnold Networks (KANs). The residual, defined as the difference between observed data (or a synthetic reference solution) and the NIRB model prediction, is interpreted as a proxy for missing or misrepresented physics not explicitly captured by the baseline model. KANs represent mappings as sums of learned univariate functions and provide explicit access to the functional structure of parameter dependence. Thereby, KANs could act as interpretable discrepancy models by learning the residual between observations and NIRB predictions. By analysing the dominant functional families emerging in the learned residual, such as linear dependence characteristic of conduction-dominated regimes or exponential dependence associated with convection, KANs can provide diagnostic insight into missing thermo-hydraulic processes and their relevance across parameter regimes.
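
A minimal sketch of the residual-learning loop; a gradient-boosting regressor stands in for the KAN purely to keep the example short, and both "models" are toy functions, not the authors' surrogates:

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(7)
theta = rng.uniform(0, 1, size=(300, 2))      # e.g. conductivity, flux parameters

def nirb_prediction(t):
    # Stand-in conduction-only surrogate: linear in the parameters.
    return 2.0 * t[:, 0] + 0.5 * t[:, 1]

def observation(t):
    # Stand-in "truth" with a convective, exponential dependence the
    # baseline misses.
    return nirb_prediction(t) + 0.3 * np.exp(2.0 * t[:, 1])

residual = observation(theta) - nirb_prediction(theta)   # missing-physics proxy
disc = GradientBoostingRegressor().fit(theta, residual)  # discrepancy model

corrected = nirb_prediction(theta) + disc.predict(theta) # augmented surrogate
```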

We validate the proposed approach synthetically by comparing a conduction-only NIRB surrogate against synthetic reference observations generated with an advection–diffusion model. We expect that KAN-based residual learning both improves predictive accuracy and reveals clear functional signatures of missing convective physics, even when only pointwise information is available. As an outlook, we aim to apply this workflow to real geothermal case studies, where sparse temperature and pressure measurements are available at well locations. In such settings, functional-family learning of residuals offers a promising pathway to improve surrogate predictions and to enhance the physical interpretability of geothermal systems, ultimately supporting more reliable assessments of reservoir behaviour.

How to cite: Faber, M., Cacace, M., and Degen, D.: Detecting Unrepresented Physics in Hybrid Machine Learning Surrogates of Geothermal Systems using Kolmogorov-Arnold Networks , EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-11263, https://doi.org/10.5194/egusphere-egu26-11263, 2026.

Deep aquifers offer significant potential for diverse energy and storage applications, but these opportunities will also require synergistic multi-user subsurface management. To maximize these resources, operators require flexible modeling tools capable of rapidly evaluating how independent but concurrent projects might interact hydraulically over time. Traditional grid-based numerical models are robust but can be computationally demanding when rapid scenario testing is required across large, heterogeneous regions. We propose a modular Physics-Informed Neural Network (PINN) framework designed to provide a flexible, faster alternative for evaluating regional pressure interference between co-located subsurface activities.

Our proposed architecture treats the aquifer as a continuous volumetric field. We define injection and extraction points as dynamic operational conditions (e.g., transient rate or pressure constraints) that can be positioned anywhere in the domain. The neural network is trained to satisfy the 3D transient diffusivity equation, learning to map the relationship between these sources and the resulting pressure field without relying on fixed meshes. To separate overlapping influences, we introduce a "modular" architecture: by training separate sub-networks for each activity type, we aim to mathematically isolate, or "de-mix", the pressure contribution of specific projects from the total regional signal.
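
A minimal sketch of the PDE-residual idea for a transient diffusivity equation p_t = η∇²p, with automatic differentiation supplying the derivatives (network size, η, and the sampling of collocation points are illustrative):

```python
import torch
import torch.nn as nn

net = nn.Sequential(nn.Linear(4, 64), nn.Tanh(),
                    nn.Linear(64, 64), nn.Tanh(),
                    nn.Linear(64, 1))
eta = 1.0e-2   # hydraulic diffusivity (stand-in value)

def pde_residual(xyzt):
    xyzt = xyzt.requires_grad_(True)
    p = net(xyzt)
    grads = torch.autograd.grad(p.sum(), xyzt, create_graph=True)[0]
    p_t = grads[:, 3:4]
    lap = 0.0
    for i in range(3):                 # second derivatives in x, y, z
        g2 = torch.autograd.grad(grads[:, i].sum(), xyzt, create_graph=True)[0]
        lap = lap + g2[:, i:i+1]
    return p_t - eta * lap             # should vanish at the training points

pts = torch.rand(256, 4)               # collocation points in (x, y, z, t)
loss = (pde_residual(pts) ** 2).mean()
loss.backward()
```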

This research focuses on a case study in the Campine Basin (Belgium). We are developing the framework to infer effective aquifer properties from sparse historical monitoring data and to simulate interference patterns specifically between gas storage and geothermal operations. The expected outcome is a spatial scenario analysis tool that allows future users to dynamically test new project locations and optimize setback distances within a Subsurface Digital Twin environment. By decoupling the geological parameterization from specific well locations, we aim to provide a scalable engine that supports adaptive planning and de-risks decision-making in multi-activity aquifers.

How to cite: Rodriguez, J. D., Piessens, K., and Welkenhuysen, K.: A Modular Physics-Informed Neural Network Framework for Quantifying Pressure Interference Between Concurrent Deep Subsurface Activities, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-11702, https://doi.org/10.5194/egusphere-egu26-11702, 2026.

EGU26-13008 | Orals | ITS1.11/ESSI1.10

Digital twins in climate science: challenges and opportunities  

Andrea Toreti, Arthur Hrast Essenfelder, and Valerio Lucarini

In recent years, advancements in computational infrastructures have made it possible to start exploiting the implementation and use of digital twins in climate science. A growing number of studies and prototypes has already appeared, aiming at modelling single or multiple components of the Earth system. Among them, it is worth mentioning the European Commission's Destination Earth initiative, with the ambition of realizing a digital replica of the Earth. While the development of digital twins seems straightforward and is proceeding at a fast pace, there are still key conceptual issues and challenges to overcome in order to go beyond classic numerical models and digital shadows. Realising a continuous bidirectional data flow between the virtual system and the real one is among them. Together with innovative approaches in data assimilation and the integration of physics-consistent machine learning, there is the need to conceptualize what a continuous data loop means at time scales covering the coming years and decades. Furthermore, the need to address the "human-in-the-loop" requirement remains central to allow for actionable "what-if" scenario testing. In this contribution, we discuss these open issues as well as the minimum requirements such twins should meet. We conclude by proposing pathways to fulfil the ambition of having a digital twin of the Earth system.

How to cite: Toreti, A., Hrast Essenfelder, A., and Lucarini, V.: Digital twins in climate science: challenges and opportunities , EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-13008, https://doi.org/10.5194/egusphere-egu26-13008, 2026.

Disaster risk management (DRM) faces increasing challenges due to urbanisation, environmental degradation, and the growing complexity of interacting hazards. Digital Twins (DTs), defined as digital representations of physical systems connected through continuous data exchange, have gained attention for their potential to support monitoring, simulation, and decision-making. However, their application to disaster contexts remains limited, as many DT implementations depend on uninterrupted automated data streams, predefined control mechanisms, and automated interventions that are often unavailable or impractical during disasters.

In this study, the Digital Risk Twin (DRT) is introduced as a paradigm specifically designed for DRM. The DRT extends DT concepts by integrating automated and manual data collection methods, such as IoT, remote sensing, surveys and field observations, while incorporating human-in-the-loop decision-making for flexible and effective interventions, maintaining real-time virtual simulations, and addressing disaster scenario challenges. To demonstrate its practical relevance, an example of how a DRT can be conceptualised for a multi-hazard response case study is formulated, illustrating how DRT can support effective DRM.

The DRT integrates diverse data sources such as remote sensing, in situ observations, field surveys, and community-based reporting, while supporting both automated analysis and expert-driven interpretation. A defining feature of the framework is the explicit inclusion of human decision-making within the digital representation. Rather than aiming for full automation, the DRT enables iterative interaction between digital models and stakeholders, supporting context-aware decisions under uncertainty. This is particularly important in disaster situations where data gaps, infrastructure damage, and rapidly changing conditions constrain the effectiveness of purely automated systems.

Digital Risk Twins represent a conceptual advancement over original Digital Twins by addressing the socio-technical nature of disaster risk. The proposed framework and multi-hazard conceptualisation provide a foundation for future operational implementations, with the potential to strengthen adaptive capacity and resilience to cascading and compound hazards.

How to cite: Ghaffarian, S.: Digital Risk Twins: The Next Generation of Digital Twins for Complex Disaster Scenarios, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-14599, https://doi.org/10.5194/egusphere-egu26-14599, 2026.

EGU26-15220 | ECS | Orals | ITS1.11/ESSI1.10

Digital twin–based voxel-scale clumping index (CI) improves leaf area density (LAD) retrieval from simulated terrestrial laser scanning (TLS) 

Raja Ram Aryal, Timothy Devereux, Josh Rivory, Glen Eaton, Stuart Phinn, and William Woodgate

Accurately representing three-dimensional (3D) canopy structure is essential for Earth System Models (ESMs) and radiative transfer schemes that link vegetation to climate–carbon feedbacks. Leaf area density (LAD) and related structural metrics are widely retrieved from remote sensing using Beer–Lambert (BL) transmittance inversions, yet these approaches commonly assume randomly distributed foliage and woody material. In real canopies, plant material is spatially aggregated (clumped), violating random mixing and introducing systematic LAD bias. Although clumping has been corrected using canopy- or crown-scale clumping indices (CI), voxel-based LAD retrievals from terrestrial laser scanning (TLS) and other 3D sensing approaches require clumping information that is defined at the same spatial scale as the inversion. The lack of a physically grounded voxel-resolved CI remains a key methodological gap, particularly for dense and heterogeneous canopy regions.

Here, we develop a voxel-scale effective reference clumping index (CI_ref) retrieval method that is structurally consistent with voxel-based BL retrievals. We used digital twin 3D tree meshes from the RAMI-V benchmark forest scenes, spanning six contrasting crown forms and six leaf inclination angle distribution (LIAD) variants (36 canopy geometries). Each tree was partitioned into regular voxel grids at four sizes (0.2, 0.5, 1.0, and 2.0 m). Within each voxel, we performed multi-directional ray tracing (18 viewing-angle bins) on every voxel-clipped mesh to directly quantify the within-voxel gap probability, the leaf projection function G(θ), and the path-length statistics required for transmittance-based LAD inference. Directional CI estimates were derived for each viewing angle and then aggregated through a hierarchical pooling strategy that reduces sampling noise and directional variability (all angles → azimuth-pooled → zenith-pooled). This procedure yields a single, robust CI_ref per voxel that is independent of viewing angle and suitable as a reference label for operational LAD retrieval algorithm development from LiDAR data.

We then quantified the practical impact of voxel-scale clumping correction on BL LAD retrieval using simulated TLS point clouds. LAD was estimated per voxel under two assumptions: (i) the conventional random-foliage case (CI = 1) and (ii) clumping-corrected inversion using CI_ref. Across all crown forms, LIAD variants, and voxel sizes, the CI = 1 assumption produced predominantly negative LAD errors relative to mesh-derived reference LAD, consistent with systematic underestimation when clumping is ignored. Incorporating CI_ref shifted LAD errors toward zero and improved agreement, as evidenced by reduced bias and normalized RMSE. Improvements were most pronounced for planophile canopies, where directional foliage aggregation is strongest, and for coarser voxel sizes (1.0–2.0 m), where greater within-voxel heterogeneity amplifies departures from random mixing, demonstrating that clumping-induced bias is strongly scale-dependent.
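
The inversion contrast can be summarized in a few lines, assuming the standard Beer–Lambert voxel form LAD = -ln(P_gap) / (G · CI · L); all numbers below are illustrative:

```python
import numpy as np

def lad_from_gap(p_gap, G, path_len, ci=1.0):
    """Invert voxel transmittance for leaf area density (m^2 / m^3)."""
    return -np.log(p_gap) / (G * ci * path_len)

p_gap = np.array([0.60, 0.35, 0.80])   # per-voxel directional gap probability
G = 0.5                                # leaf projection function (spherical)
L = 1.0                                # mean ray path length in the voxel (m)
ci_ref = np.array([0.7, 0.6, 0.9])     # voxel-scale reference clumping index

lad_random = lad_from_gap(p_gap, G, L)             # CI = 1: underestimates LAD
lad_corrected = lad_from_gap(p_gap, G, L, ci_ref)  # clumping-corrected (larger)
```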

These results provide practical recommendations for 3D canopy modelling: specifically, that voxel-scale clumping correction becomes increasingly essential as voxel size increases, especially when within-voxel heterogeneity grows. The proposed CI_ref framework strengthens scale consistency between local canopy structure and voxel-based radiative transfer, enabling unbiased LAD retrievals and providing physically grounded labels for future deep learning model-based CI prediction from TLS point clouds.

How to cite: Aryal, R. R., Devereux, T., Rivory, J., Eaton, G., Phinn, S., and Woodgate, W.: Digital twin–based voxel-scale clumping index (CI) improves leaf area density (LAD) retrieval from simulated terrestrial laser scanning (TLS), EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-15220, https://doi.org/10.5194/egusphere-egu26-15220, 2026.

EGU26-16511 | Posters on site | ITS1.11/ESSI1.10

The DEGREE Project: A Digital Laboratory for Geothermal Exploration in the Eifel Region 

Matthias Volk, Jacob Alexander Frasunkiewicz, Patrick Laumann, and Atefeh Rahimi

Recently, the active Eifel volcanic region has received increasing interest due to the occurrence of deep low-frequency earthquakes, often interpreted as a sign of rising volatiles in the crust. Additionally, recent tomographic models have resolved vertically inclined low-velocity anomalies beneath the Laacher See volcano, which may indicate enhanced fluid ascent. These observations raise the question of whether volcanic activity in the region is increasing and whether such activity may be beneficial for geothermal exploration.

To address these questions, the DEGREE project is developing a digital laboratory that enhances predictive capabilities by combining geophysical data with geological and numerical models. The laboratory includes workflows that couple data assimilation, geological modeling, and numerical simulations into a single process. A key challenge is the propagation of uncertainties in the input data and parameters through the entire workflow. This allows us to obtain quantitative uncertainties for derived quantities to support decision-making.

The foundation of the laboratory is a collection of diverse datasets compiled during the project. An extensive seismic dataset acquired by the Eifel Large-N network, deployed between September 2023 and September 2024, is used to investigate subsurface structure and active geodynamic processes in the Eifel region. We employ seismic tomography methods to resolve crustal thickness variations and velocity anomalies, together with moment tensor inversion to constrain fault geometries and deformation mechanisms.

Surface geological maps, digital elevation models, and geological cross-sections are used to build 3D structural geological models using the open-source software GemPy. Model construction follows a stepwise approach, starting from a simplified stratigraphic framework and gradually adding geological complexity, such as time-equivalent units and major fault structures. Although the steps are applied sequentially, the geological model is constructed from the input data and is therefore reproducible, enabling integration into subsequent workflow steps.

GemPy addresses uncertainty in geological models by generating ensembles of realizations through sampling input parameters from probability distributions. These ensembles serve as inputs for numerical simulations of physical quantities. Computing adjoint sensitivity kernels allows us to assess how each realization affects model outputs and to identify which models best match available observations, integrating structural uncertainty with process-based simulations. The numerical simulations are performed with LaMEM and its bindings for the Julia programming language. As GemPy is written in Python, the GemPy.jl package has been developed to expose its functionality in Julia.
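
A hedged sketch of the ensemble-generation pattern; build_gempy_model() is a hypothetical wrapper standing in for the actual GemPy model-construction calls:

```python
import numpy as np

rng = np.random.default_rng(3)

def build_gempy_model(surface_z, fault_dip):
    # Hypothetical wrapper around GemPy: returns one structural realization
    # for the perturbed inputs.
    return {"surface_z": surface_z, "fault_dip": fault_dip}

ensemble = []
for _ in range(100):
    surface_z = rng.normal(loc=-500.0, scale=25.0)   # uncertain horizon depth (m)
    fault_dip = rng.uniform(55.0, 70.0)              # uncertain fault dip (deg)
    ensemble.append(build_gempy_model(surface_z, fault_dip))
# Each realization then feeds a LaMEM forward simulation, whose adjoint
# sensitivity kernels score it against the observations.
```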

The resulting geological and geophysical models may serve as the basis for a Play Fairway Analysis (PFA) which identifies regions with high potential for geothermal exploration. Crucially, this type of analysis requires uncertainty estimations for the modeled physical quantities, which our workflow provides.

From an implementation perspective, the digital laboratory consists of three main parts: a repository to collect data and models and their metadata, workflows and infrastructure for automatic processing, and an interface for visualization and interaction with the results. To demonstrate feasibility, we are developing the first prototype in JupyterLab, which accommodates different computing environments and enables an interactive development process.

How to cite: Volk, M., Frasunkiewicz, J. A., Laumann, P., and Rahimi, A.: The DEGREE Project: A Digital Laboratory for Geothermal Exploration in the Eifel Region, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-16511, https://doi.org/10.5194/egusphere-egu26-16511, 2026.

EGU26-16891 | Posters on site | ITS1.11/ESSI1.10

Orison: A modular data assimilation environment for subsurface digital twins 

Théophile Lohier, Antoine Armandine les Landes, Jeremy Rohmer, and Romain Chassagne

Subsurface Digital Twins rely critically on data assimilation frameworks to continuously integrate multi-source, multi-type observations. While numerous methods have been developed to improve quantitative subsurface predictions, there is currently no clear consensus or standardised guidance on their appropriate computational deployment within digital twin workflows. Instead, research communities often adopt specific algorithms primarily because they are prevalent within their discipline, rather than because they are demonstrably optimal for the problem at hand. This lack of consensus reflects our limited understanding of how to rigorously characterise the mathematical structure of subsurface assimilation problems involving coupled multi-physics processes, multiple spatial and temporal scales, and heterogeneous data streams. As a result, current efforts frequently focus on empirical experimentation with algorithms rather than on the design of problem-adapted methodologies. This challenge extends to the formulation of the inverse problem itself, including parameterisation, parameter ranges, objective functions, and performance metrics, as well as to the selection of optimisation or inference strategies in multi-source data environments. Furthermore, comprehensive uncertainty quantification through global multi-factor sensitivity analysis is often infeasible due to the prohibitive computational cost of large-scale problems. To address these challenges, we propose Orison, a modular data assimilation environment designed to support systematic benchmarking and comparative analysis of classical model-update algorithms for subsurface digital twin workflows. Orison enables controlled experimentation across a range of thematic problems, facilitating insight into algorithm performance and robustness. We demonstrate the capabilities of Orison through representative case studies in geothermal systems and groundwater management, illustrating how such a benchmarking framework can support more transparent methodological choices and contribute to the development of reliable, pragmatic subsurface digital twins.
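
A minimal sketch of the kind of common interface that makes model-update algorithms interchangeable for benchmarking; all names are illustrative, not Orison's actual API:

```python
from typing import Protocol
import numpy as np

class Assimilator(Protocol):
    def update(self, ensemble: np.ndarray, obs: np.ndarray) -> np.ndarray: ...

def benchmark(algos: dict, ensemble: np.ndarray, obs: np.ndarray,
              truth: np.ndarray) -> dict:
    """Run each algorithm on the same problem; score posterior-mean error."""
    return {name: float(np.linalg.norm(
                a.update(ensemble.copy(), obs).mean(axis=1) - truth))
            for name, a in algos.items()}

class Nudging:
    # Trivial stand-in algorithm satisfying the interface.
    def __init__(self, gain): self.gain = gain
    def update(self, ensemble, obs):
        return ensemble + self.gain * (obs[:, None] - ensemble)

rng = np.random.default_rng(0)
ens, truth = rng.normal(size=(4, 30)), np.zeros(4)
scores = benchmark({"nudge_0.5": Nudging(0.5)}, ens, truth + 0.1, truth)
```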

How to cite: Lohier, T., Armandine les Landes, A., Rohmer, J., and Chassagne, R.: Orison: A modular data assimilation environment for subsurface digital twins, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-16891, https://doi.org/10.5194/egusphere-egu26-16891, 2026.

Informed decision-making for managing risks from climate-driven hazards, whether for emergency response, designing preventive interventions, or policymaking for the future, requires either short-term, scenario-based assessments or long-term assessments under uncertainty. Data requirements, spatial and temporal scales, the observations required, and the modelling techniques employed change drastically depending on the scope of the risk assessment. Digital twins (DT) in applications for natural hazards provide a great opportunity for significant improvements in disaster management. What makes DTs possible today is a range of technological advancements such as embedded sensors, cloud computing, edge computing, and the IoT. However, DTs also require a digital representation of the physical counterpart, mostly in the form of a computational or data-driven model, to be able to predict future states. The use of complex computational models in DTs is generally hindered by their relatively high computational budget and runtimes. A pathway to involving such models in (near) real-time decisions in DTs for geohazards is surrogate modelling. Surrogates are statistically valid representations of the computational model, into which physical laws and constraints can be embedded. Physics-compliant, physics-based or physics-informed surrogate models can facilitate DTs with i) instantaneous predictions, ii) the ability to conduct uncertainty quantification and sensitivity analysis to ensure reliability, iii) online updating of model parameters based on advanced calibration routines, and iv) increased trust due to explainability based on physical laws. We present herein surrogate modelling as an enabler for replacing computational models that predict the runout behaviour of geophysical flows. We investigate its applicability in uncertainty quantification, global sensitivity analysis, Bayesian parameter estimation, Bayesian model selection, and optimal experimental design. We demonstrate our workflow with two open-source computational models, r.avaflow 4.0 and synxflow, using synthetic and real-world case studies.
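
A hedged sketch of one element of such a workflow, global sensitivity analysis of a (toy) runout surrogate with SALib; the parameter names echo typical friction parameters of such flow models, and the surrogate here is a made-up closed form:

```python
import numpy as np
from SALib.sample import saltelli
from SALib.analyze import sobol

problem = {
    "num_vars": 2,
    "names": ["mu", "xi"],                  # Coulomb and turbulent friction
    "bounds": [[0.05, 0.30], [100.0, 1000.0]],
}

def surrogate_runout(x):                    # stand-in for a trained surrogate
    mu, xi = x
    return 1200.0 * np.exp(-8.0 * mu) * (xi / 1000.0) ** 0.4

X = saltelli.sample(problem, 512)           # Saltelli design (N a power of 2)
Y = np.apply_along_axis(surrogate_runout, 1, X)
Si = sobol.analyze(problem, Y)              # first-order and total-order indices
```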

How to cite: Yildiz, A. and Kowalski, J.: Surrogate modelling as enabling methodology for predictive Digital Twins in geohazards, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-17851, https://doi.org/10.5194/egusphere-egu26-17851, 2026.

EGU26-17948 | Posters on site | ITS1.11/ESSI1.10

A Hierarchical multi-fidelity approach for Bayesian inference for numerical process simulations  

Yulia Gruzdeva, Denise Degen, and Mauro Cacace

A key prerequisite for reliable geoscientific process simulations is the calibration of uncertain model parameters against field observations. In practice, both measurements and simulation outputs are subject to uncertainty arising from observational errors, limited knowledge of material properties, and inexact physical models. Bayesian inference provides a framework to explicitly acknowledge multiple sources of uncertainty by encoding modelling assumptions in prior distributions and updating them against observational data through the likelihood to obtain posterior estimates. However, applying Bayesian methods remains challenging in coupled multiphysical applications, including thermo-hydro-mechanical problems, as the computational cost of repeated forward evaluations grows rapidly with model complexity.

To address these limitations, we develop a hierarchical simulator for Bayesian calibration that dynamically combines fast low-fidelity surrogate models with accurate high-fidelity finite-element simulations during the sampling stage. The core of the method is a fidelity-selection policy embedded directly in the probabilistic model, which transparently accounts for both surrogate-induced bias and the computational cost associated with high-fidelity simulations. We provide and compare several scenarios that represent different optimization strategies for balancing posterior accuracy and computational efficiency. The resulting hierarchical Bayesian workflow is highly modular and can be coupled with external high-fidelity solvers through a unified forward interface, making it applicable to a wider range of geoscientific problems.
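
One possible fidelity-selection policy is a two-stage (delayed-acceptance) Metropolis step, sketched below with toy log-posteriors: the cheap surrogate screens proposals, and only survivors pay for a high-fidelity evaluation. This illustrates the cost-aware idea, not the authors' specific policy:

```python
import numpy as np

rng = np.random.default_rng(11)

def log_post_lofi(theta):   # surrogate log-posterior (cheap, slightly biased)
    return -0.5 * (theta - 0.1) ** 2 / 0.3 ** 2

def log_post_hifi(theta):   # high-fidelity log-posterior (expensive)
    return -0.5 * theta ** 2 / 0.25 ** 2

theta, chain = 0.0, []
for _ in range(2000):
    prop = theta + 0.2 * rng.standard_normal()
    # Stage 1: screen with the surrogate.
    if np.log(rng.random()) < log_post_lofi(prop) - log_post_lofi(theta):
        # Stage 2: correct with the high-fidelity model, so the chain
        # still targets the high-fidelity posterior.
        a = ((log_post_hifi(prop) - log_post_hifi(theta))
             - (log_post_lofi(prop) - log_post_lofi(theta)))
        if np.log(rng.random()) < a:
            theta = prop
    chain.append(theta)
```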

How to cite: Gruzdeva, Y., Degen, D., and Cacace, M.: A Hierarchical multi-fidelity approach for Bayesian inference for numerical process simulations , EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-17948, https://doi.org/10.5194/egusphere-egu26-17948, 2026.

EGU26-19643 | Orals | ITS1.11/ESSI1.10

Digital twins for subsurface systems based on algebraic models 

Vasily Demyanov and Oleksandr Letychevskyi

One of the key challenges addressed by digital twins (DT) is the long-term modelling and monitoring of subsurface system behaviour. Existing DT technologies primarily rely on physics-based models capable of simulating dynamic processes. Long-term forecasting often suffers from uncertainty in data, modelling equations and their parameters, initial conditions and accumulating errors.

DTs for natural systems remain a largely unexplored opportunity, still at an early stage. Challenges with DT design for natural systems are largely related to their complex and uncertain multi-physics nature.

We propose an algebraic approach to DT design, in which system parameters/attributes are represented as constraints rather than as specific values. This approach enables the generation of subsurface scenarios and the analysis of the possible occurrence of critical system states/events.

We model the system as a collection of interacting entities (agents), whose states are defined by sets of attributes. For instance, a geological layer is considered an agent characterised by its geometry, represented by a 3D mesh (X0), elasticity (E), porosity (φ), thermal conductivity (T), and other relevant attributes. The initial state S0 of the agent can be expressed as a set of constraints:

S0: E1 ≤ E ≤ E2 ∧ F1 ≤ φ ≤ F2 ∧ T1 ≤ T ≤ T2 ∧ X0.

The geometry X0 can also be represented as a set of constraints that take into account structural/mesh uncertainty. Thus, constraints can be specified for the set of all agents/layers interacting with each other.

We define the semantics of the agent's actions using formalized transitions that change the constraints on the attributes/agent's state. An example of such a transition is the change in the layer state according to a function constructed from a combination of the equilibrium equations F, the constitutive equation Q, which relates the stress σ and the strain ε, and the kinematic equation of the strain D:

S1 = G(S0, F(X0, σ), Q(φ, E, T), D(X0, ε)).

The next state S1 is determined by the change of the agent state under this transition and is again a conjunction of constraints. The resulting new state is checked for compatibility with the critical state Z(σ, σmax), defined by the threshold constraint (e.g., fracture):

σ ≤ σmax.

If the conjunction S1 ∧ Z(σ, σmax) is satisfiable, then there exist layer attribute values for which it holds. Such attributes are represented by the corresponding constraints generated by the solver. Given these constraints, we can obtain scenarios by backward modelling, leading back to the initial state.
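
A minimal sketch of such a satisfiability check with the Z3 solver (one possible choice; the abstract does not name a specific solver). The bounds and the transition function are simplified placeholders:

```python
from z3 import Real, Solver, And, sat

E, phi, T, sigma = Real("E"), Real("phi"), Real("T"), Real("sigma")
sigma_max = 50.0

s = Solver()
s.add(And(10 <= E, E <= 30))            # elasticity bounds
s.add(And(0.05 <= phi, phi <= 0.25))    # porosity bounds
s.add(And(1.5 <= T, T <= 3.5))          # thermal conductivity bounds
s.add(sigma == 2 * E - 5 * phi + T)     # toy stand-in for the transition G
s.add(sigma <= sigma_max)               # threshold constraint Z(sigma, sigma_max)

if s.check() == sat:
    print(s.model())   # attribute values for which S1 ∧ Z holds
```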

Formalized transitions can also be built by considering other parallel processes that affect the change in the agent state, in particular thermal, chemical, and fluid-flow processes.

This approach increases the capability for long-term forecasting because it operates on constraints/conditions over subsurface states/events rather than on parameter-specific simulations.

A DT can combine algebraic modelling with neural networks that classify the predictions of a certain event. Algebraic modelling of the agent's behaviour from the classified state can then confirm the correctness of the classification and build the corresponding explanatory scenario.

How to cite: Demyanov, V. and Letychevskyi, O.: Digital twins for subsurface systems based on algebraic models, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-19643, https://doi.org/10.5194/egusphere-egu26-19643, 2026.

Jordan’s dryland watersheds face acute water stress alongside increasing land degradation. Intense, short-duration storms cause flash runoff that accelerates soil erosion and sediment delivery to downstream infrastructure while groundwater, which is Jordan’s primary strategic water source, remains under long-term pressure. Rainwater-harvesting (RWH) interventions, including Vallerani micro catchments on damaged hillslopes, and Marab/flood-spreading and check-dam systems along ephemeral waterways, are increasingly used in restoration efforts. However, basin-scale planning is often limited by uncertainties in hydrological trade-offs and a gap between model outputs and stakeholder-ready, spatially explicit decision support.

This study develops a basin-scale hydrological Digital Twin (DT) for the Mujib Basin in central Jordan by transforming process-based simulation findings into an interactive, scenario-driven dashboard. The DT combines a hydrological modelling core (SWAT) with harmonized in-situ and Earth Observation (EO) datasets to represent both water and land-surface responses. Physiographic inputs such as topography, soils, and land use are combined with meteorological forcing derived from ERA5 reanalysis and complemented by EO time series, including Sentinel-2 vegetation indices, evapotranspiration products, and soil moisture, to support the ecohydrological context.

Four intervention scenarios are represented - baseline, Vallerani, Marab, and combined - and evaluated using indicators relevant to water security, including surface runoff, sediment yield, and groundwater recharge, alongside vegetation/ET-related metrics. Outputs are produced at the sub-basin level and visualized through a web-based 3D dashboard, allowing users to visualize and compare different scenarios. The DT also enables "what-if" scenario testing by combining suitability-driven intervention placement with adjustable weather perturbations, allowing users to explore combined management and climate futures.

Beyond single-variable maps, the DT adds a decision layer for intervention targeting through a composite suitability framework matched with actual restoration goals: (1) Marab/check-dam suitability, which emphasizes high runoff generation, terrain controls, and proximity to channel networks; and (2) infiltration-focused suitability, which highlights zones where slowing and spreading flow can increase recharge. This study shows how digital twins can support hydrological decision-making in data-scarce dryland settings by bridging modelling outputs and implementation-oriented planning, using the Mujib Basin as a case study.
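
A minimal sketch of such a composite suitability layer as a weighted overlay of normalized criteria; the criteria follow goal (1) above, but the weights and data are illustrative:

```python
import numpy as np

def normalize(a):
    return (a - a.min()) / (a.max() - a.min() + 1e-9)

rng = np.random.default_rng(5)
runoff = rng.random((100, 100))          # runoff generation per cell
slope = rng.random((100, 100))           # terrain control (gentler = better)
dist_channel = rng.random((100, 100))    # distance to channel network

marab_suitability = (0.5 * normalize(runoff)
                     + 0.3 * (1 - normalize(slope))
                     + 0.2 * (1 - normalize(dist_channel)))
```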

How to cite: Procheta, N., Koeva, M. N., and Aguilar, R. R.: Developing a Digital Twin Framework for Watershed Restoration Scenario Analysis: A Case Study in Mujib Basin, Jordan, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-20401, https://doi.org/10.5194/egusphere-egu26-20401, 2026.

EGU26-20704 | Orals | ITS1.11/ESSI1.10

Multi-Source Tsunami Hazard Assessment for Digital Twin Workflows 

Erlend Storrøsten, Brian Carlton, Valentina Magni, Naveen Ragu Ramalingam, Steven J. Gibbons, and Finn Løvholt

Recent advancements in the Digital Twin Component for Tsunamis, developed within the EU-funded DT-GEO project, are transforming rapid hazard assessment from static pre-computed databases to dynamic, data-informed workflows. In this presentation, a novel workflow for Probabilistic Tsunami Forecasting (PTF) due to earthquake-triggered landslides is presented through a site demonstrator for the Mediterranean Sea motivated by the 1908 Messina Strait earthquake and tsunami. A key innovation is the integration of earthquake-triggered submarine landslides and the application of AI-driven inundation emulators for rapid prediction, linked to earthquake workflows and related shakemaps. In addition, we showcase a possible use of the workflow in a new geophysical setting, a submarine slope off Southwest India. These synergies between digital twin architectures and machine learning provide a robust framework for anticipatory action and disaster risk management at both regional and global scales.

This work was partially funded by the EU DT-GEO project (A Digital Twin for GEOphysical extremes, https://dtgeo.eu/) through the European Union’s Horizon Europe research and innovation programme under grant agreement nº 101058129 and PCTWIN project, jointly funded by the Natural Environment Research Council (NERC), UKRI and the Ministry of Earth Sciences (MoES), Government of India (Grant: NE/Z503496/1). 

How to cite: Storrøsten, E., Carlton, B., Magni, V., Ragu Ramalingam, N., Gibbons, S. J., and Løvholt, F.: Multi-Source Tsunami Hazard Assessment for Digital Twin Workflows, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-20704, https://doi.org/10.5194/egusphere-egu26-20704, 2026.

EGU26-21582 | ECS | Orals | ITS1.11/ESSI1.10

From climate simulations directly to actionable insights: The Climate Change Digital Twin 

Theresa Kiszler, Jenni Kontkanen, Brynjar Sigurdsson, Bruno de Paula Kinoshita, Pierre-Antoine Bretonniere, Devaraju Narayanappa, Mario Acosta, Suraj Polade, Outi Sievi-Korte, Thomas Jung, Daniel Klocke, Francisco Doblas-Reyes, Nikolay Koldunov, Aina Gaya-Àvila, Jost von Hardenberg, Paolo Davini, Barbara Frueh, Stephan Thober, Sebastian Milinski, and Francesc Roura Adserias and the Climate DT team

The Climate Change Adaptation Digital Twin (Climate DT), developed as part of the Destination Earth Initiative, produces global multi-decadal kilometer-scale simulations (5–10 km) in a new operational framework. A significant achievement of Climate DT is the capability to automatically process the hourly model output with impact applications that provide insights for users, for instance on flood risks, renewable energy generation, and wildfire risks. Climate DT data can therefore provide direct insights into potential adaptation requirements. Additionally, the Climate DT runs with multiple climate models (IFS-FESOM, IFS-NEMO and ICON), which led to the implementation of a standardized data portfolio on HEALPix meshes, further simplifying analysis for data users.

In this presentation, we will introduce the operational Climate DT framework as well as the workflow that enables us to perform the climate simulations with automatic post-processing by multiple applications, including scientific evaluation. We will also introduce the standardized data portfolio and the simulations performed so far as part of Climate DT.

How to cite: Kiszler, T., Kontkanen, J., Sigurdsson, B., de Paula Kinoshita, B., Bretonniere, P.-A., Narayanappa, D., Acosta, M., Polade, S., Sievi-Korte, O., Jung, T., Klocke, D., Doblas-Reyes, F., Koldunov, N., Gaya-Àvila, A., von Hardenberg, J., Davini, P., Frueh, B., Thober, S., Milinski, S., and Roura Adserias, F. and the Climate DT team: From climate simulations directly to actionable insights: The Climate Change Digital Twin, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-21582, https://doi.org/10.5194/egusphere-egu26-21582, 2026.

Objectives:

In this study, we aim to develop a data-driven petrophysical inversion technique in the context of CO2 sequestration. By integrating reservoir flow simulation, petroelastic modeling, and Graph Neural Networks (GNNs), we estimate CO2 saturation in models with multiple grid resolutions. The goal is to enhance accuracy and resolution adaptability in predicting CO2 plume behavior in subsurface geological formations, thus improving carbon capture and storage (CCS) strategies.

Methodology:

We generated 100 two-dimensional synthetic reservoir models using a sequential indicator simulation algorithm for facies simulation, each populated with heterogeneous porosity and permeability fields. Flow simulations were conducted for 11 years using a central well with a constant injection rate. Petroelastic modeling was then performed to compute changes in P-wave and S-wave velocities and density every six months. The models were resampled to mimic a varying-resolution scenario, with higher resolution near the well. A GNN model handles the multi-resolution inputs and outputs, representing each grid cell as a node linked to its eight nearest neighbors, with direction and distance as edge attributes.
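As an illustration of this graph construction, the minimal sketch below links each grid-cell centre to its eight nearest neighbours and attaches a unit direction vector and a distance to every edge. This is our own reconstruction of the idea, not the authors' code; the helper name and the toy fine/coarse grid layout are assumptions.

```python
import numpy as np
from scipy.spatial import cKDTree

def build_knn_graph(cell_centers, k=8):
    """Link every grid cell (node) to its k nearest neighbours.

    cell_centers: (N, 2) array of cell-centre coordinates, possibly drawn
    from grids of different resolutions. Returns edge_index (2, N*k) and
    edge_attr (N*k, 3) holding a unit direction vector plus the distance.
    """
    tree = cKDTree(cell_centers)
    dist, idx = tree.query(cell_centers, k=k + 1)  # k+1: first hit is the cell itself
    src = np.repeat(np.arange(len(cell_centers)), k)
    dst = idx[:, 1:].ravel()
    d = dist[:, 1:].ravel()
    direction = (cell_centers[dst] - cell_centers[src]) / np.maximum(d[:, None], 1e-12)
    return np.stack([src, dst]), np.concatenate([direction, d[:, None]], axis=1)

# toy multi-resolution layout: fine cells near the injection well, coarse cells farther out
fine = np.stack(np.meshgrid(np.arange(10.0), np.arange(10.0)), -1).reshape(-1, 2)
coarse = np.stack(np.meshgrid(np.arange(10.0, 30.0, 4.0), np.arange(0.0, 30.0, 4.0)), -1).reshape(-1, 2)
edge_index, edge_attr = build_knn_graph(np.vstack([fine, coarse]))
```

The resulting edge_index and edge_attr arrays are in the form typically consumed by message-passing GNN libraries.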

Results, Observations, and Conclusions:

The integrated modeling approach successfully predicted CO2 plume migration within geological formations, demonstrating high predictive accuracy and robustness. Petroelastic modeling revealed significant changes in reservoir properties such as P-wave and S-wave velocities and density due to CO2 injection. The Graph Neural Network (GNN) model, optimized through hyperparameter tuning, effectively utilized these changes to predict CO2 saturation with a Mean Squared Error (MSE) of 0.0217 and a Coefficient of Determination (R²) of 0.981, confirming its high reliability in practical scenarios. In comparison, the Multilayer Perceptron (MLP) model, which processes data without considering spatial connections, achieved an MSE of 0.0260 and an R² of 0.9695, underscoring the GNN’s superior computational efficiency and spatial data integration. Furthermore, visual assessments confirmed the model’s accuracy, closely aligning predicted and actual CO2 saturation levels, especially in dynamically changing reservoir zones. The study concludes that combining static property modeling, flow simulation, petroelastic modeling, and GNNs provides a valuable tool for enhancing CO2 sequestration strategies, improving the prediction accuracy of CO2 behavior in the subsurface, and significantly advancing CCS technologies.

Novel/Additive Information:

Our work leverages Graph Neural Networks (GNNs) to predict changes in CO2 saturation from elastic properties, integrating flow dynamics with petroelastic modeling and deep learning via adaptive mesh grids. This novel approach addresses the limitations of conventional neural networks in adapting to mesh variations. Our project uniquely targets the complex challenges of CO2 monitoring, advancing sequestration monitoring technologies by bridging seismic monitoring and dynamic flow simulation.

 

How to cite: Alfayez, H.: Physics-Informed Graph Neural Networks for Multi-Resolution CO₂ Saturation Estimation in Subsurface, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-22140, https://doi.org/10.5194/egusphere-egu26-22140, 2026.

EGU26-23011 | Orals | ITS1.11/ESSI1.10 | Highlight

An Operational Earthquake Digital Twin Based on Empirical Ground-Motion Models and Period Estimation: Integration of SEISAID and B-Wave within the VIGIRISKS Platform 

Caterina Negulescu, Pierre Gehl, Samuel Auclair, Didier Bertil, Yoann Legendre, Romain Guidez, Hajatiana Ramambazafy, Franck Chan Thaw, Cecile Gracianne, Roser Hoste Colomer, Agathe Roulle, and Gilles Grandjean

Digital Twins (DTs) are increasingly used as integrative frameworks to combine data streams, numerical models and automated workflows for monitoring complex systems and supporting decision-making. In the field of seismic risk management, operational DTs must rely on fast, robust and reproducible modelling approaches, capable of assimilating real-time observations despite strong epistemic uncertainty. This contribution presents an operational earthquake DT implemented on the VIGIRISKS platform, and illustrated through two complementary rapid-response tools: SEISAid, dedicated to territorial-scale impact assessment, and B-Wave, focused on near real-time structural damage monitoring.

Rather than relying on detailed physics-based representations of subsurface processes, the proposed DT is built upon empirical ground-motion models and vulnerability models, which can be considered as meta-models linking observed seismic signals to expected ground motion and damage. Real-time seismic data from regional and national monitoring networks are continuously ingested through Pulsar approach. Seismic intensity fields are generated using the USGS ShakeMap framework, which embeds data weighting and uncertainty propagation to combine ground-motion prediction equations, instrumental recordings, macroseimic observations, and site-effect information. These ShakeMap products are then encapsulated within the VIGIRISKS infrastructure, where they trigger automated impact assessment workflows.

At the territorial scale, SEISAid exploits ShakeMap outputs and empirically calibrated vulnerability models to estimate building damage and potential human losses within 15–30 minutes after earthquake detection. Calculations are performed using reproducible scientific codes hosted on VIGIRISKS, and results are automatically aggregated and disseminated to decision-makers through standardized notification reports. This workflow supports rapid situational awareness and early operational decision-making under uncertainty.

At the structural scale, B-Wave extends the DT by integrating recorded dynamic responses from instrumented buildings. Damage assessment relies on data-driven signal processing methods, such as continuous wavelet transform–based frequency identification, to detect changes in structural dynamic properties. These changes are empirically related to damage states aligned with European EMS-98 classes, enabling near real-time alerts on the condition of critical structures without requiring detailed mechanical models.

A key characteristic of the framework is its event-driven and iterative cycle: each new earthquake updates data, models and outputs, progressively enriching the DT. By embedding empirical modelling, uncertainty handling and updating (via ShakeMap), and automated decision support within a unified infrastructure, this work illustrates how DT concepts can be operationally implemented for natural risk applications, contributing methodological insights relevant to subsurface-related DT workflows focused on data integration and decision support. Although this contribution focuses on the event-driven DT cycle triggered by real earthquakes, the proposed framework also enables “what-if scenario” based impact assessments, illustrating the flexibility of the DT for both operational response and prospective risk analysis. 

How to cite: Negulescu, C., Gehl, P., Auclair, S., Bertil, D., Legendre, Y., Guidez, R., Ramambazafy, H., Chan Thaw, F., Gracianne, C., Hoste Colomer, R., Roulle, A., and Grandjean, G.: An Operational Earthquake Digital Twin Based on Empirical Ground-Motion Models and Period Estimation: Integration of SEISAID and B-Wave within the VIGIRISKS Platform, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-23011, https://doi.org/10.5194/egusphere-egu26-23011, 2026.

Semantic Change Detection (SCD) plays a crucial role in understanding land surface dynamics, from urban expansion and deforestation to disaster impact assessment. However, despite the success of deep learning models in SCD tasks, their real-world deployment faces two critical challenges: significant domain shifts between geographically distinct regions and the prohibitive cost of data annotation for new locations. Models trained on public benchmarks, predominantly from developed countries, experience substantial performance degradation when applied to regions with different characteristics, such as Indian cities, due to inter-domain variance in sensors, atmospheric conditions, and landscapes. Additionally, substantial intra-domain variance within target regions compounds this problem, necessitating robust solutions that operate with limited labels.

To address these challenges, we propose SSLCD-Adapt, a novel hierarchical framework for label-efficient, cross-domain SCD that tackles inter-domain variance, intra-domain variance, and label constraints through a three-stage process. First, we employ Change-Enhanced Self-Supervised Pre-Training, where change representations are learned directly from unlabeled bi-temporal image pairs from the source domain using the FSC-180K benchmark dataset. By applying a Barlow Twins objective to features from distorted views, the model learns invariant characteristics of change without manual annotation, providing a superior initialization compared to ImageNet pre-trained models, whose natural images differ significantly from remote sensing imagery.

Second, Domain Alignment bridges the data distribution gap between the source (FSC-180K) and the target (six Indian cities) domains. The source encoder remains frozen, while only three layers of the target encoder are trained within an adversarial setup. We employ a Domain-Adversarial Neural Network (DANN) that incorporates a Gradient Reversal Layer (GRL) with a Maximum Mean Discrepancy (MMD) loss to align feature distributions without requiring target labels. The domain classifier estimates the H-divergence between the source and target domains, while the GRL reverses the gradients, forcing the target encoder to generate features similar to those of the source encoder, thereby achieving alignment in feature space and minimizing inter-domain variance.
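For readers unfamiliar with the GRL mechanism, the PyTorch sketch below shows the core trick: an identity mapping in the forward pass whose gradients are negated on the way back, so the feature encoder is trained to fool the domain classifier. The module names, layer sizes, and lambda value are illustrative assumptions, not details from the abstract.

```python
import torch
from torch import nn

class GradientReversal(torch.autograd.Function):
    """Identity in the forward pass; multiplies gradients by -lambda backward."""
    @staticmethod
    def forward(ctx, x, lam):
        ctx.lam = lam
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return -ctx.lam * grad_output, None

class DomainClassifier(nn.Module):
    """Predicts source vs. target from features passed through the GRL."""
    def __init__(self, dim, lam=1.0):
        super().__init__()
        self.lam = lam
        self.head = nn.Sequential(nn.Linear(dim, 64), nn.ReLU(), nn.Linear(64, 1))

    def forward(self, features):
        reversed_feats = GradientReversal.apply(features, self.lam)
        return self.head(reversed_feats)

# The classifier learns to separate domains, while the reversed gradients
# push the (upstream) target encoder toward domain-invariant features.
clf = DomainClassifier(dim=256)
logits = clf(torch.randn(8, 256, requires_grad=True))
loss = nn.functional.binary_cross_entropy_with_logits(logits, torch.ones(8, 1))
loss.backward()
```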

Third, the trained target encoder undergoes Progressive Domain-specific fine-tuning using limited target labeled data. The encoder trains for one-third of the epochs on target data, then for two-thirds of the epochs using city-specific batches with domain-specific batch normalization for each city, effectively minimizing intra-domain variance between the six Indian cities. Figure 1 demonstrates the complete SSLCD-Adapt architecture.

Figure 1: Proposed SSLCD-Adapt Architecture

How to cite: Srivastava, N. and Jain, K.: SSLCD-ADAPT: A hierarchical framework for label-efficient cross-domain semantic change detection in complex environments, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-548, https://doi.org/10.5194/egusphere-egu26-548, 2026.

EGU26-2530 | Posters on site | ESSI1.11

Efficient Earth Observation Representation Learning Using Metadata-Aware Mixture-of-Experts Masked Autoencoder 

Mohanad Albughdadi, Marica Antonacci, Vasileios Baousis, Federico Fornari, Tolga Kaprol, and Claudio Pisa

Large-scale foundation models trained on multi-sensor satellite imagery have been driving recent advances in Earth Observation (EO) tasks. Although such models achieve impressive transferability across diverse downstream tasks, their computational and memory demands hinder accessibility, reproducibility, and deployment in resource-constrained environments. This work explores a compact and efficient alternative, introducing a metadata-aware Mixture-of-Experts Masked Autoencoder (MoE-MAE) for EO representation learning (Albughdadi, 2025).

The proposed MoE-MAE is a self-supervised transformer-based architecture with only 2.5 million parameters. It combines sparse expert routing and geo-temporal conditioning. The sparse routing allows token specialization while keeping active computation low. The geo-temporal conditioning injects information about latitude, longitude, and cyclic temporal attributes directly into the model. This design enables the model to exploit spatial and temporal regularities inherent in EO data without requiring dense and computationally costly transformers.
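To make the sparse-routing idea concrete, the sketch below implements a generic top-k mixture-of-experts layer in PyTorch: each token is processed only by its k highest-scoring experts, so most expert parameters stay inactive per token. The dimensions, expert count, and the omission of the abstract's load-balancing loss are our simplifications, not the MoE-MAE implementation.

```python
import torch
from torch import nn

class SparseMoE(nn.Module):
    """Route each token to its top-k experts; only those experts run."""
    def __init__(self, dim, n_experts=8, k=2):
        super().__init__()
        self.router = nn.Linear(dim, n_experts)
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(dim, 4 * dim), nn.GELU(), nn.Linear(4 * dim, dim))
            for _ in range(n_experts)
        )
        self.k = k

    def forward(self, tokens):                      # tokens: (n_tokens, dim)
        gate = self.router(tokens)                   # (n_tokens, n_experts)
        weights, chosen = gate.topk(self.k, dim=-1)  # top-k experts per token
        weights = weights.softmax(dim=-1)            # normalize over selected experts
        out = torch.zeros_like(tokens)
        for e, expert in enumerate(self.experts):
            token_idx, slot = chosen.eq(e).nonzero(as_tuple=True)  # tokens routed to e
            if token_idx.numel():
                out[token_idx] += weights[token_idx, slot, None] * expert(tokens[token_idx])
        return out

moe = SparseMoE(dim=64)
y = moe(torch.randn(16, 64))  # each of the 16 tokens activates only 2 of 8 experts
```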

The model is pretrained on the BigEarthNet-Landsat (BEN-LS) dataset (Corley et al., 2025) using a masked reconstruction loss function augmented with auxiliary unmasked and load-balancing losses to encourage stable expert utilization. The learned encoder representations are then evaluated using linear probing on two benchmark datasets: (1) BEN-LS, a multi-label land-cover dataset with explicit metadata, and (2) EuroSAT-Landsat (EuroSAT-LS) (Corley et al., 2025), a single-label classification dataset without metadata. Despite the encoder’s small size (~2.3 M parameters), the proposed MoE-MAE achieves results competitive with models orders of magnitude larger. On BEN-LS, the frozen encoder achieves a micro mean average precision of 0.767, comparable to SSL4EO-L ViT-S/16 MoCo v2 (0.775) (Stewart et al., 2023). On EuroSAT-LS, the model maintains strong transferability, achieving 84.2% accuracy even in the absence of geo-temporal metadata.

Expert specialization across spatial patterns is revealed through dedicated ablation and visualization studies, which show that some experts respond primarily to vegetation, others to water or textured regions. This demonstrates interpretable behaviour and complementary feature learning. Additionally, only about half of the model’s expert feed-forward capacity is activated per token, confirming computational sparsity in practice. These findings suggest that such models can retain strong representational power while substantially reducing training and inference costs.

This work presents a first step toward small-scale architectures for EO representation learning by integrating metadata and leveraging sparse computation to approach the performance of massive transformers. Future work will extend this framework to multi-sensor and multi-temporal datasets to capture dynamic Earth processes efficiently.

Albughdadi, M. (2025). Lightweight Metadata-Aware Mixture-of-Experts Masked Autoencoder for Earth Observation. arXiv:2509.10919.

Stewart, A. J., Lehmann, N., Corley, I. A., Wang, Y., Chang, Y.-C., Braham, N. A. A., Sehgal, S., Robinson, C., & Banerjee, A. (2023). SSL4EO-L: Datasets and Foundation Models for Landsat Imagery. arXiv:2312.05241.

Corley, I., Sharma, L., and Crasto, R. (2025). Landsat-Bench: Datasets and Benchmarks for Landsat Foundation Models. arXiv:2506.08780.

How to cite: Albughdadi, M., Antonacci, M., Baousis, V., Fornari, F., Kaprol, T., and Pisa, C.: Efficient Earth Observation Representation Learning Using Metadata-Aware Mixture-of-Experts Masked Autoencoder, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-2530, https://doi.org/10.5194/egusphere-egu26-2530, 2026.

Recent progress in Earth Observation (EO) foundation models has raised expectations that large-scale pretraining will yield general-purpose representations comparable to those in natural language processing and computer vision. In this work, we show that this promise has not yet been realized. We introduce spectral stability as a principled criterion for foundation models, measuring the extent to which the principal singular subspaces of pretrained weights are preserved during fine-tuning. Through this lens, we conduct a comparative analysis of several EO foundation models, including AnySat and Presto, alongside established models from vision and language, namely DINOv2 and BERT. Our analysis reveals a stark contrast between domains. BERT and DINOv2 exhibit strong spectral stability, with fine-tuning primarily inducing rotations within a small low-rank subspace. In contrast, EO models display severe spectral instability, where fine-tuning substantially rewrites their dominant singular directions. We show that this instability explains two key limitations of current EO foundation models. First, pretraining does not consistently accelerate downstream learning. Second, low-rank adaptation methods such as LoRA can fail or collapse, as the pretrained subspaces are only partially useful. Using extensive experiments on the TimeMatch benchmark for cross-regional crop classification, we demonstrate that despite strong performance claims, pretrained EO models yield inconsistent or marginal improvements over random initialization and do not achieve state-of-the-art performance. These findings indicate that current EO models lack the representational universality characteristic of true foundation models. We conclude that spectral stability is a critical property for robust transfer learning in Earth Observation, and we argue that future EO foundation models should prioritize spectral coherence through improved pretraining objectives and architectural designs that better capture the underlying structure of geospatial data.
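The spectral-stability criterion can be operationalized, for example, as the overlap between the top-k left singular subspaces of a weight matrix before and after fine-tuning. The NumPy sketch below computes the mean squared cosine of the principal angles between the two subspaces; this is one plausible formulation under our own assumptions and may differ from the authors' exact metric.

```python
import numpy as np

def spectral_stability(W_pre, W_ft, k=16):
    """Overlap between the top-k left singular subspaces of two weight matrices.

    Returns a value in [0, 1]: 1 means fine-tuning preserved the pretrained
    principal directions (stable); values near 0 mean they were rewritten.
    """
    U_pre, _, _ = np.linalg.svd(W_pre, full_matrices=False)
    U_ft, _, _ = np.linalg.svd(W_ft, full_matrices=False)
    # singular values of U_pre_k^T U_ft_k are the cosines of the principal angles
    cos_angles = np.linalg.svd(U_pre[:, :k].T @ U_ft[:, :k], compute_uv=False)
    return float(np.mean(cos_angles ** 2))

rng = np.random.default_rng(0)
W = rng.standard_normal((256, 256))
print(spectral_stability(W, W))                                       # ~1.0: perfectly stable
print(spectral_stability(W, W + 0.5 * rng.standard_normal(W.shape)))  # lower: subspace drift
```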

How to cite: Turkoglu, M. O. and Aasen, H.: Are Earth Observation Foundation Models Really Foundation Models? Investigation Based on Spectral Analysis, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-2602, https://doi.org/10.5194/egusphere-egu26-2602, 2026.

EGU26-7472 | ECS | Posters on site | ESSI1.11

Habitat and Land Cover Change Detection in Alpine Protected Areas: A Comparison of AI Architectures 

Harald Kristen, Daniel Kulmer, and Manuela Hirschmugl

Effective protected area management requires frequent habitat monitoring to respond to rapid climate change and disturbances, yet traditional manual mapping methods cannot provide the temporal resolution needed for evidence-based policy decisions. We present a practical implementation of AI-driven change detection developed in collaboration with Gesaeuse National Park administration, Austria, to support operational habitat monitoring and management planning.

We address critical challenges in deploying AI technologies for complex environmental contexts: fuzzy class boundaries in natural habitats, highly imbalanced classes, and limited training data typical of protected areas. Using 15 years of high-resolution multimodal data (RGB, NIR, LiDAR, terrain attributes) covering 4,480 documented habitat changes across 15.3 km², we compare emerging geospatial foundation models (Clay v1.0, Prithvi-EO-2.0) against established U-Net architectures to identify the most robust approach for real-world application.

Results demonstrate that foundation models show superior cross-temporal robustness (Clay: 33% accuracy vs U-Net: 23% on unseen temporal data), a critical factor for operational monitoring systems. Integrating LiDAR improves detection accuracy from 30% to 50%. While overall accuracies are lower than in homogeneous agricultural landscapes, they reflect realistic performance for complex alpine environments and provide actionable information for park management.

To further enhance practical applicability for environmental agencies, we integrate object-based post-processing and physical constraints to filter misclassifications, making outputs directly usable for management decisions. This case study demonstrates practical strategies for implementing AI technologies in complex environmental monitoring contexts where traditional approaches face significant challenges. Building upon this work from the Habitalp 2.0 project, the BioDivAI project will extend these habitat mapping approaches to predict biodiversity impacts under various land use and land cover change scenarios, providing decision-makers with tools to assess trade-offs between economic activities and ecosystem protection.

How to cite: Kristen, H., Kulmer, D., and Hirschmugl, M.: Habitat and Land Cover Change Detection in Alpine Protected Areas: A Comparison of AI Architectures, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-7472, https://doi.org/10.5194/egusphere-egu26-7472, 2026.

EGU26-7697 | ECS | Orals | ESSI1.11

Gap-Aware Transformer-Based Foundation Model Pretraining for Spatiotemporal Earth Observation Data 

Charly Zimmer, Josefine Umlauft, Guido Kraemer, David Montero, and Miguel D Mahecha

Earth observation datasets, especially those derived from remote sensing, are often characterized by significant data gaps. However, the pretraining of Geospatial Foundation Models requires mostly complete samples, leading to very selective sampling strategies that leave out large parts of the original observations. The problem is exacerbated in spatiotemporal data, where these restrictions apply to the entire time series. Systems like Prithvi-EO-2.0 allow very small gap regions that can be addressed with interpolation during preprocessing, but a solution for integrating significant gap areas (>20% of a sample) into the pretraining process has yet to be established. We introduce an architecture that builds upon the random masking strategies of popular MAE-style architectures by additionally force-masking patches that contain gaps. Doing so requires a BERT masking scheme in which masked patches are encoded instead of being removed from the sequence. Custom loss functions are introduced to account for the gaps in both the targets and the masked patches. While the resulting encoder-only architecture does not benefit from the reduced computational complexity of MAE-style masking, we mitigate this effect by using factorized space-time attention in the Video Vision Transformer (ViViT) backbone, thus creating a simple and lightweight model that is easily scalable. We demonstrate the potential of the architecture by performing spatiotemporal representation learning in a multivariate setup involving global Land Surface Temperature (LST) observations. The model is embedded in a framework that provides customizable sampling strategies for large-scale Earth observation datasets, including control over parameters such as the maximum gap ratio per sample, the sampling strides, and the involved variables in shared-grid datasets like Earth System Data Cubes (ESDC). This flexibility in sampling enables the generation of training datasets with millions of samples, thus exposing the full volume of information stored in Earth observation data to Geospatial Foundation Models.
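A central ingredient described above is a reconstruction loss that distinguishes model-side masking from genuine data gaps. The PyTorch sketch below shows one way such a loss could look: masked patches are supervised only where an observation actually exists. The tensor shapes, the pixel-level gap mask, and the function name are our assumptions, not the authors' implementation.

```python
import torch

def gap_aware_loss(pred, target, masked, gap):
    """MSE over patches that were masked for the model but are NOT data gaps.

    pred, target: (batch, n_patches, patch_dim)
    masked:       (batch, n_patches) bool, True where the model saw a mask token
    gap:          (batch, n_patches, patch_dim) bool, True where the observation
                  itself is missing, so no reconstruction target exists there
    """
    valid = masked.unsqueeze(-1) & ~gap          # supervise only observable pixels
    se = (pred - target) ** 2
    return (se * valid).sum() / valid.sum().clamp(min=1)

pred = torch.randn(2, 196, 768)
target = torch.randn(2, 196, 768)
masked = torch.rand(2, 196) > 0.25               # force-masked plus randomly masked patches
gap = torch.rand(2, 196, 768) > 0.8              # simulated data gaps
loss = gap_aware_loss(pred, target, masked, gap)
```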

How to cite: Zimmer, C., Umlauft, J., Kraemer, G., Montero, D., and Mahecha, M. D.: Gap-Aware Transformer-Based Foundation Model Pretraining for Spatiotemporal Earth Observation Data, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-7697, https://doi.org/10.5194/egusphere-egu26-7697, 2026.

Accurate and scalable mapping of mariculture facilities is essential for coastal resource management, environmental monitoring, and sustainable aquaculture development. However, existing remote sensing–based segmentation approaches heavily rely on large amounts of annotated data or manual interaction, limiting their scalability and generalization. Recently, foundation models such as the Segment Anything Model (SAM) have demonstrated strong generalization ability across diverse visual domains. Nevertheless, SAM’s performance in remote sensing applications remains constrained by its reliance on manually selected prompts, which is impractical for large-scale or automated mapping tasks.

In this study, we propose an AutoPrompt-enhanced SAM framework (AutoPrompt-SAM) for the automated segmentation of mariculture facilities, specifically floating rafts and net cages, from high-resolution PlanetScope imagery. The proposed framework eliminates the need for human-provided prompts by introducing an AutoPrompt module that automatically generates high-quality point prompts for SAM, enabling prompt-free semantic segmentation in a fully automated manner.

As a foundation for this work, we construct a large-scale, high-quality mariculture facility segmentation dataset consisting of more than 1,000 manually annotated PlanetScope image patches with a spatial resolution of 3 m. Each sample is cropped to 256 × 256 pixels and includes pixel-level labels for floating rafts, net cages, and background. To the best of our knowledge, this dataset represents one of the first publicly usable high-resolution semantic segmentation benchmarks for mariculture facilities based on PlanetScope imagery.

The proposed AutoPrompt module learns to generate representative prompt points directly from image features, without requiring any human interaction during inference. These automatically generated prompts are then fed into SAM to produce segmentation masks. By leveraging SAM’s powerful pre-trained visual representations, our method effectively combines the generalization capability of foundation models with task-specific structural cues learned by the AutoPrompt module. Experimental results demonstrate that AutoPrompt-SAM achieves competitive performance compared with manually prompted SAM, while completely removing the need for human intervention.

Beyond mariculture mapping, we further investigate the transferability of the proposed framework. Without additional labeled data, AutoPrompt-SAM shows strong generalization performance when applied to other remote sensing segmentation scenarios, indicating that the learned prompt generation strategy captures transferable spatial and structural patterns. This highlights the potential of AutoPrompt-SAM as a label-efficient and domain-adaptive segmentation framework, capable of extending SAM to broader remote sensing applications.

Overall, this work makes three key contributions: (1) the construction of a large-scale, high-resolution PlanetScope mariculture facility segmentation dataset; (2) the proposal of an AutoPrompt-driven SAM framework that enables fully automated, prompt-free semantic segmentation while effectively exploiting SAM’s pre-trained knowledge; and (3) a demonstration of the framework’s strong transferability, offering a new pathway for reducing human intervention and annotation dependency in remote sensing segmentation tasks. The proposed approach provides a practical solution for adapting foundation models to large-scale Earth observation applications and paves the way toward more autonomous and scalable remote sensing analysis.

How to cite: Xu, Y. and Lu, L.: Towards Prompt-Free Segmentation of Mariculture Facilities Using an AutoPrompt-Enhanced Segment Anything Model, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-9080, https://doi.org/10.5194/egusphere-egu26-9080, 2026.

EGU26-10800 | ECS | Orals | ESSI1.11

COP-GEN: Stochastic Generative Modelling of Copernicus Data 

Miguel Espinosa, Eva Gmelich Meijling, Valerio Marsocci, Elliot J. Crowley, and Mikolaj Czerkawski

The work presented herein showcases early results of COP-GEN, a general-purpose diffusion model supporting flexible zero-shot translation between a number of popular data modalities related to the Copernicus programme: Sentinel-2 (both L1C and L2A), Sentinel-1 RTC, Copernicus DEM-30, Land Use Land Cover Maps, Cloud Masks, geospatial coordinates, and timestamps.

COP-GEN is designed as a diffusion model with a transformer backbone, which offers two concrete advantages. Firstly, the diffusion formulation respects the stochastic nature of cross-modal translation tasks, so that nearly every conditional generation query can be satisfied by a diverse range of plausible outputs rather than a single deterministic sample. Secondly, the sequence-based architecture facilitates the integration of diverse data modalities by flattening their latent representations, along with modality-specific diffusion timesteps, into a single sequence of tokens. Consequently, COP-GEN is capable of synthesising missing data from any subset of modalities in a zero-shot manner.

The model is pre-trained at global scale on MajorTOM, using over one million paired, geographically distributed samples spanning diverse climate zones, land-cover types, and acquisition conditions. By training jointly on matched data modalities, COP-GEN can, for example, estimate Land Use Land Cover, cloud coverage, atmospheric correction, and the spatiotemporal context of the available observations.

The first set of results indicates strong generative capability and high output diversity across modalities. The work concludes by discussing the available open-source implementation along with potential use cases.

How to cite: Espinosa, M., Gmelich Meijling, E., Marsocci, V., Crowley, E. J., and Czerkawski, M.: COP-GEN: Stochastic Generative Modelling of Copernicus Data, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-10800, https://doi.org/10.5194/egusphere-egu26-10800, 2026.

EGU26-11247 | ECS | Posters on site | ESSI1.11

Forest Disturbance Monitoring with Geospatial Foundation Models 

Damien Robert and Jan Dirk Wegner

The growing availability of Earth Observation (EO) data enables monitoring of terrestrial ecosystems at unprecedented spatio-temporal resolutions. In practice, however, effective use of EO data remains constrained by substantial technical barriers. Working with raw, multi-modal EO imagery requires specialised domain expertise, large data transfers, access to high-performance computing infrastructure, and advanced machine learning (ML) skills. These requirements limit the accessibility of EO-based analytics for many downstream applications.

Geospatial foundation models (GFMs) provide a promising alternative by learning general-purpose representations from large volumes of unlabeled EO data. By decoupling representation learning from downstream task modelling, GFMs allow users to exploit expressive features from modern deep learning models with limited EO or deep learning expertise and modest computational resources.

In this work, we investigate embedding-based GFM workflows for forest disturbance monitoring, where timely inference and regional customisation are often more critical than maximising absolute predictive accuracy. Forest disturbances such as logging, windthrows, fires, pests, and diseases can occur abruptly and require rapid detection to support conservation, policy-making, and risk-management efforts.

Machine learning methods for forest disturbance detection from EO data are well established and have shown strong performance on regional benchmarks. However, much of this work remains confined to academic demonstrations and is rarely translated into operational monitoring systems. Existing forest monitoring tools, including those aggregated by Global Forest Watch, typically rely on region- and sensor-specific models with limited feature expressivity. These systems may benefit from the rich multi-modal and spatio-temporal representations learned by GFMs, provided such embeddings can be accessed through scalable and practical deployment pipelines.

We build on a pipeline designed to deliver on-demand, location- and time-specific geospatial embeddings as a service. Embeddings are generated server-side from raw EO data, compressed, and distributed as lightweight representations. End users interact only with these embeddings, which can be analysed using simple models such as linear probes or small decoders. This approach removes the need for the user to manipulate raw EO data, download large multi-modal datasets, or train and deploy large deep learning models, enabling rapid adaptation to local contexts with limited annotations and modest computational resources.

We present preliminary results demonstrating the feasibility of this approach for forest disturbance detection and discuss its strengths and limitations relative to bespoke, fully supervised image-based models. While GFMs may not be optimal for applications with abundant annotations and stringent accuracy requirements, embedding-based services are particularly well suited to time-sensitive and regionally adaptive monitoring scenarios. Overall, this work illustrates how releasing geospatial embeddings as a product or service can lower barriers to EO-based forest monitoring and support faster, more inclusive environmental decision-making.

How to cite: Robert, D. and Wegner, J. D.: Forest Disturbance Monitoring with Geospatial Foundation Models, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-11247, https://doi.org/10.5194/egusphere-egu26-11247, 2026.

EGU26-11394 | ECS | Posters on site | ESSI1.11

HiD-FM: A High-Resolution Remote Sensing Foundation Model with Knowledge Distillation and Feature Fusion for Image Semantic Segmentation 

Guosen Xu, Huanfeng Shen, Xinghua Li, Mingjie Xu, Dekun Lin, and Tao Jiang

The emergence of foundation models marks a transformative era in Earth observation, delivering powerful and adaptable tools to tackle the complexities of processing massive satellite imagery. Currently, land cover mapping faces two primary obstacles: 1) the prohibitive cost of data annotation and its high reliance on high-quality labels; and 2) the significant spectral and spatial variability of identical ground objects caused by differences in temporal phases, locations, and sensors. Visual Foundation Models (VFMs), with their potent generalization capabilities, offer a means to effectively bridge the domain gap. Inspired by this, we propose HiD-FM, a high-resolution remote sensing foundation model leveraging knowledge distillation and feature fusion. Specifically, HiD-FM undergoes self-supervised pre-training on a dataset of one million high-resolution unlabeled images. By synergizing knowledge distillation with feature fusion, it integrates the generalization power of pre-trained VFMs into a semi-supervised learning framework, thereby boosting performance on unlabeled data and enhancing fine-grained feature representation. Extensive experiments on semantic segmentation tasks demonstrate that HiD-FM consistently outperforms several remote sensing foundation models (RSFMs), such as RVSA, SMLFR, and CMID, particularly in data-scarce scenarios. On the LoveDA and GID-15 datasets, our method surpasses both specialized models and existing foundation models across various labeling ratios. Notably, using only 30% of the training data, HiD-FM achieved an overall accuracy (OA) of 83.19% on the GID-15 dataset. Furthermore, transfer learning experiments on GF-2 imagery across diverse spatiotemporal contexts yielded superior visualization results. HiD-FM enables rapid and cost-effective adaptation to target domains, thereby significantly advancing the field of remote sensing interpretation.

How to cite: Xu, G., Shen, H., Li, X., Xu, M., Lin, D., and Jiang, T.: HiD-FM: A High-Resolution Remote Sensing Foundation Model with Knowledge Distillation and Feature Fusion for Image Semantic Segmentation, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-11394, https://doi.org/10.5194/egusphere-egu26-11394, 2026.

EGU26-11672 | ECS | Posters on site | ESSI1.11

Sentinel-2 Image Retrieval with Global, Cross-modal Embeddings 

Yijie Zheng, Weijie Wu, Bingyue Wu, Guoqing Li, Mikolaj Czerkawski, and Konstantin Klemmer

Recent advancements in Earth embeddings have opened up new frontiers for geosciences, enabling efficient analysis of vast volumes of geospatial data. However, the practical utilization of these embeddings is often hindered by complex software environments and the requirement for specialized computational expertise. To help democratize access to Earth embeddings, we introduce EarthEmbeddingExplorer, an open-source, web-based application designed to enhance the accessibility, understanding and interactivity of Earth embeddings for the broader geoscience community. 

EarthEmbeddingExplorer integrates multiple state-of-the-art foundation models, including SatCLIP, FarSLIP, and SigLIP, to support cross-modal retrieval of Sentinel-2 imagery via text, image, and geographic location queries. Our implementation leverages the MajorTOM Core-S2L2A dataset as the primary data source; we pre-computed approximately 250,000 embeddings per model based on a uniform spatial sampling of the MajorTOM grid. This approach ensures a representative global coverage of 1.2% of the Earth's land surface. To ensure accessibility, all models and datasets are hosted on open-source frameworks, specifically ModelScope and Hugging Face. The application provides an intuitive interface for visualizing the geographical distribution of the retrieved results, rendering top-match thumbnails, and exporting comprehensive metadata. Such transparent and low-cost access to large-scale embedding analysis is essential for identifying model-specific advantages and limitations. By enabling instant cross-model comparisons within specific spatiotemporal contexts, EarthEmbeddingExplorer allows users to evaluate model performance for their unique monitoring needs and domains of interest.

Ongoing development focuses on expanding EarthEmbeddingExplorer’s capabilities by integrating additional embedding models such as DINOv2, and increasing global spatial coverage. We are further implementing FAISS-based vector similarity search to enable near-instantaneous queries across tens of millions of global embeddings. Future iterations will prioritize modular software architecture, standardized APIs, and detailed documentation to facilitate community-driven contributions of new embedding models and datasets. The web applications are accessible at https://huggingface.co/spaces/ML4Sustain/EarthExplorer and at https://www.modelscope.cn/studios/VoyagerX/EarthExplorer.

How to cite: Zheng, Y., Wu, W., Wu, B., Li, G., Czerkawski, M., and Klemmer, K.: Sentinel-2 Image Retrieval with Global, Cross-modal Embeddings, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-11672, https://doi.org/10.5194/egusphere-egu26-11672, 2026.

EGU26-12736 | ECS | Orals | ESSI1.11

MMEarth-Bench: Global Environmental Tasks for Multimodal Geospatial Models 

Lucia Gordon, Serge Belongie, Christian Igel, and Nico Lang

Recent research in geospatial machine learning has demonstrated that models pretrained with self-supervised learning on Earth observation data can perform well on downstream tasks with limited training data. However, most of the existing geospatial benchmark datasets have few data modalities and poor global representation, limiting the ability to evaluate multimodal pretrained models at global scales. To fill this gap, we introduce MMEarth-Bench, a collection of five new multimodal downstream tasks with 12 input modalities, globally distributed data, and both in- and out-of-distribution test splits. We benchmark a diverse set of pretrained models on MMEarth-Bench and find that multimodal models generally perform best. While pretraining tends to improve model robustness in limited data settings, geographic generalization abilities remain poor and using multimodal inputs at test time can sometimes lead to geographic overfitting. In order to facilitate model adaptation to new downstream tasks and geographic domains, we propose a model-agnostic method for test-time training with multimodal reconstruction (TTT-MMR) that uses all the modalities available at test time, regardless of whether the pretrained model accepts them as input. We show that TTT-MMR improves model performance on both random and geographic test splits, and that geographic batching (TTT-MMR-Geo) leads to a good trade-off between regularization and specialization during TTT. Our dataset, code, and visualization tool are linked from the project page at https://lgordon99.github.io/mmearth-bench.
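As a schematic of the test-time-training idea, the sketch below adapts a copy of a pretrained encoder on one unlabeled test batch by minimizing a multimodal reconstruction loss for a few gradient steps before the task head is applied. The encoder, reconstruction head, step count, and learning rate are hypothetical placeholders; the actual TTT-MMR procedure, including its geographic batching variant, is richer than this.

```python
import copy
import torch
import torch.nn.functional as F

def test_time_adapt(encoder, recon_head, inputs, all_modalities, steps=10, lr=1e-4):
    """Adapt copies of a pretrained encoder and reconstruction head to a single
    unlabeled test batch by minimizing multimodal reconstruction error.

    `encoder` and `recon_head` are hypothetical nn.Modules: the encoder embeds
    the input modalities, while the head reconstructs every modality available
    at test time (`all_modalities`) from that embedding. No labels are used.
    """
    enc = copy.deepcopy(encoder)          # keep the original weights untouched
    head = copy.deepcopy(recon_head)
    opt = torch.optim.Adam(list(enc.parameters()) + list(head.parameters()), lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        loss = F.mse_loss(head(enc(inputs)), all_modalities)
        loss.backward()
        opt.step()
    return enc  # afterwards, run the frozen task head on enc(inputs)
```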

How to cite: Gordon, L., Belongie, S., Igel, C., and Lang, N.: MMEarth-Bench: Global Environmental Tasks for Multimodal Geospatial Models, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-12736, https://doi.org/10.5194/egusphere-egu26-12736, 2026.

EGU26-12871 | ECS | Posters on site | ESSI1.11

FarSLIP: A Vision-Language Foundation Model for Fine-Grained Remote Sensing Understanding 

Zhenshi Li, Xueliang Zhang, Pengfeng Xiao, and XiaoXiang Zhu

Vision-language foundation models (VLFMs), such as CLIP, have demonstrated remarkable generalizability across diverse downstream tasks, including both cross-modal and vision-centric tasks. Leveraging large-scale textual supervision, VLFMs capture a broad spectrum of visual concepts and achieve breakthrough performance in zero-shot image understanding. However, current remote sensing (RS)-specific VLFMs, while performing well on image-level tasks, exhibit limited capability in fine-grained tasks such as open-vocabulary semantic segmentation (OVSS). This limitation stems from their adherence to the CLIP training paradigm, which aligns image and text features only at the global level, thereby degrading performance in tasks requiring high-quality visual representations at local level. Moreover, existing VLFMs that incorporate fine-grained alignment mechanisms still exhibit limited performance on remote sensing tasks, whether through direct transfer to RS scenarios or fine-tuning on RS image-caption datasets. This further underscores the need for developing RS-tailored fine-grained VLFMs.

To address this, we construct the first multi-granularity RS image-text dataset, MGRS-200k (Figure 1). MGRS-200k contains approximately 200k RS images, each annotated with both short and long global captions, as well as multiple object-level bounding boxes with corresponding categories, totaling over one million instances. We further investigate existing fine-grained VLFM training methods and find that their explicit region-text alignment strategies often disrupt semantic coherence, as their underlying assumptions do not hold in RS scenarios, and thus degrade fine-grained understanding.

Building on these findings, we propose FarSLIP, a Fine-grained Aligned RS Language-Image Pretraining framework (Figure 2). FarSLIP first employs patch-to-patch self-distillation to align local and global visual cues, enhancing feature discriminability while preserving semantic coherence. It then applies CLS token-based region-category alignment using the MGRS-200k dataset to further improve spatial awareness. FarSLIP achieves state-of-the-art performance in zero-shot RS image understanding, excelling not only on image-level tasks such as scene classification and image-text retrieval, but more importantly on fine-grained tasks like OVSS. Additionally, it serves as a strong foundation for multimodal large language models (MLLMs) in RS image comprehension.

Figure 1. Examples of our proposed MGRS-200k dataset.

Figure 2. Overall architecture of FarSLIP. The model is trained in a two-stage manner. In Stage I, FarSLIP is optimized with image-caption alignment and patch-to-patch self-distillation. In Stage II, image-caption alignment and region-category alignment are jointly employed on the MGRS-200k dataset.

How to cite: Li, Z., Zhang, X., Xiao, P., and Zhu, X.: FarSLIP: A Vision-Language Foundation Model for Fine-Grained Remote Sensing Understanding, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-12871, https://doi.org/10.5194/egusphere-egu26-12871, 2026.

EGU26-13425 | Orals | ESSI1.11

Time, Space, and Nightlights: Global Evaluation of Major TOM Earth Embeddings 

Mikolaj Czerkawski, Marcin Kluczek, and Jędrzej S. Bojanowski

The current landscape of geospatial AI models is expanding rapidly, with new open-source models being released nearly every month. Consequently, as the number of candidate general-purpose models grows, many of them claiming state-of-the-art performance, it becomes difficult to judge their suitability for specific tasks or spatiotemporal contexts. A key step towards the democratisation of model benchmarking can be made by releasing large-scale datasets of pre-computed embeddings, such as those shared within the Major TOM project.

Yet, even with easy access to global and dense embeddings of a given model, it is not clear how to evaluate them on a global scale, given the scarcity and spatiotemporal biases of high-quality labels. This work explores a set of evaluation tests that can be conducted on a global scale, moving beyond canonical use cases to understand the inherent biases of individual models.

First, a set of proxy tasks with worldwide coverage is introduced. In this benchmark prototype, several sensitivity variables are tested, including time, location (estimation of spatiotemporal context), and VIIRS nightlights data (estimating a proxy for human activity). Despite not being traditional downstream tasks, these three variables have the advantage of uniform quality across the entire dataset. This allows for standardised, fair evaluation of representations extracted from Sentinel-2 and Sentinel-1 data across a range of pre-trained encoders as part of the Major TOM Embedding suite.
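To illustrate what such a proxy evaluation could look like in practice, the sketch below fits a simple ridge-regression probe from embeddings to a nightlights-style target and reports the held-out R². The arrays are synthetic stand-ins, and the choice of a ridge probe is our assumption rather than the benchmark's specification.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split

# Hypothetical setup: one embedding per grid cell and the matching mean
# VIIRS nightlight radiance as a uniform-quality proxy label.
rng = np.random.default_rng(0)
embeddings = rng.standard_normal((10_000, 768))                       # stand-in embeddings
nightlights = embeddings[:, :8].sum(1) + rng.normal(0, 0.1, 10_000)   # synthetic target

X_tr, X_te, y_tr, y_te = train_test_split(embeddings, nightlights, random_state=0)
probe = Ridge(alpha=1.0).fit(X_tr, y_tr)
print("R^2 on held-out cells:", probe.score(X_te, y_te))  # higher = embeddings encode the proxy
```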

Secondly, a suite of techniques for comparing internal representation geometries of latent space vectors from multiple models is introduced to evaluate the similarities and differences between individual models. This approach does not require any reference labels, enabling a deeper understanding of geospatial semantic relationships encoded by different architectures.

Ultimately, this work advances the large-scale evaluation of deep learning models for Earth observation data, utilizing these model comparisons to develop a set of recommendations for future benchmarking efforts within the Earth Science community.

How to cite: Czerkawski, M., Kluczek, M., and Bojanowski, J. S.: Time, Space, and Nightlights: Global Evaluation of Major TOM Earth Embeddings, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-13425, https://doi.org/10.5194/egusphere-egu26-13425, 2026.

EGU26-13546 | ECS | Posters on site | ESSI1.11

Addressing Geographical Domain Shift in Tree Species Mapping via Foundation Models using Satellite Image Time Series 

Sarah Brood, Iris Dumeur, Jérémy Anger, Aurélien de Truchis, Ewan Sean, Ibrahim Fayad, Alexandre d'Aspremont, and Philippe Ciais


In a context of rapid environmental change, delivering robust tree species mapping is essential. It enables better quantification of forest biomass, facilitates climate change adaptation through better forest management, and supports biodiversity preservation. However, the few existing ground-truth datasets suffer from geographic sparsity, semantic inconsistencies, and class imbalance, making current methods overfit to context and unsuitable for accurate large-scale tree species mapping. Therefore, it is imperative to design methods that learn spatially invariant representations for tree species mapping.

The surge of Earth Observation missions has unlocked vast amounts of Satellite Image Time Series (SITS), which capture phenology and spectral dynamics that are an asset for tree species classification. Leveraging these data, an increasing number of Foundation Models (FMs) pre-trained using Self-Supervised Learning (SSL) have been introduced. Yet, due to the prevalence of patch-level annotations in tree species datasets, FMs are primarily evaluated on classification tasks instead of segmentation, preventing the production of pixel-level maps. Furthermore, spatial generalization remains largely unexplored, partially explained by the geographic sparsity of the labels. As a result, current models often overfit to local context: they perform well on training areas but fail to generalize to new spatial domains. This work therefore focuses on rigorous spatial generalization evaluation and on the development of methods to produce large-scale pixel-level tree species maps that overcome current spatial domain shifts.

To quantify this generalization gap, we propose a spatial zero-shot domain adaptation evaluation protocol, where frozen FMs are linearly probed through a segmentation task on one geographical region and tested on geographically distinct, unseen regions. We harmonized three European datasets (TreeSatAI, PureForest, and a regional dataset covering Poland) into six classes to benchmark state-of-the-art FMs pre-trained on SITS (AnySat, ALISE, Presto), and we introduce a new architecture addressing current limitations.
We propose an SSL framework based on the TimeSformer backbone. It captures complex spatio-temporal dynamics using divided space-time attention. The model is pre-trained as a Masked Auto-Encoder on a European-scale unlabeled Sentinel-2 dataset to learn robust phenological features. To mitigate the observed spatial generalization gap, we investigate different strategies such as auxiliary conditioning and thermal temporal positional encoding.

Our evaluation protocol reveals a significant accuracy drop of state-of-the-art models when applied to unseen regions. This decline suggests that current FMs capture geographically-dependent features rather than intrinsic tree species characteristics, resulting in a spatial generalization gap. 
Experiments confirm that the proposed architecture learns semantically rich features, evidenced by its high capacity to reconstruct missing time steps of satellite time series.  

By quantifying the spatial domain shift, proposing a resilient SSL architecture, and applying domain adaptation strategies, this work addresses the important challenge of generalization in label-scarce regimes. It supports high-resolution forest monitoring, a prerequisite for precise carbon accounting and forest biodiversity conservation.

How to cite: Brood, S., Dumeur, I., Anger, J., de Truchis, A., Sean, E., Fayad, I., d'Aspremont, A., and Ciais, P.: Addressing Geographical Domain Shift in Tree Species Mapping via Foundation Models using Satellite Image Time Series, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-13546, https://doi.org/10.5194/egusphere-egu26-13546, 2026.

EGU26-14376 | ECS | Posters on site | ESSI1.11

Similarity Search of Earth Observation Data Using Foundation Model Embeddings 

Bartosz Augustyn, Marcin Kluczek, Jedrzej Bojanowski, and Mikolaj Czerkawski

Foundation Models enable rich semantic representations of Earth Observation data by using embeddings generated from large, heterogeneous, and often unlabeled datasets. One of their most impactful applications is semantic similarity search, which allows EO data discovery based on context and meaning rather than metadata alone.  

This work presents global EO embedding datasets deployed within the Copernicus Data Space Ecosystem (CDSE), enabling large-scale semantic and similarity search across satellite imagery. The embeddings are generated using multimodal Foundation Models that map EO imagery and textual queries into a shared space, allowing natural language to retrieve semantically related observations. This approach supports the discovery of complex geospatial patterns such as land cover types, human activities, or environmental phenomena without explicit labeling.  

To ensure global consistency and scalability, the embedding generation and indexing are supported by the Major TOM standard, which provides a unified geospatial reference framework based on a global grid of points. Major TOM enables consistent sampling across EO missions while avoiding destructive preprocessing, thus preserving raw, undistorted pixel values.

Efficient similarity search over tens of millions of high-dimensional embeddings is achieved through FAISS vector indexing techniques, enabling immediate query results for global-scale datasets. Foundation Model embeddings, combined with standardized geospatial indexing and high-performance vector search, form a practical and scalable foundation for next-generation EO data discovery.
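A minimal FAISS workflow of the kind described might look as follows: embeddings are L2-normalized so that inner-product search corresponds to cosine similarity, added to an index, and queried for the top matches. The dimensionality, corpus size, and the choice of an exact flat index (rather than the IVF/HNSW structures needed at the scale described above) are illustrative assumptions.

```python
import numpy as np
import faiss  # pip install faiss-cpu

d = 256                                       # embedding dimensionality (assumed)
emb = np.random.rand(100_000, d).astype("float32")
faiss.normalize_L2(emb)                       # cosine similarity via inner product

index = faiss.IndexFlatIP(d)                  # exact search; IVF/HNSW indexes scale further
index.add(emb)

query = np.random.rand(1, d).astype("float32")
faiss.normalize_L2(query)
scores, ids = index.search(query, 10)         # top-10 most similar observations
```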

 

How to cite: Augustyn, B., Kluczek, M., Bojanowski, J., and Czerkawski, M.: Similarity Search of Earth Observation Data Using Foundation Model Embeddings, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-14376, https://doi.org/10.5194/egusphere-egu26-14376, 2026.

EGU26-18737 | ECS | Posters on site | ESSI1.11

TerraMind vs. THOR: A comparative analysis of ESA’s Geospatial Foundation Models 

Eva Gmelich Meijling, Valerio Marsocci, Frederick Schindlegger, Kenzo Bounegta, and Nicolas Longepe

This study presents a comparative analysis of two diverse Geospatial Foundation Models (GFMs) developed by consortia under the European Space Agency (ESA): THOR and TerraMind. THOR introduces a compute-adaptive architecture designed to handle heterogeneous sensors and variable patch sizes. This enables flexible compute-accuracy trade-offs and high performance in limited training data regimes. It is also the first GFM to extensively include Sentinel-1, -2, and -3 data. TerraMind, in contrast, is a multimodal GFM with both discriminative and generative capabilities, pretrained with a dual-scale scheme that fuses token-level context and pixel-level detail, enabling any-to-any cross-modal generation and Thinking-in-Modalities (TiM) to infer missing modalities during fine-tuning and inference. The cross-comparison, aimed at understanding the maturity of European AI4EO technologies, covers a collection of Earth Observation use cases provided by the two consortia, encompassing several tasks (segmentation, change detection, and classification) across diverse and often overlooked domains, including climate disaster analysis, methane leak detection, forest biomass monitoring, and sea ice mapping. To ensure consistent preprocessing and evaluation of the two models and use cases, we benchmarked them in two widely adopted frameworks: PANGAEA and TerraTorch. The analysis focuses on task coverage, architectural capabilities, and performance metrics, highlighting differences in adaptability, modality integration, and downstream application effectiveness. Results provide insights into the strengths and limitations of current GFMs across these scenarios, offering lessons on different GFM design approaches that extend beyond THOR and TerraMind.

How to cite: Gmelich Meijling, E., Marsocci, V., Schindlegger, F., Bounegta, K., and Longepe, N.: TerraMind vs. THOR: A comparative analysis of ESA’s Geospatial Foundation Models, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-18737, https://doi.org/10.5194/egusphere-egu26-18737, 2026.

EGU26-19446 | ECS | Posters on site | ESSI1.11

A Geometry-Aware Multi-Task Framework for Parcel Delineation with Geospatial Foundation Models  

Rim Sleimi, Joao Vinholi, Florian Werner, and Albert Abelló

Field boundary delineation (FBD) is a foundational task in Earth Observation (EO), supporting a wide range of agricultural and environmental applications. Accurate, parcel-level boundaries enable field-level reporting, water productivity monitoring, and scalable decision-support systems. However, extracting reliable field geometries from medium-resolution satellite imagery remains challenging, particularly at 10 m resolution where boundaries are thin, low-contrast, and often visually ambiguous. Adjacent parcels can appear similar; supervision data is frequently sparse or inconsistent across regions; and agricultural practices vary widely, introducing domain shifts that undermine generalizability. These factors make naïve “extent-only” approaches prone to merging neighboring fields, while “boundary-only” methods often fail to produce closed, stable instances when separators are weak or missing. Geospatial foundation models (FMs), pre-trained on large, multi-modal satellite archives, offer a promising solution by enabling transferable visual representations for EO tasks with limited supervision. Yet their application to geometry-sensitive tasks like FBD remains, to the best of our knowledge, unexplored.

This work presents a boundary-centric field delineation pipeline that demonstrates one of the first operational deployments of geospatial FMs for parcel mapping using Sentinel-2 imagery. At its core, the model leverages TerraMind, a modality-aware, self-supervised EO foundation model, as the feature encoder. This FM backbone enables the system to learn transferable, generic spatial representations from large-scale EO data. To enhance generalization across regions and seasons, the encoder is explicitly conditioned in both time and space. Temporal context is provided through a Day-of-Year (DOY) sinusoidal embedding, capturing phenological variability and seasonal appearance shifts across acquisitions. Spatial context is introduced via SatCLIP-based coordinate embeddings, which transform geographic patch-center coordinates into rich, location-aware priors using a frozen SatCLIP backbone and lightweight projection.  
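
As an illustration of the temporal conditioning described above, the following minimal sketch (our own, not the authors' code; function name and dimensions are hypothetical) maps a Day-of-Year value to smooth periodic features, so that acquisitions from nearby dates receive nearby embeddings:

    import torch

    def doy_embedding(doy: torch.Tensor, dim: int = 32) -> torch.Tensor:
        # Angular position of each acquisition within the annual cycle.
        phase = 2.0 * torch.pi * doy.float() / 366.0               # (B,)
        freqs = torch.arange(1, dim // 2 + 1, device=doy.device)   # harmonics 1..dim/2
        angles = phase[:, None] * freqs[None, :]                   # (B, dim/2)
        # Sine/cosine pairs keep the embedding continuous across year boundaries.
        return torch.cat([torch.sin(angles), torch.cos(angles)], dim=-1)

    emb = doy_embedding(torch.tensor([15, 196]))  # mid-January vs. mid-July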

Built atop the TerraMind feature hierarchy is a Fractal ResUNet-style decoder that reconstructs fine boundary details while preserving global parcel topology. Operating over a multi-scale pyramidal representation, the decoder reshapes latent token embeddings into spatial maps and progressively upsamples them through skip-connected blocks. This design effectively balances fine-grained localization and broad contextual reasoning, which is essential at 10 m resolution where boundaries are thin and adjacent parcels are visually similar. The model produces three interrelated outputs through a coupled multi-task formulation: a probability map for field extent, a boundary likelihood map capturing separator ridges, and a continuous distance-to-boundary field that encodes interiorness. These outputs are supervised jointly, encouraging geometric coherence across predictions.
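
To make the coupled multi-task formulation concrete, a minimal sketch of the three output heads and a joint loss follows; this is an assumed simplification (1×1 convolution heads, binary cross-entropy plus L1 terms, illustrative weights), not the exact design used in the pipeline:

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class MultiTaskHeads(nn.Module):
        def __init__(self, channels: int):
            super().__init__()
            self.extent = nn.Conv2d(channels, 1, 1)    # field-extent logits
            self.boundary = nn.Conv2d(channels, 1, 1)  # boundary-ridge logits
            self.distance = nn.Conv2d(channels, 1, 1)  # distance-to-boundary field

        def forward(self, feats):
            return self.extent(feats), self.boundary(feats), self.distance(feats)

    def joint_loss(ext, bnd, dst, ext_gt, bnd_gt, dst_gt, w=(1.0, 1.0, 0.5)):
        # Joint supervision couples the three predictions geometrically.
        l_ext = F.binary_cross_entropy_with_logits(ext, ext_gt)
        l_bnd = F.binary_cross_entropy_with_logits(bnd, bnd_gt)
        l_dst = F.l1_loss(torch.sigmoid(dst), dst_gt)  # distances normalised to [0, 1]
        return w[0] * l_ext + w[1] * l_bnd + w[2] * l_dst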

To quantify performance, we evaluate delineation quality on a multi-country European validation set built from RapidCrops parcel-level labels. Across countries, the model reaches boundary and extent IoU in the ~0.75–0.91 range, with higher scores in landscapes dominated by larger, well-separated parcels and lower scores in regions characterized by small fields, weak visual separators, or incomplete ground truth. This variability highlights both the scalability enabled by FM features and the remaining performance ceiling imposed by 10 m resolution and label quality.

How to cite: Sleimi, R., Vinholi, J., Werner, F., and Abelló, A.: A Geometry-Aware Multi-Task Framework for Parcel Delineation with Geospatial Foundation Models, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-19446, https://doi.org/10.5194/egusphere-egu26-19446, 2026.

EGU26-19581 | ECS | Posters on site | ESSI1.11

Foundation versus Task-Specific Models for Environmental Time-Series Prediction 

Anna Luise von Blohn, Miguel Mahecha, and Julia Peters

Machine learning models for Earth system prediction tasks differ substantially in their training strategy and the type of context they encode. Task-specific models are trained from scratch using a limited set of variables assumed to directly influence the prediction target. These models lack broad spatial and cross-variable context. In contrast, Earth system foundation models are pre-trained on large and heterogeneous data sets and are expected to capture richer environmental context that can be transferred to downstream prediction tasks.

In other machine learning domains, such as natural language processing, fine-tuning pre-trained foundation models has become standard practice due to consistent performance gains over models trained from scratch. Whether similar benefits arise for Earth system time-series prediction tasks remains unclear.

To address this gap, we compare task-specific transformer encoder models operating on pixel-level time series with fine-tuned Earth system foundation models across a set of time-series prediction tasks describing vegetation response to environmental change, including Gross Primary Productivity. This comparison isolates the effect of pre-training on predictive performance by keeping the prediction targets fixed. 

Our aim is to determine which modelling approach yields higher predictive accuracy for environmental time-series analyses.

How to cite: von Blohn, A. L., Mahecha, M., and Peters, J.: Foundation versus Task-Specific Models for Environmental Time-Series Prediction, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-19581, https://doi.org/10.5194/egusphere-egu26-19581, 2026.

EGU26-19720 | Orals | ESSI1.11

DeepFeatures: Learning Latent Representations from Spectral Indices for Ecosystem Monitoring 

Karin Mora, Julia Peters, Konstantin Ntokas, Martin Reinhardt, Gunnar Brandt, Teja Kattenborn, Guido Kraemer, David Montero, Clemens Mosig, and Miguel D. Mahecha

Monitoring and understanding Earth system dynamics and their response to climate change and human activity requires innovative approaches to analyse complex and multivariate remote sensing data. However, the current trend is towards large models that require substantial memory and computational power to train. The DeepFeatures project addresses this challenge by developing an embedding approach to create Feature Data Cubes, which capture the underlying spatio-temporal ecosystem dynamics as a low-dimensional representation in latent space. These reduced representations enable the use of simpler, resource-efficient downstream models, which are easier to train and require minimal computational resources.

Specifically, the project builds on the rationale that each spectral index (SI), calculated from spectral bands to represent certain surface properties such as vegetation greenness, reflects a specific aspect of ecosystem behaviour. Despite the development of over two hundred spectral indices, current studies often narrow their focus to individual SIs, overlooking the broader context of land surface processes represented by the SIs they leave out. The DeepFeatures project addresses this challenge by adopting a spatio-temporal multivariate approach. The SIs are derived from Sentinel-2 observations to generate an SI Data Cube. A deep learning embedding algorithm is applied to reduce the SI dimension and extract a latent space to create the Feature Data Cubes.
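
As a schematic of the SI Data Cube generation step, the fragment below derives two widely used indices from a Sentinel-2 reflectance cube; the band positions and the choice of indices are placeholders (the project computes a far larger SI collection):

    import numpy as np

    def spectral_indices(cube: np.ndarray) -> np.ndarray:
        # cube: (band, time, y, x) reflectances; band slots here are illustrative.
        green, red, nir = cube[2], cube[3], cube[7]
        ndvi = (nir - red) / (nir + red + 1e-8)      # vegetation greenness
        ndwi = (green - nir) / (green + nir + 1e-8)  # surface water / moisture
        return np.stack([ndvi, ndwi])                # SI cube: (index, time, y, x)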

To demonstrate the potential of the Feature Data Cubes, the project focuses on inference across a range of scientific applications, including modelling gross primary production, analysing tree mortality and greening trends, biodiversity monitoring for conservation, comparing phenological features using satellite and crowd-sourced data, and studying the ecological impacts of open-pit lignite mining.

DeepFeatures emphasises the deployment of transparent and reproducible workflows, from generating Sentinel-2 derived Training Data Cubes to creating Feature Data Cubes. It aims to provide an accessible, extensible, and modifiable framework for diverse applications, fostering broad community engagement and enabling open exploration of Earth system dynamics.

This presentation will showcase the methodology, scientific cases, and transformative potential of the DeepFeatures framework, highlighting its contributions to Earth observation and climate research.

The project DeepFeatures is funded by ESA’s AI4Science activity. Website: https://rsc4earth.de/project/deepfeatures/ 

How to cite: Mora, K., Peters, J., Ntokas, K., Reinhardt, M., Brandt, G., Kattenborn, T., Kraemer, G., Montero, D., Mosig, C., and Mahecha, M. D.: DeepFeatures: Learning Latent Representations from Spectral Indices for Ecosystem Monitoring, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-19720, https://doi.org/10.5194/egusphere-egu26-19720, 2026.

EGU26-21700 | ECS | Orals | ESSI1.11

Segmentation Model Benchmarking: A Strategic Prerequisite for Robust Geospatial Foundation Models 

Mehran Alizadeh Pirbasti, Gavin McArdle, and Vahid Akbari

Pretraining geospatial foundation models (FMs) is expensive, and architectural choices control inductive bias for multiscale context, cross-resolution behavior, and band/sensor variation. Therefore, benchmarking reduces the risk of scaling the “wrong” base. A model benchmarking framework for geospatial image segmentation is a critical prerequisite for developing robust and scalable geospatial FMs. In the emerging era of Earth observation FMs, success hinges on strong, well-characterized base architectures that can generalize across sensors, modalities, and geographies. The extreme heterogeneity of Earth observation vision data (different spectral bands, resolutions, and regions) makes such generalization especially challenging, underscoring the need for systematic, controlled benchmarking across diverse model families to identify viable architectures for different scenarios.
Our work rigorously evaluates a broad spectrum of segmentation architectures and backbones under consistent conditions. We benchmark classical convolutional architectures (U-Net, DeepLab, UPerNet, FPN, PAN, and LinkNet) alongside modern transformer-based models (Dense Prediction Transformer (DPT) and SegFormer). For this comparison, we use representative backbones from both CNNs (ResNet and MobileNet) and Mix Vision Transformer (MiT). By comparing these heterogeneous models on equal footing, we determine which architectural patterns and hybrid combinations yield representations most conducive to generalization. This diversity in evaluation identifies well-founded architectural bases for geospatial FMs.
To guide architecture selection and pipeline design, we deploy a comprehensive suite of metrics covering both accuracy and efficiency. We evaluate segmentation accuracy via IoU, Dice, and boundary F1-score, and also measure efficiency (convergence speed and inference latency). These holistic benchmarks reveal critical trade-offs. For instance, some lightweight CNN models excel in speed, while transformer models achieve higher boundary F1-scores. By capturing such nuances, our benchmark informs which architectures are best suited as general-purpose base models. It highlights how certain encoder–decoder combinations optimally balance performance and efficiency, and flags architectures with high transfer-readiness for new tasks and domains.
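
A benchmarking loop of this kind can be sketched with the segmentation_models_pytorch library (a plausible tooling choice on our part; the study's actual framework, data loaders, and full model list are not shown here):

    import time
    import torch
    import segmentation_models_pytorch as smp

    # A subset of candidate architecture/backbone pairs, built identically.
    candidates = {
        "unet_resnet50": smp.Unet("resnet50", classes=2),
        "fpn_mobilenet": smp.FPN("mobilenet_v2", classes=2),
        "linknet_resnet34": smp.Linknet("resnet34", classes=2),
    }

    @torch.no_grad()
    def benchmark(model, loader, device="cuda"):
        model.eval().to(device)
        scores, t0 = [], time.perf_counter()
        for img, mask in loader:
            pred = model(img.to(device)).argmax(1).cpu()
            tp, fp, fn, tn = smp.metrics.get_stats(
                pred, mask.long(), mode="multiclass", num_classes=2)
            scores.append(smp.metrics.iou_score(tp, fp, fn, tn, reduction="micro"))
        latency = (time.perf_counter() - t0) / len(loader.dataset)  # s per sample
        return torch.stack(scores).mean().item(), latency
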
The result is a reproducible, transferable model landscape that serves as a blueprint for FM development. Our benchmark framework effectively preconditions the FM pipeline, enabling researchers to enter the scaling phase with proven architecture candidates that have demonstrated cross-task and cross-sensor robustness. This “model landscape” allows subsequent large-scale pretraining to confidently build on architectures that ensure broad downstream generalization even in agentic (autonomous) deployment scenarios.
Finally, we situate this work within the broader trend toward sensor-agnostic, self-supervised FMs in Earth observation. We argue that intelligent architecture search must precede any massive self-supervised pretraining effort. Early vetting of architectures under diverse conditions ensures that large-scale training resources are invested in the most promising designs. In summary, we frame this hybrid benchmarking framework as a strategic new layer in the geospatial FM ecosystem. The insights extend beyond segmentation, providing a reference point for building fine-tunable, sensor-agnostic foundation models that can be readily adapted to various downstream tasks and even deployed onboard satellites or other edge platforms. By solidifying architecture evaluation as an essential step, this work makes a serious scientific and strategic contribution toward the next generation of Earth observation AI.

How to cite: Alizadeh Pirbasti, M., McArdle, G., and Akbari, V.: Segmentation Model Benchmarking: A Strategic Prerequisite for Robust Geospatial Foundation Models, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-21700, https://doi.org/10.5194/egusphere-egu26-21700, 2026.

EGU26-1927 | ECS | Orals | NP4.2

Differentiable Atmospheric Modelling with SpeedyWeather.jl  

Maximilian Gelbrecht, Milan Klöwer, Brian Groenke, and Niklas Boers

The current generation of hybrid machine learning and physics-informed machine learning is often limited by the lack of comprehensive differentiable models: either strongly simplified models have to be used, or machine learning (ML) cannot be integrated natively into process-based models and must be trained separately. Here, we present the ongoing development of SpeedyWeather.jl: a general circulation model that is differentiable, GPU-capable, and ready for ML simulations. SpeedyWeather.jl is a spectral atmospheric GCM with a primitive equation core on flexible grid implementations from Gaussian to HEALPix. It contains simple yet interactive representations of ocean, land, and sea ice for coupled climate simulations. With a user interface made for modularity and interactivity, it is ideally suited as a framework for hybrid atmospheric models. For example, new parameterizations can be defined without any lines of code specific to GPU execution or differentiability, yet integrate seamlessly with both. We document the process of making our model differentiable using the general-purpose automatic differentiation library Enzyme, the problems we encountered, and the solutions we found. We demonstrate the differentiability with a sensitivity analysis of our model and initial developments of data-driven parameterizations, and give an outlook on the development of differentiable Earth system models.

How to cite: Gelbrecht, M., Klöwer, M., Groenke, B., and Boers, N.: Differentiable Atmospheric Modelling with SpeedyWeather.jl, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-1927, https://doi.org/10.5194/egusphere-egu26-1927, 2026.

EGU26-2240 | Posters on site | NP4.2

Estimating Canopy Resistance Using Machine Learning and Analytical Approaches 

Cheng-I Hsieh, I-Hang Huang, and Chun-Te Lu

Canopy resistance is a key parameter in the Penman–Monteith (P–M) equation for calculating evapotranspiration (ET). In this study, we compared a machine learning algorithm, the support vector machine (SVM), and an analytical solution (Todorovic, 1999) for estimating canopy resistances. These estimated canopy resistances were then applied in the P–M equation to estimate ET; as a benchmark, a constant (fixed) canopy resistance was also adopted for the ET estimations. ET data measured with the eddy-covariance method above three sites, a grassland (southern Ireland), a Cypress forest (northern Taiwan), and a Cryptomeria forest (central Taiwan), were used to test the accuracy of the two methods. The observed canopy resistance was derived by rearranging the P–M equation. From the measurements, the average canopy resistances for the grassland, Cypress forest, and Cryptomeria forest were 163, 346, and 321 s/m, respectively. Our results show that both methods tend to reproduce canopy resistances within a certain range. In general, the SVM model performs better, while the analytical solution systematically underestimates the canopy resistances and consequently overestimates evapotranspiration. We find that the analytical solution is only suitable for low canopy resistance conditions (less than 100 s/m).
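
For reference, deriving the observed canopy resistance by inverting the P–M equation can be sketched as follows (a minimal implementation under assumed units and near-surface pressure; the study's exact processing may differ):

    import numpy as np

    def canopy_resistance(Rn, G, LE, Ta, VPD, ra, P=101.3):
        # Rn, G, LE in W/m2; Ta in degC; VPD in kPa; ra in s/m; P in kPa.
        rho_cp = 1000.0 * P / (287.0 * (Ta + 273.15)) * 1004.0   # rho_a * c_p [J m-3 K-1]
        gamma = 0.000665 * P                                     # psychrometric constant [kPa/K]
        delta = 4098.0 * 0.6108 * np.exp(17.27 * Ta / (Ta + 237.3)) / (Ta + 237.3) ** 2
        # Rearranged P-M: r_s = ra * [A / (gamma * LE) - delta/gamma - 1],
        # with A = delta * (Rn - G) + rho_cp * VPD / ra the energy/demand term.
        A = delta * (Rn - G) + rho_cp * VPD / ra
        return ra * (A / (gamma * LE) - delta / gamma - 1.0)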

How to cite: Hsieh, C.-I., Huang, I.-H., and Lu, C.-T.: Estimating Canopy Resistance Using Machine Learning and Analytical Approaches, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-2240, https://doi.org/10.5194/egusphere-egu26-2240, 2026.

EGU26-2804 | ECS | Orals | NP4.2

AI-generated ensemble river flow forecasting: Using rollout and an additional noise input to build ensemble forecasts 

Karan Ruparell, Kieran Hunt, Hannah Cloke, Christel Prudhomme, Florian Pappenberger, and Matthew Chantry

Machine learning models have been used with success to produce accurate river discharge forecasts at multiple lead times. However, almost no research has examined whether they are physically consistent across lead times. In the deterministic problem setting, where models output a single forecast with multiple lead times, these models are known to be mean-seeking, predicting the most likely river flow for each day regardless of how likely the resulting trajectory is to occur. This matters for forecasters who need to look at the multi-day properties of a forecast, such as the accumulated flow or the number of days over a threshold. When each lead time is described as an independent distribution, the model provides no insight into how to connect the uncertainties at each lead time, as an ensemble forecast would. In this paper, we show that temporal consistency in machine learning forecasts cannot be assumed, and we develop two methods for enforcing it: the Conditional-LSTM and the Seeded-LSTM. Through this, we create ensemble forecasts that successfully predict temporal properties of the 10-day hydrographs. We find that by explicitly training the model to treat the prediction of previous lead times as truth, our model better predicts temporal properties of 10-day hydrographs than other standard methods. Our approach allows users to efficiently generate as many ensemble members as desired, and we use our results to highlight the importance of developing temporally consistent ensembles.
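
The noise-seeded idea can be illustrated with the sketch below (class and variable names are ours; the actual Seeded-LSTM and Conditional-LSTM details are in the paper): a single trained network receives one random seed per member and rolls out autoregressively, feeding each prediction back in, so every member is a temporally coherent trajectory.

    import torch
    import torch.nn as nn

    class SeededLSTM(nn.Module):
        def __init__(self, n_feat, noise_dim=8, hidden=64):
            super().__init__()
            self.noise_dim = noise_dim
            self.lstm = nn.LSTM(n_feat + 1 + noise_dim, hidden, batch_first=True)
            self.head = nn.Linear(hidden, 1)

        def forecast(self, forcings, q0, n_members=50, horizon=10):
            # forcings: (B, horizon, n_feat); q0: (B, 1) initial discharge.
            members = []
            for _ in range(n_members):
                z = torch.randn(forcings.size(0), 1, self.noise_dim)  # one seed per member
                q, state, traj = q0, None, []
                for t in range(horizon):
                    x = torch.cat([forcings[:, t:t + 1], q.unsqueeze(1), z], dim=-1)
                    h, state = self.lstm(x, state)
                    q = self.head(h[:, -1])      # rollout: prediction becomes next input
                    traj.append(q)
                members.append(torch.cat(traj, dim=1))
            return torch.stack(members, dim=1)   # (B, n_members, horizon)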

How to cite: Ruparell, K., Hunt, K., Cloke, H., Prudhomme, C., Pappenberger, F., and Chantry, M.: AI-generated ensemble river flow forecasting: Using rollout and an additional noise input to build ensemble forecasts, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-2804, https://doi.org/10.5194/egusphere-egu26-2804, 2026.

EGU26-3361 | Posters on site | NP4.2

Midlatitude Cyclone Intensity Biases in Machine Learning Weather Prediction Models 

Helen Dacre, Andrew Charlton-Perez, Simon Driscoll, Suzanne Gray, Ben Harvey, Natalie Harvey, Kevin Hodges, Kieran Hunt, and Ambrogio Volonte

Forecasting the location and intensity of strong winds associated with midlatitude cyclones remains a key challenge due to their substantial societal and environmental impacts. In this study, we conditionally evaluate the ability of numerical weather prediction (NWP) models and machine learning weather prediction (MLWP) models to represent wind structures linked to these cyclones. Using a feature‑based tracking approach applied to a large sample of Northern Hemisphere cyclone events, we compare how different modelling frameworks capture cyclone evolution, including track, intensity, and near‑surface wind characteristics.

Our analysis shows that MLWP models can reproduce broad aspects of cyclone behaviour, such as large‑scale track evolution, with skill comparable to established operational NWP forecasting systems at medium-range lead times. However, we also identify systematic differences in how these models represent cyclone intensity and associated wind extremes. In particular, MLWP models tend to underestimate key high‑impact features, such as minimum pressure and peak near‑surface winds, relative to dynamical NWP forecasts.

These findings highlight both the promise and current limitations of MLWP systems for predicting midlatitude cyclone hazards. Understanding these behaviours provides guidance for future model development and for the use of ML‑based forecasts in operational and risk‑focused applications.

How to cite: Dacre, H., Charlton-Perez, A., Driscoll, S., Gray, S., Harvey, B., Harvey, N., Hodges, K., Hunt, K., and Volonte, A.: Midlatitude Cyclone Intensity Biases in Machine Learning Weather Prediction Models, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-3361, https://doi.org/10.5194/egusphere-egu26-3361, 2026.

EGU26-4279 | Orals | NP4.2

Online learning of subgrid-scale models for quasi-geostrophic turbulence in planetary interiors 

Alexandre Fournier, Hugo Frezat, and Thomas Gastine

The use of machine learning to represent small-scale processes, such as subgrid-scale (SGS) dynamics, is now well established in weather forecasting and climate modelling. Recent advances have demonstrated that SGS models trained via "online" end-to-end learning - where the dynamical solver operating on the filtered equations participates in the training - can outperform traditional physics-based approaches. However, most studies have focused on idealised periodic domains or spheres, neglecting the mechanical boundaries present in systems such as planetary interiors. To address this issue, we introduce a pseudo-spectral differentiable solver for the study of two-dimensional quasi-geostrophic turbulence in a rapidly rotating, axially symmetric bounded domain. A key advantage of the online learning approach is its implicit correction of the commutation errors arising from the irregular Chebyshev grid used in the radial direction, achieved through the estimation of correction terms for the filtered equations. In addition, since Chebyshev polynomials are not boundary-preserving, training data extracted from the high-resolution direct numerical simulation (DNS) are projected from the fine grid onto the coarse grid using a Galerkin approach that ensures compatibility with the boundary conditions.

We examine three configurations, varying the geometry (between an exponential container and a spherical shell) and the rotation rate. The flow is driven by a prescribed analytical forcing that mimics a network of pumps, allowing precise control over the energy injection scale and an exact estimate of the power input. For each case, we evaluate the accuracy of the online-trained SGS model against the reference DNS using integral quantities and spectral diagnostics. In all configurations, we show that an SGS model trained on data spanning only one turnover time remains stable and accurate over integrations at least a hundred times longer than the training period. Moreover, we demonstrate the model's remarkable ability to reproduce slow processes occurring on time scales far exceeding the training duration, such as the inward drift of jets in the spherical shell geometry, which exhibits a quasi-periodic recurrence time of O(10) turnover times. These results suggest a promising path towards developing SGS models for planetary and stellar interior dynamics, including dynamo processes. They indicate that costly DNS may need to be run only for short durations to generate training data, enabling subsequent long-term simulations with the trained model at a negligible computational cost.
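
The essence of the online (a posteriori) training objective can be written in a few lines; the sketch below is a generic autograd illustration, not the authors' pseudo-spectral Chebyshev solver:

    import torch

    def online_loss(coarse_step, sgs_net, filtered_dns, n_steps):
        # Roll the coarse solver with the learned SGS closure through n_steps
        # and match the filtered DNS trajectory; gradients flow through the solver.
        state, loss = filtered_dns[0], 0.0
        for t in range(1, n_steps):
            state = coarse_step(state) + sgs_net(state)  # resolved dynamics + SGS term
            loss = loss + torch.mean((state - filtered_dns[t]) ** 2)
        return loss / (n_steps - 1)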

How to cite: Fournier, A., Frezat, H., and Gastine, T.: Online learning of subgrid-scale models for quasi-geostrophic turbulence in planetary interiors, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-4279, https://doi.org/10.5194/egusphere-egu26-4279, 2026.

EGU26-4536 | Orals | NP4.2 | Highlight

Aardvark weather: end-to-end data-driven weather forecasting 

Richard Turner

Weather forecasting is critical for a range of human activities, including transportation, agriculture, and industry, as well as for the safety of the general public. Over the last two years, machine learning models have shown that they have the potential to transform the complex weather prediction pipeline, but current approaches still rely on numerical weather prediction (NWP) systems, limiting forecast speed and accuracy. In this talk, I will give some of the background on these developments. I will then introduce a machine learning model which can replace the entire operational NWP pipeline. Aardvark Weather, an end-to-end data-driven weather prediction system, ingests raw observations and outputs global gridded forecasts and local station forecasts. Further, it can be optimised end-to-end to maximise performance over quantities of interest. I will show that the system outperforms an operational NWP baseline for multiple variables and lead times for gridded and station forecasts. These forecasts are produced with a remarkably simple neural process model using just 8% of the input data and three orders of magnitude less compute than existing NWP and hybrid AI-NWP methods. We anticipate that Aardvark Weather will be the starting point for a new generation of end-to-end machine learning models for medium-range forecasting.

How to cite: Turner, R.: Aardvark weather: end-to-end data-driven weather forecasting, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-4536, https://doi.org/10.5194/egusphere-egu26-4536, 2026.

The inherent limitations of individual geophysical methods and the sparsity of observational data often render inversion results unstable and non-unique. Joint inversion of multiphysics data exploits the complementary sensitivities of different physical fields regarding depth, resolution, and boundary features, thereby significantly mitigating the ambiguity of single-method inversion and enhancing interpretation reliability. Traditional joint inversion approaches primarily fall into two categories: spatial structure-based and physical parameter-based constraints. The former relies on the similarity of property distribution patterns, which struggles to decouple non-homologous anomalies, while the latter is often constrained by the unreliability of empirical relationships under complex geological conditions. Recently, deep learning methods based on the U-Net architecture have achieved joint inversion by establishing constraints based solely on spatial structural similarity (Hu et al., 2025) or physical parameter correlations (Guo et al., 2021). Although promising, these methods often fail to accurately characterize non-homologous anomalies in complex geological environments.

This study proposes a dual-stream 3D U-Net architecture incorporating a hybrid attention-gating mechanism. In terms of methodology, we first construct a training dataset based on rock physics data that encompasses both statistical correlations and structural discrepancies. Regarding the network architecture, independent encoders are employed to extract 3D features from gravity and magnetic data, respectively. A cross-attention module is then utilized to capture deep structural correlations, thereby enhancing cooperative inversion in homologous regions. Subsequently, a gated fusion module is introduced as an adaptive feature selector to effectively disentangle inconsistent features in non-homologous regions. Finally, the prediction models are generated through independent decoders.
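
A minimal sketch of such a fusion block is given below (our own illustrative PyTorch fragment; the study's exact module design, dimensions, and token layout are not specified here):

    import torch
    import torch.nn as nn

    class GatedCrossFusion(nn.Module):
        def __init__(self, dim, heads=4):
            super().__init__()
            self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
            self.gate = nn.Sequential(nn.Linear(2 * dim, dim), nn.Sigmoid())

        def forward(self, f_grav, f_mag):
            # f_grav, f_mag: (B, N, dim) token features from the two 3D encoders.
            cross, _ = self.attn(f_grav, f_mag, f_mag)          # gravity attends to magnetics
            g = self.gate(torch.cat([f_grav, cross], dim=-1))   # per-feature gate in [0, 1]
            return f_grav + g * cross  # gate -> 0 decouples non-homologous anomalies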

During the joint inversion implementation phase, the network takes preliminary independent inversion results as input to predict high-fidelity models that integrate physical and geological priors. We incorporate these predicted models as reference models into the regularization term of the joint inversion objective function, constructing a deep-prior-based constraint. During iterative optimization, this constraint guides the inversion trajectory toward the fine geological structures predicted by deep learning by minimizing the discrepancy between the inverted and reference models, while ensuring the fit to observational data. This mechanism achieves an organic integration of data-driven and physics-driven approaches.
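
In a standard Tikhonov-style notation (assumed here, not quoted from the authors), such a deep-prior-constrained objective can be written as:

    \phi(\mathbf{m}) = \left\| \mathbf{W}_d \left( F(\mathbf{m}) - \mathbf{d}_{\mathrm{obs}} \right) \right\|_2^2
                     + \lambda \left\| \mathbf{W}_m \left( \mathbf{m} - \mathbf{m}_{\mathrm{ref}} \right) \right\|_2^2

where F is the forward operator, d_obs the observed gravity or magnetic data, m_ref the network-predicted reference model, W_d and W_m weighting matrices, and lambda the trade-off parameter balancing data fit against the deep prior.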

References

  • Hu, Y. Su, X. Wu, Y. Huang and J. Chen, "Successive Deep Perceptual Constraints for Multiphysics Joint Inversion," in IEEE Transactions on Geoscience and Remote Sensing, vol. 63, pp. 1-14, 2025, Art no. 5907114. 
  • Guo, H. M. Yao, M. Li, M. K. P. Ng, L. Jiang and A. Abubakar, "Joint Inversion of Audio-Magnetotelluric and Seismic Travel Time Data With Deep Learning Constraint," in IEEE Transactions on Geoscience and Remote Sensing, vol. 59, no. 9, pp. 7982-7995, Sept. 2021.

How to cite: Xi, B. and Wang, Z.: A Hybrid Attention-Gating Deep Learning Framework for 3D Joint Inversion of Gravity and Magnetic Data, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-5241, https://doi.org/10.5194/egusphere-egu26-5241, 2026.

EGU26-5888 | Orals | NP4.2

Quantum machine learning-based parametrization for boundary layer turbulence 

Lena Dogra, Janis Klamt, Veronika Eyring, and Mierk Schwabe

In light of the urgent need to accelerate measures for the adaptation to and mitigation of climate change, accurate Earth system models are more important than ever for technology assessment and the identification of the most effective climate protection strategies. Global climate models have successfully projected the consequences of different future scenarios, but the spread in projections remains large, with subgrid-scale parametrizations being the main origin of these uncertainties. Recently, machine learning-based hybrid models have successfully enhanced parametrizations: their directly data-driven structure can more effectively capture the empirical aspects of the parametrizations. Especially for the more complex parametrizations, such as microphysics or turbulence, which we study here, quantum computing could bring decisive further improvements as part of hybrid models. Atmospheric turbulence strongly affects weather and climate because it determines the rates of exchange of heat, moisture, and momentum between the Earth's surface and the atmosphere. However, due to the chaotic nature of turbulence and the wide range of turbulent regimes in the atmospheric boundary layer, from deep convection to nearly laminar stable conditions, it is notoriously hard to predict and model.
Here, we develop a prototype of a quantum machine learning-based subgrid-scale parametrization for the vertical temperature flux caused by atmospheric turbulence, based on semi-idealized Large-Eddy Simulations. We run experiments with dry convective boundary layers using the PALM model system. The setups span an 8 × 8 km² domain with a resolution of 10 m, horizontally periodic boundary conditions, and an imposed surface heat flux; our training data set combines runs with different surface heat fluxes and geostrophic winds. We train quantum and classical neural networks with different architectures, and find that quantum models based on parametrized circuits with just 2 or 3 qubits achieve accuracies similar to classical models with the same number of trainable parameters, highlighting the possibility of using quantum computing for parametrizations in the near future. In contrast, the Smagorinsky closure deviates strongly from the true flux in this setup. Our quantum and classical cell-based models both generalize well to data from PALM runs with unseen parameters close to the seen range. We further analyze the feature importance in quantum and classical models and find that most of our quantum models show better stability of the Shapley values with respect to varying the random initial conditions of the training runs. Since the number of qubits required to capture the idealized setting is low, it is promising to extend our model to more complex settings with realistic topography and varied weather conditions in the future, e.g. by using ICON boundary conditions in PALM, opening the possibility of exploiting the quantum advantages anticipated by the more stable interpretability of our prototype models.
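
A parametrized circuit of the kind described, with 3 qubits, can be sketched in PennyLane as follows (an illustrative regressor, not the study's exact ansatz or feature set):

    import pennylane as qml
    from pennylane import numpy as np

    n_qubits = 3
    dev = qml.device("default.qubit", wires=n_qubits)

    @qml.qnode(dev)
    def flux_circuit(inputs, weights):
        # Encode normalised boundary-layer predictors as rotation angles,
        # entangle, and read out one expectation value as the flux estimate.
        qml.AngleEmbedding(inputs, wires=range(n_qubits))
        qml.StronglyEntanglingLayers(weights, wires=range(n_qubits))
        return qml.expval(qml.PauliZ(0))

    shape = qml.StronglyEntanglingLayers.shape(n_layers=2, n_wires=n_qubits)
    weights = np.random.uniform(0, 2 * np.pi, size=shape, requires_grad=True)
    prediction = flux_circuit(np.array([0.1, -0.4, 0.7]), weights)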

How to cite: Dogra, L., Klamt, J., Eyring, V., and Schwabe, M.: Quantum machine learning-based parametrization for boundary layer turbulence, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-5888, https://doi.org/10.5194/egusphere-egu26-5888, 2026.

EGU26-6079 | Posters on site | NP4.2

Emulating transient climate simulations with generative AI  

Kirien Whan, Karin van der Wiel, and Nikolaj Mücke

Global climate models (GCMs), like KNMI’s EC-Earth, are an important tool to study the global climate system and to understand how the climate responds to changes in external forcing. Large ensembles of climate simulations are necessary to separate the forced response from fluctuations due to the climate system’s internal variability (Maher et al., 2021; Muntjewerf et al., 2023). GCMs are computationally very expensive to run, particularly as they move towards the km-scale, which makes generating large ensembles prohibitively costly.

The generative modelling framework allows the transformation of a base distribution to the target distribution and easily facilitates the construction of large ensembles. We compare two generative models: 1) “stochastic interpolants”, that learn a pseudo-time dependent stochastic process that directly interpolates between the current state and the conditional target state of interest, and 2) a “flow matching” model, that learns a pseudo-time dependent deterministic process, conditioned on the current state, between a Gaussian distribution and the target state of interest.  Both models use a PDE-transformer backbone (Holzschuh et al, 2025). 

We train an emulator to predict global 2 m temperature at time t+1 using the previous 5 days of temperature, the annual global mean temperature, and some static spatial and temporal features as conditioning inputs. We make predictions auto-regressively, feeding each prediction back into the model to generate sequences of arbitrary length at inference time. We use Large Ensembles from the EC-Earth3 model, for which a transient 16-member (1950-2166) ensemble and two 160-member time slices (2000-2009, 2075-2085) are available (Muntjewerf et al., 2023). The training dataset consists of up to 5 transient members and we use a single member for validation during training. We use another member for inference to produce an ensemble of global temperature simulations.
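
The autoregressive inference loop can be sketched as follows (conditioning arguments and names are illustrative; the actual models are the stochastic-interpolant and flow-matching networks described above):

    import torch

    @torch.no_grad()
    def rollout(model, history, annual_mean, statics, n_days):
        # history: list of the 5 most recent daily 2 m temperature fields.
        window, out = list(history[-5:]), []
        for day in range(n_days):
            x = torch.stack(window, dim=0)                 # (5, H, W) conditioning window
            nxt = model(x, annual_mean[day], statics)      # one generative sample for t+1
            out.append(nxt)
            window = window[1:] + [nxt]                    # feed the prediction back in
        return torch.stack(out)                            # (n_days, H, W) trajectory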

The flow matching model successfully generates a stable ensemble of temperature fields that simulates the long-term forced trend, interannual variability, and spatial patterns of (global) temperature similarly to the GCM. 

References: 

Maher, N., Milinski, S. and Ludwig, R., 2021. Large ensemble climate model simulations: introduction, overview, and future prospects for utilising multiple types of large ensemble. Earth System Dynamics, 12(2), pp.401-418. 

Muntjewerf, L., Bintanja, R., Reerink, T. and Van Der Wiel, K., 2023. The KNMI Large Ensemble Time Slice (KNMI–LENTIS), Geosci. Model Dev., 16, 4581–4597. doi: 10.5194/gmd-16-4581-2023. 

Holzschuh, B., Liu, Q., Kohl, G., & Thuerey, N. (2025). PDE-Transformer: Efficient and Versatile Transformers for Physics Simulations. arXiv preprint arXiv:2505.24717. 

How to cite: Whan, K., van der Wiel, K., and Mücke, N.: Emulating transient climate simulations with generative AI, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-6079, https://doi.org/10.5194/egusphere-egu26-6079, 2026.

EGU26-6950 | ECS | Orals | NP4.2

Improving AIFS Forecast Skill through Fine-Tuning across Spatial Resolutions and Datasets 

Gabriel Moldovan, Ana Prieto Nemesio, Ewan Pinnington, Simon Lang, Jan Polster, Cathal O'Brien, Mario Santa Cruz, Mihai Alexe, Harrison Cook, Richard Forbes, and Matthew Chantry

Over the past two years, ECMWF has rapidly developed and operationalised two machine-learned forecasting systems: AIFS Single, a deterministic model, and AIFS-ENS, a fully probabilistic forecasting system. Both systems are trained on ERA5 reanalysis data and further fine-tuned using operational IFS analyses. In this talk, we briefly introduce the AIFS framework and present ongoing research aimed at further improving its forecast skill.

Current efforts are driven by several research directions, including increasing spatial resolution and incorporating observational data. The current AIFS models operate at the native ERA5 resolution of approximately 30 km. While higher resolutions could significantly improve forecast skill for surface variables, higher-resolution datasets, such as the operational IFS analysis at 9 km, cover only a limited number of years. To address this, we explore a cross-resolution fine-tuning strategy in which AIFS is first pretrained on ERA5 at coarse resolution and subsequently fine-tuned on six years of recent operational IFS analyses at 9 km. We present promising early results showing that this approach enables stable fine-tuning down to 9 km and leads to significant gains in surface forecast skill.

A second research direction investigates the use of alternative datasets to improve total precipitation forecasts. ERA5 is known to exhibit deficiencies in the representation of precipitation, particularly in the tropics. We therefore fine-tune AIFS using the Integrated Multi-satellitE Retrievals for GPM (IMERG) dataset, which has been shown to better capture precipitation characteristics in this region. Early results indicate that incorporating IMERG data can significantly improve total precipitation forecast skill in AIFS, with the largest benefits observed, as expected, in tropical regions.

How to cite: Moldovan, G., Prieto Nemesio, A., Pinnington, E., Lang, S., Polster, J., O'Brien, C., Santa Cruz, M., Alexe, M., Cook, H., Forbes, R., and Chantry, M.: Improving AIFS Forecast Skill through Fine-Tuning across Spatial Resolutions and Datasets, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-6950, https://doi.org/10.5194/egusphere-egu26-6950, 2026.

Machine learning is increasingly used to analyze, predict, and interpret Earth-system behavior. Here we synthesize AI4OCEANS research to identify practical, transferable lessons for developing ML methods that remain robust when applied to real Earth-system data and are evaluated across regions, scales, and event types. We present methodological advances and common pitfalls encountered when building ML workflows for prediction and diagnosis across oceanic and atmospheric contexts. Emphasis is placed on (i) constructing physically meaningful predictors and representations that generalize beyond a single region or period, (ii) designing evaluation strategies that reflect scientific and decision-relevant objectives (including event- and regime-aware metrics where appropriate), and (iii) quantifying uncertainty and interpretability in ways that support scientific insight rather than purely empirical skill. We further discuss when hybrid strategies, combining statistical learning with physical constraints or dynamical context, improve robustness in specific applications. By framing diverse studies through shared methodological questions across geophysical systems (from coastal ocean change through high-impact atmospheric events and into bycatch threats to marine wildlife), the resulting frameworks provide guidance for ML development that is directly relevant to Earth-system modelling and prediction, particularly for variability, extremes, and environmental risks and impacts under anthropogenic influences.

How to cite: Nieves, V.: Transferable Machine-Learning Practices for Earth-System Prediction and Diagnosis, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-7654, https://doi.org/10.5194/egusphere-egu26-7654, 2026.

EGU26-8225 | ECS | Orals | NP4.2

Parameterised PINNs for water infiltration in unsaturated soils 

Mohamed Gowely and Anil Yildiz

Modelling water infiltration in unsaturated soils is vital for maintaining a healthy ecosystem, analysing the stability of slopes, or promoting sustainable agriculture. Recently, Physics-informed Neural Networks (PINNs) have gained popularity for solving highly nonlinear problems like the Richardson-Richards equation (RRE) by approximating physical laws with a loss term in a mesh-free approach, often using sparse data points to mimic the spacing between field sensors. However, despite several successful applications in modelling 1D infiltration problems, the generalisation capability of these models is often limited by the specific scenarios used during training. The potential of neural networks as universal approximators is therefore not exploited in such applications. This paper investigates the feasibility of applying Parameterised PINNs (P-PINNs) as a surrogate model to solve the RRE. The model was trained only once across a range of infiltration conditions defined by varying soil hydraulic properties and meteorological conditions, to evaluate its ability to predict various scenarios within the multidimensional parameter space without additional observation data. Results show that a wider rather than a deeper network architecture, enhanced by dynamic adaptive techniques such as time-stratified Residual-based Adaptive Refinement (RAR), Layer-wise Locally Adaptive Activation Function (L-LAAF), and Principled Loss Function (PLF), aids in capturing the correct physical profile. Although the model achieved high overall performance when validated against analytical solutions (Nash-Sutcliffe Efficiency, NSE > 0.99), it exhibited very minor phase errors. The P-PINN was also tested across drastically changing parameters, e.g. soils with very high or very low air-entry values, and satisfactory validation metrics were obtained. Our implementation of P-PINNs demonstrates their potential as a universal non-linear approximator for such problems, where the initial computational cost of training is offset by instant large-scale evaluations.
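
The physics part of the P-PINN loss follows the usual PINN recipe for the RRE; the fragment below is a schematic residual computation (with the parameterised inputs and constitutive relations K(psi), theta(psi) left abstract):

    import torch

    def rre_residual(net, z, t, K_fn, theta_fn):
        # 1-D RRE: d(theta)/dt = d/dz [ K(psi) * (d(psi)/dz + 1) ], z positive upward.
        z.requires_grad_(True); t.requires_grad_(True)
        psi = net(torch.cat([z, t], dim=-1))  # predicted matric potential
        grad = lambda y, x: torch.autograd.grad(y, x, torch.ones_like(y), create_graph=True)[0]
        theta_t = grad(theta_fn(psi), t)               # storage change
        flux = K_fn(psi) * (grad(psi, z) + 1.0)        # Darcy-Buckingham flux
        return theta_t - grad(flux, z)                 # -> 0 where the RRE is satisfied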

How to cite: Gowely, M. and Yildiz, A.: Parameterised PINNs for water infiltration in unsaturated soils, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-8225, https://doi.org/10.5194/egusphere-egu26-8225, 2026.

EGU26-8476 | ECS | Orals | NP4.2

CTM-Assisted Generative AI Framework for Satellite-to-Surface Estimation of Ground-Level Air Pollutants 

Hyeonseo Kim, Eunhye Kim, Yoon-Hee Kang, Seongeun Jeong, Soontae Kim, Hyun Cheol Kim, and Rackhun Son

Accurate monitoring of ground-level air pollutants is essential for exposure assessment and air quality management, but conventional modeling approaches exhibit significant limitations. Chemical Transport Models (CTMs) are computationally intensive and prone to systematic bias, while data-driven models often lack physical consistency and poorly represent long-range transport. To address these limitations, we present a novel hybrid modeling framework with three key innovations. First, satellite retrievals are employed as primary predictors rather than CTM outputs, thereby reducing computational demands. Second, a dual-target learning strategy prioritizes satellite-to-surface relationships, while CTM outputs are incorporated as soft physical constraints in data-sparse regions. Third, a generative diffusion model is integrated to improve the representation of long-range pollutant transport. Focusing on nitrogen dioxide (NO2), the completed framework achieves superior daily predictive accuracy (R2 = 0.72, RMSE = 3.70 ppb), outperforming precursor models. Its successful extension to sulfur dioxide (SO2) and fine particulate matter (PM2.5) demonstrates broad applicability. This study provides a physically informed and computationally efficient solution for scalable generation of high-fidelity, spatially continuous ground-level air quality fields.
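
The dual-target strategy can be summarised by a loss of the following shape (weights and masking are illustrative assumptions, not the paper's exact formulation):

    import torch.nn.functional as F

    def dual_target_loss(pred, station_obs, station_mask, ctm_field, alpha=0.1):
        # Fit ground stations where available; elsewhere, keep predictions close
        # to the CTM field as a soft physical constraint (alpha << 1: weak prior).
        obs_term = F.mse_loss(pred[station_mask], station_obs[station_mask])
        ctm_term = F.mse_loss(pred[~station_mask], ctm_field[~station_mask])
        return obs_term + alpha * ctm_term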

This work was funded by the Korea Meteorological Administration Research and Development Program under Grant RS-2024-00404042 and the National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIT) (RS-2024-00343921).

How to cite: Kim, H., Kim, E., Kang, Y.-H., Jeong, S., Kim, S., Kim, H. C., and Son, R.: CTM-Assisted Generative AI Framework for Satellite-to-Surface Estimation of Ground-Level Air Pollutants, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-8476, https://doi.org/10.5194/egusphere-egu26-8476, 2026.

EGU26-9777 | ECS | Orals | NP4.2

Mapping Ecosystems in the Peruvian Andes Using Hyperspectral Imagery and Machine Learning 

Daria-Ioana Radu, Hugo Lepage, Eustace Barnes, and Crispin Barnes

Mapping the Peruvian Andes has high ecological value because its ecosystems are immensely diverse. These mountains shelter numerous endemic species that could be protected if informed decisions are made when delineating conservation zones. Rigorous analysis of high-altitude regions traditionally requires multiple field visits, which place a financial burden on research teams. Such visits can pose safety risks, as several remote areas are difficult to access on foot due to the steep gradients, cloud cover, and logistical limitations.

Recent advances in satellite missions and machine learning (ML) allow land-cover features to be characterised with fewer ground-truthing expeditions, by utilising patterns present in large imagery datasets. However, the Andes remain challenging to map, because of the spectral similarity among some land-use and land-cover (LULC) classes and because steep gradients can lead to geometric distortions in the recorded images. 

This study highlights an easy-to-use method for generating LULC map prototypes for high-altitude Andean regions using EnMAP and EMIT hyperspectral imagery (HSI). Machine learning algorithms (e.g., K-means clustering, principal component analysis) were applied to the HSI to generate clusters and extract features with high discriminant power among LULC types. Expert interpretation allowed pairing the obtained clusters with suitable ecosystem labels, producing prototype LULC maps.
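
A minimal version of this unsupervised pipeline, using scikit-learn (component and cluster counts are placeholders tuned per scene), looks as follows:

    import numpy as np
    from sklearn.decomposition import PCA
    from sklearn.cluster import KMeans

    def prototype_lulc(hsi: np.ndarray, n_components=10, n_clusters=12):
        # hsi: (bands, y, x) reflectance cube from EnMAP or EMIT.
        b, h, w = hsi.shape
        pixels = hsi.reshape(b, -1).T                          # (n_pixels, n_bands)
        scores = PCA(n_components=n_components).fit_transform(pixels)
        labels = KMeans(n_clusters=n_clusters, n_init=10).fit_predict(scores)
        return labels.reshape(h, w)  # cluster map handed to experts for labelling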

How to cite: Radu, D.-I., Lepage, H., Barnes, E., and Barnes, C.: Mapping Ecosystems in the Peruvian Andes Using Hyperspectral Imagery and Machine Learning, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-9777, https://doi.org/10.5194/egusphere-egu26-9777, 2026.

EGU26-11323 | ECS | Posters on site | NP4.2

Disentangled and interpretable feature extraction of the Earth electron radiation belt: a first step towards the development of a reduced order model 

Gautier Nguyen, Antoine Brunet, Maria Tahtouh, Guillerme Bernoux, Nourallah Dahmen, and Ingmar Sandberg

The radiation belts are populations of energetic particles, such as electrons and protons, trapped in near-Earth space by the geomagnetic field. Because they cover the great majority of existing orbits, and because the particles’ dynamics, highly coupled with solar activity, can strongly affect spacecraft components and missions, the accurate modeling of these regions is of utmost importance for the monitoring of near-Earth space dynamics.

Traditionally, the radiation belts are modeled by solving a three‑dimensional diffusion equation with numerical solvers. While a single 3D simulation can easily be run in real time, as done routinely in many space weather forecasting pipelines, the computational burden can become significant when the model is used in ensemble‑based data assimilation that potentially requires hundreds of runs, over very long periods, such as those needed for space‑climate studies.

Within this context, machine learning-based Reduced Order Models (ROMs) offer an interesting solution to approximate the solutions of traditional high-fidelity physics-based models with reasonable accuracy and at a reduced computational cost. This is achieved by projecting highly non-linear features onto a disentangled, interpretable latent space of reduced dimension, whose dynamics can be driven by external variables.

In this work, we take a first step towards the development of a ROM for the Earth electron radiation belts. Using a Distance-regularized Siamese twin autoencoder (DIRESA) on long-term simulations, we reduce electron fluxes on a refined grid to a small set of latent variables. We then show that these variables can all be linked with external geomagnetic parameters, allowing them to form the core of a ROM of the Earth electron radiation belts driven by those external parameters.
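
Schematically (and without reproducing the DIRESA architecture itself), such a ROM couples an autoencoder with a small driver network, e.g.:

    import torch
    import torch.nn as nn

    class BeltROM(nn.Module):
        # Illustrative only: flux grid flattened to n_grid values, a handful of
        # latent variables, and external drivers such as geomagnetic indices.
        def __init__(self, n_grid, n_latent=4, n_drivers=2):
            super().__init__()
            self.encode = nn.Sequential(nn.Linear(n_grid, 128), nn.ReLU(),
                                        nn.Linear(128, n_latent))
            self.decode = nn.Sequential(nn.Linear(n_latent, 128), nn.ReLU(),
                                        nn.Linear(128, n_grid))
            self.dynamics = nn.Linear(n_latent + n_drivers, n_latent)

        def forward(self, flux, drivers):
            z = self.encode(flux)
            z_next = z + self.dynamics(torch.cat([z, drivers], dim=-1))  # driven latent step
            return self.decode(z_next)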

This work was supported both by the "Event-Based Electron Belt Radiation Storm Environments Modelling" Activity led by the Space Applications & Research Consultancy (SPARC) under ESA Contract 4000141351/23/UK/EG and by ONERA internal funding, through the federated research project PRF-FIRSTS.

How to cite: Nguyen, G., Brunet, A., Tahtouh, M., Bernoux, G., Dahmen, N., and Sandberg, I.: Disentangled and interpretable feature extraction of the Earth electron radiation belt: a first step towards the development of a reduced order model, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-11323, https://doi.org/10.5194/egusphere-egu26-11323, 2026.

EGU26-11736 | ECS | Orals | NP4.2

Towards a mass-conservative global sea ice emulator that generalizes across climates 

William Gregory, Mitchell Bushuk, James Duncan, Elynn Wu, Adam Subel, Spencer Clark, Jeremy McGibbon, Brian Henn, Troy Arcomano, W. Andre Perkins, Anna Kwa, Oliver Watt-Meyer, Alistair Adcroft, Chris Bretheron, and Laure Zanna

We introduce FloeNet, a data-driven emulator architecture trained on the Geophysical Fluid Dynamics Laboratory (GFDL) global sea ice model, SIS2. FloeNet is an auto-regressive graph neural network (GNN) which marks a step forward in sea ice emulation as the first model to dynamically evolve the state of sea ice and snow-on-sea-ice by mass and area budget decompositions. Specifically, FloeNet receives mechanical and thermodynamic forcing inputs from the atmosphere and ocean, and predicts ice and snow mass tendencies due to growth, melt, and advection. This yields a mass-conservative and interpretable model, as timestep-to-timestep changes in sea ice area and mass can now be attributed to each term in their respective budget.

Sea ice is often seen as a barometer for climate change. It is therefore crucial that data-driven sea ice models show an accurate response to different climate forcings. To this end, we show how FloeNet successfully reproduces sea ice trends and variability of pre-industrial and 1% CO2 climates, despite being trained only on a present-day climate; FloeNet also reaches globally ice-free conditions under 1% CO2 forcing, with timing consistent with that of the original numerical model. In summary, FloeNet is a fast global sea ice emulator, taking 4.75 hours to generate a 140-year simulation on 1 GPU. It is also stable and accurate, reproducing critical features of long-term sea ice evolution under different forcings. We expect that FloeNet will substantially improve the representation of atmosphere-ice-ocean interactions in existing climate emulators.
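
The budget-decomposition idea can be illustrated with a toy update rule (variable names and the zero-net-transport projection are our assumptions; FloeNet's actual budget terms follow SIS2):

    import torch

    def update_ice_mass(mass, d_growth, d_melt, d_advect, dt, cell_area):
        # Every change in ice mass is attributable to a named process. Advection
        # only redistributes mass, so its area-weighted global integral is removed.
        w = cell_area / cell_area.sum()
        d_advect = d_advect - (d_advect * w).sum()
        return mass + dt * (d_growth - d_melt + d_advect)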

How to cite: Gregory, W., Bushuk, M., Duncan, J., Wu, E., Subel, A., Clark, S., McGibbon, J., Henn, B., Arcomano, T., Perkins, W. A., Kwa, A., Watt-Meyer, O., Adcroft, A., Bretheron, C., and Zanna, L.: Towards a mass-conservative global sea ice emulator that generalizes across climates, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-11736, https://doi.org/10.5194/egusphere-egu26-11736, 2026.

EGU26-11847 | ECS | Posters on site | NP4.2

Measurement-Constrained Reduced-Order Surrogates for Flexible-Mesh Coastal Ocean Models 

Melissa Ulsøe Jessen, Jesper Sandvig Mariegaard, and Freja Høgholm Petersen

Reduced-order surrogate models based on Koopman autoencoders have recently shown strong potential for accelerating flexible-mesh coastal ocean simulations while maintaining physically meaningful dynamics. In this contribution, we extend a previously validated Koopman autoencoder framework by explicitly incorporating information from in-situ measurements during training.

The proposed approach augments the surrogate training objective with measurement-based constraints, penalizing deviations from observed water surface elevations at selected locations and times. This enables the surrogate to remain consistent with sparse observations while preserving the learned large-scale dynamical structure driven by meteorological forcing and boundary conditions.

The method is evaluated on two realistic MIKE 21 HD coastal-ocean configurations published as open WaterBench datasets: the Southern North Sea and the Øresund Strait. Performance is assessed against both full physics-based simulations and independent in-situ observations, focusing on accuracy, temporal stability, and generalization beyond the training period.

Results demonstrate that measurement-constrained training can reduce local prediction errors near observation points without degrading global performance, while retaining the substantial inference speed-ups characteristic of Koopman-based reduced-order models. The proposed framework represents a step toward tighter integration of observations and machine-learning surrogates for efficient, observation-aware coastal ocean modelling, with relevance for ensemble forecasting and long-term scenario analysis.
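
The measurement-constrained objective can be summarised as follows (the model's method names and the weighting are illustrative, not the authors' implementation):

    import torch.nn.functional as F

    def training_loss(model, x_t, x_tp1, obs_idx, wse_obs, lam=0.1):
        z_t = model.encode(x_t)
        z_next = model.koopman(z_t)                          # linear advance in latent space
        x_pred = model.decode(z_next)
        l_recon = F.mse_loss(model.decode(z_t), x_t)         # autoencoding fidelity
        l_dyn = F.mse_loss(x_pred, x_tp1)                    # one-step dynamics
        l_obs = F.mse_loss(x_pred[..., obs_idx], wse_obs)    # gauge-point constraint
        return l_recon + l_dyn + lam * l_obs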

How to cite: Jessen, M. U., Mariegaard, J. S., and Petersen, F. H.: Measurement-Constrained Reduced-Order Surrogates for Flexible-Mesh Coastal Ocean Models, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-11847, https://doi.org/10.5194/egusphere-egu26-11847, 2026.

EGU26-12000 | ECS | Posters on site | NP4.2

Representing Subgrid-Scale Cloud Effects in a Radiation Parameterization using Machine Learning 

Katharina Hafner, Sara Shamekh, Guillaume Bertoli, Axel Lauer, Robert Pincus, Julien Savre, and Veronika Eyring

Improvements of Machine Learning (ML)-based radiation emulators remain constrained by the underlying assumptions used to represent horizontal and vertical subgrid-scale cloud distributions, which continue to introduce substantial uncertainties. In this study, we introduce a method to represent the impact of subgrid-scale clouds by applying ML to learn processes from high-resolution model output with a horizontal grid spacing of 5 km. In global storm-resolving models, clouds begin to be explicitly resolved. Coarse-graining these high-resolution simulations to the resolution of coarser Earth System Models yields radiative heating rates that implicitly include subgrid-scale cloud effects, without assumptions about their horizontal or vertical distributions. We define the cloud radiative impact as the difference between all-sky and clear-sky radiative fluxes, and train the ML component solely on this cloud-induced contribution to the heating rates. The clear-sky tendencies are still computed with a conventional physics-based radiation scheme. This hybrid design enhances generalization, since the machine-learned part addresses only subgrid-scale cloud effects, while the clear-sky component remains responsive to changes in greenhouse gas or aerosol concentrations. Applied offline to coarse-grained data, the ML-enhanced radiation scheme reduces errors by a factor of 4 to 10 compared with a conventional coarse-scale radiation scheme. We observe improved radiative heating rates across several cloud regimes and regions, including precipitating and non-precipitating clouds and stratocumulus regions. This shows the potential of representing subgrid-scale cloud effects in radiation schemes with ML for the next generation of Earth System Models.
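
The resulting hybrid scheme composes the two parts additively; schematically (function names are placeholders):

    def total_heating_rate(clear_sky_scheme, cloud_impact_net, state):
        # Physics-based clear-sky part keeps the response to greenhouse gases
        # and aerosols; the ML part adds only the cloud-induced contribution,
        # defined as all-sky minus clear-sky heating.
        return clear_sky_scheme(state) + cloud_impact_net(state)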

How to cite: Hafner, K., Shamekh, S., Bertoli, G., Lauer, A., Pincus, R., Savre, J., and Eyring, V.: Representing Subgrid-Scale Cloud Effects in a Radiation Parameterization using Machine Learning, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-12000, https://doi.org/10.5194/egusphere-egu26-12000, 2026.

EGU26-12760 | ECS | Posters on site | NP4.2

Evaluating the forecast skill of machine-learning weather prediction models across a selection of extreme UK windstorms 

James Hewitt, Ambrogio Volonté, Ben Harvey, Andressa Andrade Cardoso, Kieran Hunt, Natalie Harvey, Oscar Martinez-Alvarado, Suzanne Gray, Helen Dacre, and Kevin Hodges

While numerical weather prediction (NWP) underpins existing early warning systems, its high computational cost limits scalability. Machine-learning weather prediction (MLWP) offers a promising alternative, yet its skill and reliability at forecasting wind extremes and small-scale storm features across different storms remain uncertain. Evaluating the forecast skill of MLWP models across a range of storms is therefore critical before MLWP can be integrated safely into early warning systems.

This study evaluates the performance of eight leading MLWP models at forecasting the peak 10 m and 850 hPa wind speeds, pressure minima, and relative vorticity associated with the most damaging UK windstorms from the 2023/24 winter season: Babet, Ciarán, Debi, Gerrit, Henk and Isha. MLWP models are evaluated against ERA5 and IFS analysis and benchmarked against the NWP IFS ensemble forecast. The results reveal substantial variability in MLWP forecasting skill both between storms and across models.

MLWP forecast skill is found to be linked to the horizontal scale and dynamical nature of the storm feature producing the strongest winds. While wind maxima associated with large-scale conveyor-belt airstreams are generally well predicted, those arising from smaller-scale features, including the cold conveyor belt and sting jets, are underestimated. MLWP model performance is also found to be variable between storms, with no clear best- or worst-performing model. The higher-resolution Aurora-0.1 model is not found to be better at forecasting wind extremes, despite the small spatial scale of the storm features producing the strongest winds in four of the storms analysed.

An in-depth, feature-based analysis is performed for Storms Henk and Isha. Henk proved challenging for both MLWP and NWP models to forecast, resulting in short-notice and inaccurate wind alerts from the Met Office. The MLWP models performed worst for Isha overall, despite the NWP models predicting it well. Across both storms, MLWP models struggled to predict small-scale features associated with extreme winds and tended to smooth sharp frontal gradients.

These results highlight critical limitations in existing MLWP models that make them unsuitable for replacing NWP as a primary forecasting tool for hazardous UK windstorms today. However, current MLWP models could provide rapid, low-cost ensemble information that complements traditional NWP outputs, or serve as part of a hybrid ML-NWP approach, particularly if structural limitations in representing fine-scale wind maxima are acknowledged and mitigated.

How to cite: Hewitt, J., Volonté, A., Harvey, B., Andrade Cardoso, A., Hunt, K., Harvey, N., Martinez-Alvarado, O., Gray, S., Dacre, H., and Hodges, K.: Evaluating the forecast skill of machine-learning weather prediction models across a selection of extreme UK windstorms, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-12760, https://doi.org/10.5194/egusphere-egu26-12760, 2026.

EGU26-13711 | ECS | Orals | NP4.2

Physics-informed, open-box neural network parameterization of moist physics 

Peter Ukkonen and Hannah Christensen

Machine learning holds the promise of unlocking more accurate and realistic parameterizations of atmospheric processes, but brings its own set of challenges and drawbacks, chief among them generalization, stability, and interpretability. Here we present a parameter-efficient neural network parameterization that aims to address these issues by incorporating physical knowledge to a high degree. By predicting fluxes and microphysical process rates instead of total tendencies, the conservation of water can be hardcoded, which is shown to improve online performance. Furthermore, a physically motivated architecture based on vertically recurrent neural networks enables high computational efficiency and a low number of parameters. The models are trained and evaluated using a superparameterization setup with real orography. The impact of incorporating stochasticity is also discussed.
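
A minimal sketch of the conservation trick described above: if the network predicts interface fluxes rather than layer tendencies, differencing the fluxes guarantees that column-integrated water changes only through the boundary fluxes. Variable shapes and units are illustrative assumptions.

```python
import torch

def tendencies_from_fluxes(fluxes, dp, g=9.81):
    """Convert predicted interface fluxes (kg m-2 s-1) into layer
    tendencies on a pressure grid (sketch). Each internal flux enters
    two adjacent layers with opposite sign, so column-integrated water
    is conserved by construction.

    fluxes: (batch, nlev + 1) interface fluxes from the network
    dp:     (batch, nlev) layer pressure thickness (Pa)
    """
    return -g * (fluxes[:, 1:] - fluxes[:, :-1]) / dp
```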

How to cite: Ukkonen, P. and Christensen, H.: Physics-informed, open-box neural network parameterization of moist physics, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-13711, https://doi.org/10.5194/egusphere-egu26-13711, 2026.

EGU26-14578 | ECS | Orals | NP4.2

Machine Learning and Remote Sensing Projections for Peruvian Glaciers (2016–2100) 

Hugo Lepage, Darina Andriychenko Leonenko, Nina Elliott, and Crispin Barnes

The Peruvian Andes contain over 70% of the world's tropical glaciers, which are vital for regional water security and are rapidly destabilising due to climate change. Current large-scale projections often lack the spatial resolution required for localised glacial melt modelling or rely on climate reanalysis products that are too coarse in rugged terrain. This study introduces a unified framework that combines high-resolution remote sensing (Sentinel-2, Landsat-8) with machine learning to characterise, monitor, and forecast glacial evolution across Peru from 2016 to 2100.

We propose a machine-learning modelling approach that addresses both the where and when of glacial retreat. We developed a spatial Random Forest classifier to generate country-wide melt vulnerability maps. Ensemble analysis of driving parameters reveals that "distance-to-edge" and topographic factors (elevation, slope) are significantly stronger predictors of melt spatiality than available coarse-resolution temperature and precipitation datasets. Our spatial model achieves a 74.9% overlap accuracy between simulated and observed melt (2016–2023), nearly doubling the performance of benchmark Multi-Criteria Decision Analysis methods (39.3%).

Complementing this spatial analysis, we developed a temporal, area-based melt model from annual inventories of over 2,000 individual glacier polygons. Using a Huber regression to fit negative power laws to ablation rates, we identified a clear acceleration in retreat for smaller ice bodies, consistent with albedo-ice feedback mechanisms. Between 2016 and 2023, we observed a relative area loss of 15 ± 4% (180 ± 70 km2).
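
The robust power-law fit can be pictured as a Huber regression in log-log space; the sketch below uses scikit-learn with made-up numbers, not the authors' glacier inventory.

```python
import numpy as np
from sklearn.linear_model import HuberRegressor

# Hypothetical data: glacier areas (km^2) and relative ablation rates.
areas_km2 = np.array([0.1, 0.5, 1.0, 5.0, 10.0, 50.0])
ablation_rates = np.array([0.12, 0.07, 0.05, 0.03, 0.025, 0.015])

# Fit rate = a * area**b robustly: a negative exponent b means smaller
# ice bodies retreat faster, as reported in the abstract.
X = np.log(areas_km2).reshape(-1, 1)
model = HuberRegressor().fit(X, np.log(ablation_rates))
b, log_a = model.coef_[0], model.intercept_
print(f"rate ~= {np.exp(log_a):.3f} * area^{b:.2f}")
```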

Integrating these models to forecast future scenarios, we project that only ~30% (26–43%) of the 2020 glacial surface area will remain by 2100, with several cordilleras facing near-total extinction. This workflow establishes a new standard for observation-based, scalable glacial modelling, providing the high-resolution spatial and temporal insights necessary for effective water resource management and adaptation strategies in the tropical Andes.

How to cite: Lepage, H., Andriychenko Leonenko, D., Elliott, N., and Barnes, C.: Machine Learning and Remote Sensing Projections for Peruvian Glaciers (2016–2100), EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-14578, https://doi.org/10.5194/egusphere-egu26-14578, 2026.

EGU26-14900 | ECS | Orals | NP4.2

A Hybrid FNO-Diffusion Framework for Uncertainty-Aware Source Energy Estimation in Atmospheric Waveguides 

Elodie Noëlé, Filippo Gatti, Didier Clouteau, Christophe Millet, and Fanny Lehmann

Estimating the source of acoustic waves propagating in a vertically stratified medium poses significant challenges due to the high variability of the acoustic fields at long ranges caused by heterogeneous vertical sound speed profiles, rendering the problem inverse and ill-posed. To address this challenge, we propose a three-step approach within a Bayesian framework. First, we show that using only the low-frequency components (up to 1.5 Hz) of the acoustic fields is sufficient to capture the source parameters. Second, we develop a fast surrogate forward model based on a Fourier Neural Operator (FNO) [1] to bypass the computational burden of traditional numerical solvers. Finally, we train diffusion models to represent the complex prior [2] of the atmospheric profiles and to accurately estimate the posterior distribution [3] in the context of our inference problem. The models are trained on a database comprising over 20,000 simulations generated using a normal mode code [4]. Our results show that our FNO model achieves a relative least-squares error of approximately 8%. The combined FNO and diffusion model framework [5] is demonstrated to yield more reliable energy estimates than the FNO framework alone.

[1] N. Perrone, F. Lehmann, H. Gabrielidis, S. Fresca, and F. Gatti, “Integrating Fourier Neural Operators with Diffusion Models to improve Spectral Representation of Synthetic Earthquake Ground Motion Response,” arXiv preprint arXiv:2504.00757, 2025. doi: 10.48550/arXiv.2504.00757

[2] F. Lehmann, F. Gatti, M. Bertin, and D. Clouteau, “3D elastic wave propagation with a Factorized Fourier Neural Operator (F-FNO),” Computer Methods in Applied Mechanics and Engineering, vol. 417, art. no. 116718, 2023. doi: 10.1016/j.cma.2023.116718

[3] F. Bergamin, C. Diaconu, A. Shysheya, P. Perdikaris, J. M. Hernández-Lobato, R. E. Turner, and E. Mathieu, “Guided Autoregressive Diffusion Models with Applications to PDE Simulation,” in ICLR 2024 Workshop on AI4DifferentialEquations In Science, 2024. 

[4] T. Karras, M. Aittala, S. Laine, and T. Aila, “Elucidating the design space of diffusion-based generative models,” in Proceedings of the 36th International Conference on Neural Information Processing Systems (NeurIPS), 2022, pp. 26565–26577. doi: 10.5555/3600270.3602196

[5] M. Bertin, C. Millet, and D. Bouche, “A low-order reduced model for the long range propagation of infrasound in the atmosphere,” The Journal of the Acoustical Society of America, vol. 136, no. 5, pp. 2693–2705, 2014. doi: 10.1121/1.4896776
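
For context on the surrogate forward model above, the sketch below shows the core FNO ingredient, a 1-D spectral convolution that mixes the lowest Fourier modes with learned complex weights; channel counts, initialisation, and mode truncation are illustrative, not the authors' configuration.

```python
import torch

class SpectralConv1d(torch.nn.Module):
    """Spectral convolution layer (FNO building block, sketch)."""
    def __init__(self, channels, modes):
        super().__init__()
        self.modes = modes
        self.weight = torch.nn.Parameter(
            torch.randn(channels, channels, modes, dtype=torch.cfloat)
            / channels)

    def forward(self, x):              # x: (batch, channels, n)
        x_ft = torch.fft.rfft(x)       # (batch, channels, n//2 + 1)
        out = torch.zeros_like(x_ft)
        # Keep only the lowest `modes` frequencies, mixed across channels.
        out[:, :, :self.modes] = torch.einsum(
            "bim,iom->bom", x_ft[:, :, :self.modes], self.weight)
        return torch.fft.irfft(out, n=x.size(-1))
```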

How to cite: Noëlé, E., Gatti, F., Clouteau, D., Millet, C., and Lehmann, F.: A Hybrid FNO-Diffusion Framework for Uncertainty-Aware Source Energy Estimation in Atmospheric Waveguides, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-14900, https://doi.org/10.5194/egusphere-egu26-14900, 2026.

High-resolution and temporally consistent satellite observations are essential for effectively monitoring, modeling, and mitigating environmental challenges. However, optical remote sensing faces cross-sensor interoperability issues and is inherently affected by cloud contamination and atmospheric interference, resulting in temporal discontinuities that limit the availability of timely and uninterrupted observations. Existing approaches have primarily focused on retrospective gap-filling of missing data. Forecasting surface dynamics, in contrast, introduces additional challenges: as the time since the last observation increases, near-real-time monitoring and predictive applications still require high-fidelity, temporally continuous information.

To address this challenge, we developed a physics-guided transformer framework trained on Harmonized Landsat, Sentinel-2, and PlanetScope (HLSP) data to forecast uninterrupted daily 30-m surface reflectance during periods with missing optical observations. HLSP is a radiometrically and geometrically harmonized multi-sensor optical dataset integrating Landsat 8–9, Sentinel-2, and PlanetScope imagery to provide sensor-agnostic, temporally consistent surface reflectance products. The model was trained using a multi-year (2017–2025) archive of HLSP surface reflectance imagery across eight agricultural regions in the United States, Brazil, France, Spain, Egypt, South Africa, Thailand, and China. Spectral features from daily HLSP data (30 m resolution) were combined with daily land surface temperature (LST) and soil water content (SWC) at 100-m resolution derived from passive microwave observations. Additional temporal covariates, including day-of-year encoded using sine and cosine transformations, were incorporated to explicitly represent seasonal and phenological timing and enable the network to capture key biophysical, hydroclimatic, and seasonal controls on surface reflectance dynamics.
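
The day-of-year encoding mentioned above is a standard cyclic transform; a minimal version (the 365.25-day period is an assumption) looks like this:

```python
import numpy as np

def doy_features(day_of_year):
    """Sine/cosine encoding of day-of-year, so 31 December and
    1 January map to neighbouring points on the unit circle."""
    angle = 2.0 * np.pi * (np.asarray(day_of_year) - 1) / 365.25
    return np.stack([np.sin(angle), np.cos(angle)], axis=-1)
```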

The physics-guided framework constrains predictions using land–surface energy balance relationships linking surface reflectance, land surface temperature, and soil moisture. These constraints promote physically consistent interactions among surface variables while learning temporally coherent surface reflectance dynamics associated with vegetation growth, moisture persistence, and land–surface energy exchanges.

Model skill was evaluated using RMSE and MAE under a forward-looking temporal validation strategy, in which the model was trained on eight years of historical HLSP data and used to forecast surface reflectance over multiple lead times (2, 5, 10, 15, and 20 days) following the last available optical observation in the final year. Forecasts were validated against independently observed HLSP data for the corresponding periods, allowing assessment of skill degradation as forecast horizons increased. Results demonstrate that incorporating LST, SWC, NDVI, and time-related covariates substantially improves forecast stability and fidelity, particularly under variable climatic and land-cover conditions. The proposed approach provides a scalable and generalizable machine-learning framework for short-term forecasting of EO surface reflectance time series, with applications in climate-impact assessment, drought monitoring, evapotranspiration modeling, and carbon–water flux analysis.

How to cite: Alamdar, S. and Houborg, R.: Physics-Guided Transformer-based Forecasting of High-Resolution Earth Observation Surface Reflectance Data, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-14919, https://doi.org/10.5194/egusphere-egu26-14919, 2026.

EGU26-14956 | ECS | Orals | NP4.2

Super-Resolving Any Place on Earth - Implicit Neural Representations for Sentinel-2 Time Series 

Sander Jyhne, Christian Igel, Morten Goodwin, Per-Arne Andersen, Serge Belongie, and Nico Lang

High-resolution imagery is limited by sensor technology, atmospheric effects, and acquisition costs. This is a well-known challenge in satellite remote sensing, but it also applies to ground-level imaging with handheld devices such as smartphones. Super-resolution seeks to overcome these limitations by enhancing image resolution algorithmically. Single-image super-resolution, however, is an ill-posed inverse problem and therefore depends on strong priors, typically learned from high-resolution training data or imposed through auxiliary information such as high-resolution guidance from another modality. While these methods often produce visually appealing results, they are prone to hallucinating structures that do not reflect the true scene content.

Multi-image super-resolution (MISR) addresses this issue by exploiting multiple low-resolution views of the same scene that are captured with sub-pixel shifts. In this work, we introduce SuperF, a test-time optimization approach for MISR based on coordinate-based neural networks, also known as neural fields. By representing images as continuous signals using implicit neural representations (INRs), neural fields are well suited for reconstructing high-resolution images from multiple aligned observations. The central idea of SuperF is to share a single INR across all low-resolution frames while jointly optimizing the image representation and the sub-pixel alignment between frames.

Compared to prior INR-based approaches adapted from burst fusion and layer separation, SuperF directly parameterizes the sub-pixel alignment using optimizable affine transformation parameters and performs the optimization on a super-sampled coordinate grid corresponding to the target output resolution. We evaluate the proposed method on simulated bursts of satellite imagery as well as on ground-level images captured with handheld cameras, and observe consistent improvements for upsampling factors of up to 8. A key advantage of SuperF is that it operates entirely at test time and does not rely on any high-resolution training data.
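
A minimal sketch of the shared-INR idea, with one coordinate MLP for the scene and per-frame learnable affine alignment; layer sizes and the exact parameterization are assumptions, not the authors' model.

```python
import torch

class SuperFSketch(torch.nn.Module):
    def __init__(self, n_frames, hidden=256):
        super().__init__()
        # One INR shared across all low-resolution frames.
        self.mlp = torch.nn.Sequential(
            torch.nn.Linear(2, hidden), torch.nn.ReLU(),
            torch.nn.Linear(hidden, hidden), torch.nn.ReLU(),
            torch.nn.Linear(hidden, 3))                  # RGB output
        # Per-frame 2x3 affine alignment, initialised to the identity
        # and optimized jointly with the image representation.
        eye = torch.tensor([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
        self.affine = torch.nn.Parameter(eye.repeat(n_frames, 1, 1))

    def forward(self, coords, frame_idx):   # coords: (n, 2) in [-1, 1]
        homog = torch.cat([coords, torch.ones_like(coords[:, :1])], -1)
        return self.mlp(homog @ self.affine[frame_idx].T)
```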

How to cite: Jyhne, S., Igel, C., Goodwin, M., Andersen, P.-A., Belongie, S., and Lang, N.: Super-Resolving Any Place on Earth - Implicit Neural Representations for Sentinel-2 Time Series, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-14956, https://doi.org/10.5194/egusphere-egu26-14956, 2026.

The primary challenge in forecasting the Earth's magnetic field lies in capturing rapid, non-linear events like geomagnetic jerks. Conventional models relying solely on geomagnetic data often struggle to replicate such abrupt changes, for example the 2014–2015 geomagnetic jerk.

This study introduces a multiple-data approach that simultaneously co-estimates geomagnetic snapshots and Length-of-Day (LOD) variations using a machine-learning method. Specifically, we use an Extended Kalman Filter-trained Recurrent Neural Network (EKF-RNN; Sato et al., in press) to model the complex, non-linear dynamics of the Earth's core, including geomagnetic jerks.

The training and validation datasets for our neural network were derived from the MCM geomagnetic field model (Ropp & Lesur, 2023), which is based on vector geomagnetic data from global magnetic observatories as well as the CHAMP and Swarm-A satellites (Ropp et al., 2020). To constrain the internal dynamics of the Earth’s core, we incorporated LOD data from the Earth Orientation Parameters series C04, provided by the International Earth Rotation and Reference Systems Service. The LOD dataset combines historical observations with modern space geodetic techniques including Very Long Baseline Interferometry, Satellite Laser Ranging, Global Navigation Satellite Systems and Lunar Laser Ranging, offering a continuous record from 1962 to present (Bizouard & Gambis, 2011).

After removing predictable tidal and atmospheric signals, LOD variations reflect exchanges of angular momentum between the Earth's core and mantle. Since electromagnetic waves such as torsional Alfvén waves generated in the Earth's core are linked to rapid geomagnetic accelerations, including LOD data may provide a key constraint on the geomagnetic forecast. Our results show that a model trained only on geomagnetic secular acceleration (SA) failed to capture the 2014–2015 geomagnetic jerk, whereas adding LOD data improved accuracy during the same event. Specifically, the SA misfit decreased from 4.98 to 2.43 nT/yr². The improvement was most significant when training with the second-order derivatives (i.e., the SA snapshots themselves), indicating that the EKF-RNN successfully uncovered the underlying physical connection between geomagnetic acceleration and the Earth's rotation. This study confirms that a multiple-data approach, combining independent yet physically linked observational data, is essential for the next generation of geomagnetic forecast models.

How to cite: Toh, H., Sato, S., and Lesur, V.: Impact of Length-of-Day inclusion on geomagnetic secular variation forecast by a recurrent neural network, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-15244, https://doi.org/10.5194/egusphere-egu26-15244, 2026.

EGU26-15271 | ECS | Orals | NP4.2

Storm-scale forecasting from observations 

Jaideep Pathak, Mohammad Shoaib Abbas, Peter Harrington, Zeyuan Hu, Noah Brenowitz, Suman Ravuri, Dale Durran, Corey Adams, Oliver Hennigh, Nicholas Geneva, Jussi Leinonen, Alberto Carpentieri, and Mike Pritchard

Accurate short-term prediction of clouds and precipitation is critical for severe weather warnings, aviation safety, and renewable energy operations. Traditional mesoscale numerical weather prediction models require significant modeling expertise and computational infrastructure. We introduce Stormscope, a family of transformer-based generative diffusion models trained directly on high-resolution, multi-band geostationary satellite imagery and ground-based radar over the Continental United States. Stormscope produces forecasts at a temporal resolution as high as 10 min and 6-km spatial resolution. Geostationary satellites and ground-based radar provide high-resolution, high-frequency observations essential for characterizing the evolving structure of the mesoscale atmosphere. Evaluated against extrapolation methods and operational mesoscale NWP models such as HRRR, Stormscope achieves leading performance on standard verification metrics including Fractions Skill Score and Continuous Ranked Probability Score across forecast horizons from 1 to 6 hours. By operating in native observation space, Stormscope establishes a new paradigm for AI-driven nowcasting with direct applicability to operational forecasting workflows. The approach is highly extensible, with demonstrated computational scaling to larger domains and higher resolutions. Critically, because Stormscope relies solely on globally ubiquitous satellite observations and radar where available, it offers a pathway to extend skillful mesoscale forecasting to oceanic regions and countries without existing strong operational mesoscale modeling programs.
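
One of the verification metrics named above, CRPS, has a compact ensemble estimator; the sketch below is the standard form for a single scalar observation, not Stormscope-specific code.

```python
import numpy as np

def crps_ensemble(ensemble, obs):
    """CRPS estimate: E|X - y| - 0.5 * E|X - X'| over ensemble members."""
    x = np.asarray(ensemble, dtype=float)
    return (np.mean(np.abs(x - obs))
            - 0.5 * np.mean(np.abs(x[:, None] - x[None, :])))
```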

How to cite: Pathak, J., Abbas, M. S., Harrington, P., Hu, Z., Brenowitz, N., Ravuri, S., Durran, D., Adams, C., Hennigh, O., Geneva, N., Leinonen, J., Carpentieri, A., and Pritchard, M.: Storm-scale forecasting from observations, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-15271, https://doi.org/10.5194/egusphere-egu26-15271, 2026.

EGU26-16098 | ECS | Orals | NP4.2

CondensNet: Self-adaptive physical constraints for stable long-term hybrid climate simulations 

Xin Wang, Gianmarco Mengaldo, Jianda Chen, Juntao Yang, Jeff Adie, Simon See, Kalli Furtado, Chen Chen, Troy Arcomano, Romit Maulik, and Wei Xue
Accurate and efficient climate simulations are crucial for understanding Earth’s evolving climate. However, current General Circulation Models (GCMs) face challenges in capturing unresolved physical processes, such as clouds and convection. A common solution is to adopt Cloud-Resolving Models (CRMs), which provide more accurate results than the standard subgrid parameterization schemes typically used in GCMs. However, CRM-based approaches (also referred to as super-parameterizations, such as SPCAM) remain computationally prohibitive. Hybrid modeling, which integrates deep learning with equation-based GCMs, offers a promising alternative but often struggles with long-term stability and accuracy issues.
 
In this work, we find that water vapor oversaturation during condensation is a key factor compromising the stability of hybrid modeling. To address this, we introduce CondensNet, a novel neural network architecture that embeds a self-adaptive physical constraint to correct unphysical condensation processes. CondensNet effectively mitigates water vapor oversaturation, enhancing simulation stability while maintaining accuracy and improving computational efficiency compared to super-parameterization schemes.
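
The oversaturation fix can be pictured as moving excess vapour into condensate after each emulator step; the real scheme is self-adaptive, so the fixed clamp below is only an illustrative stand-in.

```python
import torch

def correct_oversaturation(q_vapour, q_saturation, q_cloud):
    """Transfer any vapour above saturation into cloud condensate so
    the emulated state cannot remain oversaturated (sketch)."""
    excess = torch.clamp(q_vapour - q_saturation, min=0.0)
    return q_vapour - excess, q_cloud + excess
```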
 
We integrate CondensNet into a GCM to form PCNN-GCM (Physics-Constrained Neural Network GCM), a hybrid deep learning framework designed for long-term stable climate simulations under real-world conditions (AMIP setting). PCNN-GCM enables stable simulations over decades and achieves up to 370× speed-up compared with SPCAM, while also being faster than traditional CAM5 under GPU acceleration or CPU-only. Beyond stability and efficiency, PCNN-GCM demonstrates greater skill in capturing complex climate variability than CAM5, including tropical precipitation extremes and the Madden-Julian Oscillation (MJO), yielding results that align more closely with observations or reanalyses (e.g., ERA5, TRMM) than conventional parameterization schemes.

How to cite: Wang, X., Mengaldo, G., Chen, J., Yang, J., Adie, J., See, S., Furtado, K., Chen, C., Arcomano, T., Maulik, R., and Xue, W.: CondensNet: Self-adaptive physical constraints for stable long-term hybrid climate simulations, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-16098, https://doi.org/10.5194/egusphere-egu26-16098, 2026.

EGU26-16240 | Posters on site | NP4.2

Inductive Biases for Robust Climate Emulation Across Forecast Timescales 

Oskar Bohn Lassen, Francisco Camara Pereira, Simon Driscoll, Sebastian Schemm, and Stephen Thomson

Machine-learning emulators have demonstrated remarkable skill for weather prediction and short-range forecasting, yet their behaviour as forecasts extend toward seasonal and longer timescales remains less well explored and understood. At these horizons, forecast skill is shaped less by short-range error growth, while variations in background states or system parameters increasingly influence the evolving dynamics. Understanding whether and how different neural architectures cope with such changes is therefore central to assessing their suitability for emulation beyond medium-range weather prediction, where robustness plays an increasingly important role. In this work, we investigate how the inductive biases encoded in deep-learning architectures influence their ability to represent and evolve dynamics as forecasts move into windows nearing, and sometimes beyond, their training data.

We use the idealised climate model ISCA as a controlled testbed, enabling systematic variation of planetary parameters and initial conditions while retaining a fixed underlying set of governing equations. Emulators are trained on ensembles of trajectories sampled from a restricted parameter range and evaluated under progressively more challenging in-distribution/out-of-distribution (ID/OOD) settings. This framework allows us to disentangle errors arising from finite-horizon forecasting from those associated with longer-timescale dynamical shifts, providing insight into which architectural biases promote stability, physical consistency, and robustness as machine-learning models are pushed from shorter-term prediction toward longer-timescale emulation.

How to cite: Bohn Lassen, O., Camara Pereira, F., Driscoll, S., Schemm, S., and Thomson, S.: Inductive Biases for Robust Climate Emulation Across Forecast Timescales, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-16240, https://doi.org/10.5194/egusphere-egu26-16240, 2026.

EGU26-16454 | ECS | Orals | NP4.2

Emulating climate variability and extremes with a multivariate flow matching model trained on CESM2 and finetuned on ERA5 

Violette Launeau, Mathieu Vrac, Léo Lemordant, and Pierre Gentine

In the context of increasing interest in machine learning-based emulators to overcome the computational cost and limited scenario coverage of Earth System Models (ESMs), key challenges remain, such as capturing internal variability, handling non-stationarity, and realistically representing compound and extreme events, especially at high spatial and temporal resolution.

We present a probabilistic multivariate emulator of climate variables based on a flow matching model trained on the CESM2 Large Ensemble (Danabasoglu et al., 2020) under the SSP3-7.0 scenario. Our approach leverages the flow matching framework (Lipman et al., 2022) to reproduce the spatiotemporal variability of temperature and precipitation on a monthly timescale. The model is conditioned on greenhouse gas concentrations, and we evaluate its capability to generate physically consistent climate fields and to capture the full ensemble spread of the original ESM, including tail behavior and potential extreme events. To better reproduce observed climatological variability, the flow matching model is fine-tuned on ERA5 reanalyses (Hersbach et al., 2020). This should enable the emulator to act as a stochastic weather generator of plausible climate states under GHG forcing trajectories, accounting for the non-stationarity introduced by anthropogenic climate change, and allowing for the assessment of rare or compound extreme events within the generated ensemble. Our results assess the model's ability to reproduce ensemble-scale statistics, cross-variable dependencies, and evolving climate distributions across time.
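
As a pointer to the training objective, a minimal conditional flow matching loss with a straight interpolation path is sketched below; the network signature `v_net(x_t, t, cond)` is an assumption.

```python
import torch

def flow_matching_loss(v_net, x0, x1, cond):
    """Regress the velocity field onto the straight-path target x1 - x0
    at a random time t along the interpolation (Lipman et al., 2022)."""
    t = torch.rand(x0.size(0), 1)
    x_t = (1 - t) * x0 + t * x1      # point on the probability path
    target = x1 - x0                 # constant velocity of that path
    return torch.mean((v_net(x_t, t, cond) - target) ** 2)
```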

How to cite: Launeau, V., Vrac, M., Lemordant, L., and Gentine, P.: Emulating climate variability and extremes with a multivariate flow matching model trained on CESM2 and finetuned on ERA5, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-16454, https://doi.org/10.5194/egusphere-egu26-16454, 2026.

EGU26-18204 | ECS | Orals | NP4.2

Coupling ICON with a Machine Learning Emulator for Cloud Microphysics 

Paul Keil, Caroline Arnold, and Shivani Sharma

As the spatial resolution of general circulation models (GCMs) increases and storms and clouds can be resolved, the underlying cloud microphysics still need to be parameterised. This is known to be a major source of uncertainty in climate and weather simulations. The established parameterisations use bulk moment schemes, where the conversion of cloud and rain droplets is approximated through empirical relationships. Particle-based superdroplet simulations would provide a more accurate representation but are typically not feasible for use in GCMs.

We couple SuperdropNet, an ML emulator for warm rain cloud microphysics trained on superdroplet simulations, to ICON. Previously, we validated the coupled model in an idealised cloud microphysics test case and showed that SuperdropNet runs stably and provides reasonable precipitation patterns.

Now we move towards a climate-model experiment at 10 km horizontal resolution in an AMIP setup to investigate SuperdropNet's feasibility and interaction with ICON in a realistic setting. Coupling SuperdropNet to ICON is achieved using FTorch. We are able to run ICON on 128 nodes of the CPU partition of the HPC system Levante with minimal overhead. Conditions beyond SuperdropNet's training data range lead to negative feedback loops and impact the long-term stability of the coupled simulation; we therefore implement physics-based constraints that improve stability. Initial results show that mean surface precipitation is very similar to the bulk-scheme approach. SuperdropNet simulates a faster cloud-to-rain transition, which impacts cloud water mass and rain droplet size, with consequences for the radiation budget and the frequency distribution of precipitation. Furthermore, we show that an autoregressive rollout of SuperdropNet that allows for longer GCM time steps runs stably and does not impact results. Finally, we test SuperdropNet's generalisation capabilities in a 4 K warmer world.

How to cite: Keil, P., Arnold, C., and Sharma, S.: Coupling ICON with a Machine Learning Emulator for Cloud Microphysics, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-18204, https://doi.org/10.5194/egusphere-egu26-18204, 2026.

EGU26-18659 | ECS | Orals | NP4.2

Hierarchical Graph Networks for Forecasting Terrestrial Water Storage Anomalies 

Viola Steidl and Xiao Xiang Zhu

The availability of fresh water is vital to ecosystems and communities. In a changing climate, the increased risk of droughts makes an accurate understanding of changes in terrestrial water storage (TWS) crucial. Predicting changes in TWS is inherently difficult, since TWS integrates the changes of all water compartments, whose underlying processes operate on vastly different temporal and spatial scales.

Forecasting tasks are nowadays often solved using machine-learning models. However, these models require vast amounts of data. In contrast, terrestrial water storage anomalies (TWSA) derived from GRACE/GRACE-FO observations date back only to 2002 and are available on a 1°x1° grid at monthly resolution. Nevertheless, Li et al. (2024) showed that machine-learning approaches can forecast TWSA tendencies up to one year ahead, cleverly exploiting temporal lag relationships between TWSA and ocean, atmospheric, or land variables.

In our work, we explore a novel hierarchical graph design that uses domain knowledge of hydrological basins to encode these processes into a latent feature sequence with an encoder-processor-decoder style graph neural network. A subsequent recurrent neural network then forecasts TWSA up to six months ahead from the latent feature sequence and a 12-month history of TWSA. The resulting gridded seasonal forecast of global TWSA improves over a seasonal long-term mean.
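
A single message-passing step of an encoder-processor-decoder graph network might look like the sketch below; node dimensions, the update rule, and the basin hierarchy are assumptions, not the authors' architecture.

```python
import torch

class GraphStep(torch.nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.msg = torch.nn.Linear(2 * dim, dim)
        self.upd = torch.nn.GRUCell(dim, dim)

    def forward(self, h, edges):
        src, dst = edges            # (n_edges,) node-index tensors
        m = torch.relu(self.msg(torch.cat([h[src], h[dst]], dim=-1)))
        agg = torch.zeros_like(h).index_add_(0, dst, m)  # sum messages
        return self.upd(agg, h)     # GRU-style node update
```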

Li, F., Kusche, J., Sneeuw, N., Siebert, S., Gerdener, H., Wang, Z., Chao, N., Chen, G., and Tian, K.: Forecasting Next Year’s Global Land Water Storage Using GRACE Data, Geophys. Res. Lett., 51, e2024GL109101, https://doi.org/10.1029/2024GL109101, 2024.

How to cite: Steidl, V. and Zhu, X. X.: Hierarchical Graph Networks for Forecasting Terrestrial Water Storage Anomalies, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-18659, https://doi.org/10.5194/egusphere-egu26-18659, 2026.

EGU26-19872 | Posters on site | NP4.2

Extending foundation models from weather to climate: challenges and promises 

Fanny Lehmann, Riccardo Neumarker, Gabriele Scorrano, Yun Cheng, Salman Mohebi, Firat Ozdemir, Junyang Gou, Oliver Fuhrer, Torsten Hoefler, Siddhartha Mishra, Mathieu Salzmann, Sebastian Schemm, and Benedikt Soja

AI weather models and weather-based foundation models have demonstrated impressive skill in short- to medium-range forecasts. While most weather models become unstable on longer time scales, a wide variety of AI climate emulators have been proposed, raising questions about the fundamental differences between these approaches.

In this work, we compare state-of-the-art models when producing rollouts on annual time scales. We quantify and characterize different types of instability: smoothing, visual artifacts, drift, and loss of seasonality. This analysis highlights the previously unreported stability of the Aurora foundation model and the Earth System Foundation Model (ESFM) for rollouts longer than 35 years.

To encompass more diverse representations of possible states of the Earth, ESFM is pretrained on a variety of CMIP6 datasets from the historical period, in addition to the ERA5 reanalysis commonly used in AI models. ESFM also includes climate forcings for physically driven long rollouts. We demonstrate the benefits of CMIP6 pretraining when finetuning on new CMIP6 datasets, including datasets with higher resolution, unseen physical processes, and climate change scenarios.

Overall, this work opens perspectives to adapt large-scale pretrained foundation models to the specific challenges of climate projections.

How to cite: Lehmann, F., Neumarker, R., Scorrano, G., Cheng, Y., Mohebi, S., Ozdemir, F., Gou, J., Fuhrer, O., Hoefler, T., Mishra, S., Salzmann, M., Schemm, S., and Soja, B.: Extending foundation models from weather to climate: challenges and promises, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-19872, https://doi.org/10.5194/egusphere-egu26-19872, 2026.

EGU26-20131 | Orals | NP4.2

Solving the lack of data issue for machine learning for rare climate events 

Amaury Lancelin, Freddy Bouchet, Alexander Wikner, Pedram Hassanzadeh, Laurent Dubus, and Peter Werner

Machine learning is reshaping the entire climate-modelling chain, from climate model development to the study of climate extreme events and their impacts. One of the key drivers of this revolution is the availability of datasets that are sufficiently large for training and validation. For climate extreme events, however, this requirement poses seemingly insurmountable challenges: we need to assess the impacts of unprecedented events for which historical data are too scarce; we must rely on models, yet simulating extremely rare events with them is prohibitively expensive; and any statistical approach, including machine learning, suffers from a severe lack-of-data problem.

Here, we argue that the only viable path forward is to integrate machine learning directly into the data-generation process, in close interaction with state-of-the-art physics-based climate models and observational datasets.

The first building block of our approach is the development of state-of-the-art climate model emulators. AI models trained on historical reanalyses to emulate the dynamics of the global atmosphere have demonstrated both high forecast skill and drastically reduced computational costs. Some of these AI emulators can generate stable trajectories spanning multiple decades, which, combined with their affordability, has the potential to significantly reduce uncertainties related to extreme weather. However, it remains impossible to directly validate whether AI emulators can reliably estimate the risk of extreme events with return times exceeding the historical record. To address this issue, we develop a methodology based on state-of-the-art architectures, with the explicit requirement that emulators exhibit extremely long-term stability, high fidelity, and a faithful reproduction of the stationary statistics of the climate model.

In a first-of-its-kind experiment, we simulate 100,000 years of a stationary climate using PlaSim, a coarse-resolution general circulation model. We then train a set of stable AI emulators using only 100 years of data, and compare the return times of extreme heat waves over Western Europe and the Pacific Northwest, as well as severe precipitation events over the Tropics.
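
Return times in such an experiment are often read off empirically from ranked block maxima; the sketch below shows the standard plotting-position estimator, not the authors' exact analysis.

```python
import numpy as np

def empirical_return_times(annual_maxima):
    """Return each observed annual maximum with its empirical return
    time (N + 1) / rank, rank 1 being the largest value."""
    x = np.sort(np.asarray(annual_maxima, dtype=float))[::-1]
    ranks = np.arange(1, x.size + 1)
    return x, (x.size + 1) / ranks
```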

The second building block of our approach consists of rare-event simulation techniques that reduce by several orders of magnitude the computational cost of sampling extremely rare events with CMIP-class climate models. The third building block is the blending of historical observations with CMIP model output within a Bayesian framework to estimate the probability of extremely rare events constrained by observations. In this talk, we also briefly discuss the second and third building blocks and their connections to the first within a comprehensive, integrated framework.

How to cite: Lancelin, A., Bouchet, F., Wikner, A., Hassanzadeh, P., Dubus, L., and Werner, P.: Solving the lack of data issue for machine learning for rare climate events, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-20131, https://doi.org/10.5194/egusphere-egu26-20131, 2026.

EGU26-21143 | Posters on site | NP4.2

Weather and Climate: Applications of Machine Learning and Artificial Intelligence 

Simon Driscoll, Kieran Hunt, Laura Mansfield, Ranjini Swaminathan, Hong Wei, Eviatar Bach, and Alison Peard

We demonstrate software and tools for users to progress from machine-learning theory and probabilistic methods through to the construction of AI models across environmental science. We span basic AI methods through to modern generative methods and physics-informed techniques, and include a wide array of concrete applications such as river discharge modelling, ocean-wave emulation, environmental monitoring, AI forecasting, and more. Throughout, we place emphasis on how and when these methods should be used, as well as on their limitations. This allows users to develop a non-naive understanding of AI and to engage with all themes of Machine Learning Across Earth System Modeling: Subgrid-Scale Parameterizations, Emulation and Hybrid Modeling.

How to cite: Driscoll, S., Hunt, K., Mansfield, L., Swaminathan, R., Wei, H., Bach, E., and Peard, A.: Weather and Climate: Applications of Machine Learning and Artificial Intelligence, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-21143, https://doi.org/10.5194/egusphere-egu26-21143, 2026.

EGU26-21279 | ECS | Posters on site | NP4.2

Geometry- and Physics-Aware Dataset Creation for Shadow Removal in High-Resolution Satellite Imagery 

Lorenzo Beltrame, Jules Salzinger, Phillipp Fanta-Jende, Jasmin Lampert, Pascal Leon Thiele, Filip Svoboda, Radu Timofte, and Marco Körner

Shadows cast by terrain and tall structures are a persistent limitation in satellite imagery, since they degrade radiometric consistency and compromise downstream tasks such as classification, detection, and 3D reconstruction. In this context, machine learning methods for shadow removal provide a flexible and easy-to-deploy tool to assist satellite remote sensing tasks.  Nevertheless, one prominent issue for its development in Earth Observation (EO) is the scarcity of publicly available, geometry-consistent paired shadowed/shadow-free satellite data. Most EO resources support shadow detection or 3D modelling but not shadow removal, while existing shadow-removal datasets largely target ground-level or UAV imagery and do not reflect multi-date, multi-angle satellite acquisition.

To address this gap, we present deSEO, a physics-informed, geometry-aware methodology that converts any suitably multi-temporal satellite dataset into paired training data for weakly supervised satellite shadow removal. We exemplify the procedure on the multi-temporal, multi-geometry S-EO dataset (WorldView-3 imagery with DSM priors, simulated shadow masks, and RPC camera models): deSEO selects a minimally shadowed acquisition per tile as a proxy reference and pairs it with more shadowed dates under explicit temporal and geometric constraints. Residual off-nadir parallax is mitigated through orientation normalisation and feature-based registration (LoFTR + RANSAC), yielding a per-pixel validity mask that can be used to restrict model supervision to reliably aligned regions.
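
The registration step could be reproduced along these lines with kornia's LoFTR matcher and OpenCV's RANSAC affine estimator; the random tensors stand in for real tile pairs, and the exact preprocessing is an assumption.

```python
import cv2
import kornia.feature as KF
import torch

matcher = KF.LoFTR(pretrained="outdoor")
# Grayscale tiles in [0, 1]; placeholders for actual S-EO tile pairs.
img0 = torch.rand(1, 1, 480, 480)
img1 = torch.rand(1, 1, 480, 480)
with torch.no_grad():
    out = matcher({"image0": img0, "image1": img1})
pts0 = out["keypoints0"].cpu().numpy()
pts1 = out["keypoints1"].cpu().numpy()
# Robust affine fit; `inliers` flags correspondences consistent with it.
warp, inliers = cv2.estimateAffinePartial2D(pts0, pts1, method=cv2.RANSAC)
```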

To validate the usability of the shadow-removal dataset derived from S-EO, we first adapted UAV-oriented methods such as SRNet and pix2pix. However, these approaches failed to converge to a stable training regime under the viewpoint variability typical of satellite acquisitions. We therefore developed a more robust method and training strategy that mitigates this common failure mode of image-to-image translation on multi-date, multi-geometry satellite imagery. Our approach trains a DSM-conditioned conditional GAN with a U-Net-based generator. The model incorporates perceptual reconstruction and mask-constrained adversarial objectives, with a soft shadow-mask attention prior that emphasises shadow-transition regions. These enhancements overcome the limitations of the classical GAN image-translation setup that worked well for UAV data. We evaluate the model on a held-out test split, where the proposed approach achieves a PSNR of 18 ± 1 dB, SSIM of 0.49 ± 0.08, and LPIPS of 0.46 ± 0.05. Notably, improvements were most pronounced at cast-shadow boundaries, and ablation studies revealed that DSM conditioning, which is absent from the SRNet model, was the dominant contributing factor.

Overall, deSEO provides a reproducible approach to derive paired supervision for satellite shadow removal and establishes a geometry-aware baseline for robust deshadowing under realistic EO acquisition variability.

How to cite: Beltrame, L., Salzinger, J., Fanta-Jende, P., Lampert, J., Thiele, P. L., Svoboda, F., Timofte, R., and Körner, M.: Geometry- and Physics-Aware Dataset Creation for Shadow Removal in High-Resolution Satellite Imagery, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-21279, https://doi.org/10.5194/egusphere-egu26-21279, 2026.

EGU26-21434 | ECS | Orals | NP4.2

Quantum-inspired machine learning for efficient and reliable weather forecasting  

Osama Ahmed, Sallar Ali Qazi, and Luca Magri

Recent advances in data-driven weather forecasting have demonstrated skill at medium-range lead times, yet often rely on extremely large models, massive training datasets, and substantial computational resources. In this talk, we present a novel quantum-inspired machine learning (QIML) approach for sub-seasonal weather forecasting that prioritizes computational efficiency and dynamical stability, while retaining competitive predictive skill.

First, using quantum circuit ansätze and entanglement, we design scalable quantum reservoir computing models. The implemented model is parallelizable across multiple GPUs and runs on classical hardware in a quantum-inspired setting. Second, we train our model on ERA5 reanalysis data for 2 m temperature, multiple pressure levels, and precipitation on a global grid. Using an encoder-decoder architecture in conjunction with the proposed QIML model, we demonstrate forecasts of key atmospheric variables up to 45 days ahead. Third, we benchmark our model against state-of-the-art AI weather forecasting methods and show that the QIML model can produce reliable forecasts for weather and climate extremes while requiring 10-50x less compute. Fourth, replacing conventional neural architectures with quantum-inspired circuit dynamics enables enhanced physical interpretability and consistency, as the model state evolves according to Schrödinger-type dynamics. We further analyze the learned latent representations using operator-theoretic and spectral tools, revealing coherent structures associated with dominant atmospheric modes.

This work proposes a new direction for the growing ecosystem of hybrid ML-physics approaches by offering a class of lightweight, stable, and scalable forecasting models that can be deployed efficiently in localized and resource-constrained settings.

How to cite: Ahmed, O., Qazi, S. A., and Magri, L.: Quantum-inspired machine learning for efficient and reliable weather forecasting , EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-21434, https://doi.org/10.5194/egusphere-egu26-21434, 2026.

EGU26-22145 | ECS | Orals | NP4.2

Seamless Storm Surge Prediction Using a Surrogate Hydrodynamic Model Based on Long Short-Term Memory Networks 

Villy Mik-Meyer, Francisco C. Pereira, Morten Andreas Dahl Larsen, Jian Su, and Martin Drews

Accurate storm surge prediction is essential for reducing the risks associated with extreme sea levels and for supporting early warning and preventive measures. Physically based numerical models continue to improve in skill and resolution, but their high computational cost limits their use in large ensembles and long-term scenario analyses. Recent advances in machine learning offer a complementary pathway for efficient storm surge forecasting. Here, a machine-learning framework is developed, calibrated, and validated to predict extreme sea levels in the North Sea and Baltic Sea. The model is based on 58 years of spatially distributed wind data and uses a Long Short-Term Memory (LSTM) architecture to capture the temporal dynamics driving water level variability. Compared to traditional physically based hydrodynamic models, the machine-learning approach requires only a fraction of the computational resources, enabling rapid probabilistic and large-ensemble forecasts across large domains and extended time periods. This efficiency is particularly valuable for climate change research, where large ensembles are generally needed to address the combined uncertainty of climate and hydrodynamic models but remain computationally prohibitive using conventional approaches. By providing a scalable and resource-efficient alternative, this framework enables consistent storm surge prediction across timescales ranging from short-term forecasting to long-term climate projections over decades.
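
In its simplest form, the emulator maps a sequence of wind-field features to a water level; the sketch below is a generic PyTorch LSTM of that shape, with illustrative dimensions rather than the study's configuration.

```python
import torch

class SurgeLSTM(torch.nn.Module):
    def __init__(self, n_features, hidden=128):
        super().__init__()
        self.lstm = torch.nn.LSTM(n_features, hidden, batch_first=True)
        self.head = torch.nn.Linear(hidden, 1)

    def forward(self, wind_seq):        # (batch, time, n_features)
        h, _ = self.lstm(wind_seq)
        return self.head(h[:, -1])      # water level at the final step
```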

How to cite: Mik-Meyer, V., Pereira, F. C., Larsen, M. A. D., Su, J., and Drews, M.: Seamless Storm Surge Prediction Using a Surrogate Hydrodynamic Model Based on Long Short-Term Memory Networks, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-22145, https://doi.org/10.5194/egusphere-egu26-22145, 2026.

EGU26-22628 | ECS | Posters on site | NP4.2

A Comparative Evaluation of Grid-Invariant Deep Learning Surrogate Models for Wildfire Simulation 

Matheu Boucher, Jidan Zhang, Christopher Pain, Yueyan Li, Aniket Joshi, Ben Moseley, and Philip Cunningham

As climate change drives more extreme wildfire behavior, accurate and computationally efficient fire spread modeling is increasingly critical for monitoring, mitigation, and risk assessment. Wildfires pose a particularly challenging modeling problem due to their complex interactions with fuels, terrain, and atmospheric conditions, as well as their potential to impact populated regions with severe environmental, economic, and human consequences. These challenges motivate the development of surrogate modeling approaches capable of emulating physics-based wildfire simulations at substantially reduced computational cost. In this work, we present a systematic comparison of two deep learning surrogate model architectures for spatiotemporal wildfire emulation: a convolutional neural network-based generative model and a conditional diffusion model. Both approaches are designed to be grid-invariant and trained to predict three key wildfire variables – time of arrival, flame length, and burn scar – at fixed 15-minute time steps. Model performance is evaluated using an autoregressive rollout procedure in which successive short-term predictions are recursively fed back as inputs to simulate wildfire evolution over 12-hour time horizons. The training data consists of wildfire simulations generated using a Rothermel-based fire spread model with realistic, satellite-derived fuel distributions over the western United States (California and Nevada). Evaluation is performed on geographically distinct fire scenarios to assess generalization across diverse fuel configurations. Both surrogate models are shown to produce stable and physically plausible wildfire dynamics over 12-hour autoregressive rollouts while reducing inference time relative to physics-based solvers. The results highlight the potential of deep generative surrogates to enable rapid ensemble-based risk assessment and support operational fire management workflows under diverse environmental conditions.
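
The rollout procedure reduces to repeatedly feeding predictions back as inputs; a generic sketch (the model interface is an assumption) follows.

```python
def autoregressive_rollout(model, state0, n_steps=48):
    """Emulate 12 hours of fire spread as 48 recursive 15-minute steps.
    `model` maps the current state (time of arrival, flame length,
    burn scar, plus static fuel/terrain inputs) to the next state."""
    states = [state0]
    for _ in range(n_steps):
        states.append(model(states[-1]))
    return states
```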

How to cite: Boucher, M., Zhang, J., Pain, C., Li, Y., Joshi, A., Moseley, B., and Cunningham, P.: A Comparative Evaluation of Grid-Invariant Deep Learning Surrogate Models for Wildfire Simulation, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-22628, https://doi.org/10.5194/egusphere-egu26-22628, 2026.

Generative super-resolution (SR) reconstruction models are widely applied in digital rock research to balance the trade-off between image resolution and the scanning device's field of view. Existing methods often enhance visual details or structural fidelity separately, but fail to balance these goals effectively, which frequently leads to artifacts that distort porosity and permeability measurements. This paper proposes the Stationary and Discrete Wavelet-Enhanced Generative Adversarial Network (SDWGAN), a hybrid SR approach that integrates two wavelet decomposition methods to address this challenge. By integrating multi-scale frequency constraints from wavelet decomposition with adversarial training focused on high-frequency components, our method effectively distinguishes rock boundary details from imaging artifacts. The proposed model adopts a global-local feature integration architecture to preserve fine-grained textures and macroscopic structures. Experimental results on the DeepRock-SR dataset (carbonate, sandstone, coal) demonstrate SDWGAN's enhancements: 0.63–2.12 dB PSNR and 0.01–0.11 SSIM improvements in fidelity, alongside 0.001–0.005 LPIPS and 0.62 NIQE gains in perceptual quality over RGB-domain loss-based models. Simulated seepage results indicate that SDWGAN estimates porosity and permeability with 98% similarity to the reference images. In conclusion, the proposed model manages the perception-distortion trade-off via frequency-domain optimization, ensuring petrophysical consistency between SR results and benchmarks, and offers a novel and reliable method for reservoir characterization in petroleum geology.
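
The frequency-domain constraint can be pictured as a loss over the detail subbands of a wavelet transform; the sketch below uses PyWavelets with placeholder arrays and is not the authors' exact objective.

```python
import numpy as np
import pywt

def highfreq_l1(a, b, wavelet="haar"):
    """L1 distance over the (H, V, D) detail subbands of a one-level
    discrete wavelet transform; penalises high-frequency mismatch."""
    _, (ah, av, ad) = pywt.dwt2(a, wavelet)
    _, (bh, bv, bd) = pywt.dwt2(b, wavelet)
    return sum(np.abs(x - y).mean()
               for x, y in ((ah, bh), (av, bv), (ad, bd)))

sr = np.random.rand(128, 128)   # placeholder super-resolved image
hr = np.random.rand(128, 128)   # placeholder reference image
loss_hf = highfreq_l1(sr, hr)
```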

How to cite: Ma, G. and Wang, Z.: Multiscale wavelet-adversarial learning eliminates imaging artifacts in digital rock analysis for reliable reservoir evaluation, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-576, https://doi.org/10.5194/egusphere-egu26-576, 2026.

EGU26-1394 | ECS | Orals | ERE5.7

Physics informed neural networks based on mixed pressure-velocity formulation for flow in heterogeneous aquifers 

Adhish Guli Virupaksha, Marwan Fahs, Thomas Nagel, Francois Lehmann, and Hussein Hotiet

Physics-Informed Neural Networks (PINNs) have emerged as a promising paradigm for solving problems governed by partial differential equations (PDEs) using the flexibility and generalization capability of deep learning. By embedding the governing physical laws directly into the training process, PINNs can approximate complex physical systems even when limited or no observational data are available. However, their performance and convergence can deteriorate significantly in domains characterized by high heterogeneity or discontinuities in material properties. In particular, standard PINN formulations tend to enforce implicit continuity in the hydraulic conductivity field, which can lead to inaccurate representations of physical processes in heterogeneous porous media.

This study introduces a novel and robust PINN framework for modelling transient fluid flow in heterogeneous porous media, with specific emphasis on accurately handling discontinuities in the hydraulic conductivity field. The proposed approach is based on a mixed formulation of the governing flow equations, in which the pressure and velocity fields are represented by independent neural networks. This structural separation eliminates the need to compute spatial derivatives of discontinuous or non-differentiable quantities during the evaluation of the loss function. As a result, the method achieves a more stable and accurate application of automatic differentiation while maintaining strong adherence to the underlying physical principles.
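
A minimal sketch of the mixed-formulation residuals, with separate networks for head and flux so that no derivative of the discontinuous conductivity is ever taken; symbols and signatures are illustrative, not the authors' code.

```python
import torch

def mixed_pinn_residuals(h_net, q_net, x, t, K, Ss):
    """Residuals of Darcy's law q = -K grad(h) and mass balance
    Ss dh/dt + div(q) = 0 for the physics loss (sketch)."""
    xt = torch.cat([x, t], dim=-1).requires_grad_(True)
    h, q = h_net(xt), q_net(xt)
    grads = torch.autograd.grad(h.sum(), xt, create_graph=True)[0]
    dh_dx, dh_dt = grads[:, :-1], grads[:, -1:]
    darcy = q + K(x) * dh_dx                 # enforces q = -K grad(h)
    div_q = sum(
        torch.autograd.grad(q[:, i].sum(), xt,
                            create_graph=True)[0][:, i:i + 1]
        for i in range(q.shape[-1]))
    mass = Ss * dh_dt + div_q
    return darcy, mass
```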

Furthermore, to address the high computational cost typically associated with training PINNs, a discrete-time mixed formulation is developed. By discretizing the temporal domain, this approach reduces the dimensionality of the problem, leading to substantial savings in both memory usage and training time. Despite these efficiency gains, the discrete-time PINN retains a high level of accuracy and fidelity in predicting transient flow dynamics in heterogeneous domains.

Comprehensive testing on various unconfined-aquifer scenarios demonstrates that the proposed implementation outperforms standard PINN approaches when applied to porous media with strong contrasts in hydraulic conductivity. The results obtained from the different PINN techniques are compared against results from the finite-element software COMSOL to analyze their performance.

Overall, the mixed-formulation PINN frameworks are computationally more efficient and more accurate than the standard PINN technique for simulating fluid flow in complex porous media, representing a significant step forward in the application of deep learning to subsurface modelling.

How to cite: Virupaksha, A. G., Fahs, M., Nagel, T., Lehmann, F., and Hotiet, H.: Physics informed neural networks based on mixed pressure-velocity formulation for flow in heterogeneous aquifers, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-1394, https://doi.org/10.5194/egusphere-egu26-1394, 2026.

EGU26-1788 | ECS | Orals | ERE5.7

Early Warning of Fault Reactivation through Passive Acoustic Emission in Samples Analogous to Carbon Storage Reservoir 

Rafael Mesquita, Nathaniel Forbes Inskip, Milad Naderloo, Auke Barnhoorn, Florian Doster, and Andreas Busch

Climate change drives urgent action in decarbonisation, and carbon capture and storage (CCS) has emerged as a crucial technology in mitigating greenhouse gas emissions. Large-scale subsurface CO2 injection carries the inherent risk of inducing fault reactivation and microseismic events, which could compromise the project. To optimise CCS projects while mitigating these geological hazards, passive acoustic emission (AE) monitoring offers a real-time method to detect initial fracture activity before failure.

In this study, triaxial compression experiments were conducted on reservoir-analogue sandstone plugs. Intact samples were axially loaded under an initial confining pressure (Pci) with continuous passive AE recording. A shear fracture was then induced in each sample, which was subsequently re-sheared under different confining pressure regimes (Pc) to mimic fault reactivation. Two porosity groups (~20% and 26%) were tested to evaluate deformation effects on AE response. Acoustic sensors at the sample ends captured the P-wave signals throughout each loading cycle, and the AE events were analysed in conjunction with the mechanical stress-strain data. From these mechanical data, failure envelopes were derived to assess the applicability of failure criteria. The results show that the Mohr-Coulomb criterion agrees well with all tests conducted and that fractured specimens may exhibit friction angles different from intact rock while retaining non-zero cohesion, which should not be neglected when modelling fractured reservoirs for CCS.
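
For reference, the Mohr-Coulomb criterion used for the failure envelopes takes the standard form

$$\tau = c + \sigma_n \tan\varphi,$$

where $\tau$ is the shear stress at failure, $c$ the cohesion, $\sigma_n$ the normal stress on the failure plane, and $\varphi$ the friction angle; the fitted $c$ and $\varphi$ are what distinguish the intact and fractured envelopes discussed above.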

The acoustic emission results reveal clear precursor patterns to fracture slip. For intact samples, axial loading triggered intense AE activity from the outset, reflecting micro-cracking and particle rearrangement. In contrast, samples with pre-existing fractures showed an initially low rate of emissions, increasing significantly just before the peak stress. Notably, higher-porosity samples generated roughly an order of magnitude more emissions than lower-porosity samples during both the initial fracturing and the reactivation phases, and consequently a much higher cumulative acoustic energy release.

Crucially, the cumulative AE record revealed a distinct acoustic precursor to failure. During re-shearing, the cumulative event count initially increased steadily, then underwent a sudden acceleration (an identifiable inflection point) shortly before the peak stress. This surge in event rate consistently occurred when the sample was still below its peak strength, signalling imminent failure. Such a signal could serve as an early warning. In a field injection scenario, detection of this acoustic inflection would allow operators to adjust injection rates or pressures before fault reactivation. Incorporating passive AE monitoring in this way could enhance CCS safety by optimising operations and preventing induced seismicity.

How to cite: Mesquita, R., Forbes Inskip, N., Naderloo, M., Barnhoorn, A., Doster, F., and Busch, A.: Early Warning of Fault Reactivation through Passive Acoustic Emission in Samples Analogous to Carbon Storage Reservoir, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-1788, https://doi.org/10.5194/egusphere-egu26-1788, 2026.

EGU26-2164 | Posters on site | ERE5.7

Research on the Generation Method of an Intrusion Detection Dataset for SCADA Systems in Geothermal Well Scale Inhibitor Injection 

Hongzhou Sun, Yong Zhang, Hanping Wan, Kai Wei, Qianfeng Shui, and Honghui Wang

In the production phase of geothermal resource development and utilization, the SCADA system for geothermal well scale inhibitor injection plays a critical role in scale prevention. A cyberattack on this SCADA system can result in production data anomalies and equipment damage, triggering a cascading failure: the inhibitor injection may be interrupted, leading to wellbore scaling and a reduction in thermal energy supply. As this impact propagates to the geothermal plant, it can reduce power generation, triggering voltage and frequency fluctuations in the grid that ultimately threaten power supply security. Deep learning-based network security protection technologies have become an effective means to address these threats; however, the lack of high-quality, scenario-specific datasets restricts their effectiveness. Therefore, this paper develops a method for generating a network intrusion detection dataset for the SCADA system of geothermal well scale inhibitor injection. First, a geothermal well SCADA network testbed that closely replicates the real process was constructed. On this testbed, multi-dimensional network attack experiments—covering scanning, denial-of-service (DoS), ARP spoofing, and man-in-the-middle (MitM) attacks—were systematically conducted to simulate threat scenarios with different origins, stealth levels, and scopes. Network traffic data under both normal and attacked conditions were then collected. The raw traffic was parsed and subjected to feature engineering, and data labeling was completed by aligning attack logs with traffic timestamps. Ultimately, we generated a dataset containing over 25 million training samples and 2.5 million test samples. Based on this dataset, we conducted benchmark training and evaluation on four mainstream deep learning models: DNN, CNN, LSTM, and Transformer. The experimental results demonstrate that the generated dataset exhibits good learnability and can effectively support the training of different deep learning models. This study not only addresses the scarcity of specialized datasets in this field but also provides a reliable experimental foundation and evaluation benchmark for subsequent cybersecurity research in geothermal energy systems.
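The timestamp-alignment labeling step described above can be sketched compactly. In the snippet below, the DataFrames and column names (timestamp, start, end, attack_type) are hypothetical placeholders for illustration only:

    import pandas as pd

    def label_flows(flows: pd.DataFrame, attack_log: pd.DataFrame) -> pd.DataFrame:
        """Label each traffic record by aligning its timestamp with the
        attack log; records outside all attack windows stay 'normal'."""
        flows = flows.copy()
        flows["label"] = "normal"
        for _, atk in attack_log.iterrows():  # one row per logged attack window
            in_window = flows["timestamp"].between(atk["start"], atk["end"])
            flows.loc[in_window, "label"] = atk["attack_type"]
        return flows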

How to cite: Sun, H., Zhang, Y., Wan, H., Wei, K., Shui, Q., and Wang, H.: Research on the Generation Method of an Intrusion Detection Dataset for SCADA Systems in Geothermal Well Scale Inhibitor Injection, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-2164, https://doi.org/10.5194/egusphere-egu26-2164, 2026.

EGU26-2780 | Posters on site | ERE5.7

Source and Evolution of Thermal Fluids in the Lushan Geothermal Field 

Sheng-Rong Song, Yi-Ching Wang, Ting-Jui Song, and Yi-Chia Lu

Geochemically, the oxygen isotope values range from −7.3‰ to −10.7‰, and hydrogen isotope values range from −72.6‰ to −57.2‰. Most data plot along the meteoric water line, indicating a dominant meteoric origin, while a small number of samples deviate slightly from this line, possibly reflecting fluid fractionation associated with boiling. An integrated three-dimensional geothermal geological model was constructed using: (1) surface DEM data, (2) regional geological maps and cross-sections, (3) borehole core descriptions and lithologic logs, (4) 3-D MT data, and (5) well temperature measurements. The Lishan Fault, located on the western margin of the Lushan geothermal area, is a highly active fault that has created a favorable fracture network, serving as a conduit for meteoric water infiltration and providing conditions favorable for geothermal reservoir development. Combined with previously acquired MT profiles across central Taiwan, the data reveal a low-resistivity zone extending upward from depth in the southwestern region along the Lishan Fault and spreading eastward into the Lushan geothermal area. This indicates that the primary heat/fluid source of the Lushan geothermal system is derived from deep circulation originating in the southwestern subsurface of the region.

Veins in the Lushan geothermal area are dominated by quartz veins, with minor occurrences of calcite veins. Based on field occurrences, the veins can be classified into three successive stages: (1) quartz veins parallel to slaty cleavage with homogenization temperatures between 220 and 300 °C, and salinities ranging from 5.7 to 9.1 wt.%, (2) quartz veins cutting across slaty cleavage with temperatures mainly between 220 and 290 °C, with salinities of 4.0–8.0 wt.%, and (3) euhedral to subhedral crystals infilling fractures and pores, yielding homogenization temperatures mostly between 220 and 300 °C, with salinities of 3.1–9.7 wt.%, whereas calcite-hosted fluid inclusions show lower homogenization temperatures of 150–210 °C and salinities of 1.0–5.7 wt.%. Comparison of fluid inclusion temperatures indicates that similarly high homogenization temperatures were attained during all three stages. No clear correlation is observed between temperature and salinity, and the salinity distributions are comparable among different stages. These features suggest the presence of a stable brine source constrained by synclinal structures in the region. The fluids are inferred to originate from a persistent deep heat source beneath Chunyang, where they were heated at different depths before ascending and precipitating mineral veins during successive tectonic episodes.

How to cite: Song, S.-R., Wang, Y.-C., Song, T.-J., and Lu, Y.-C.: Source and Evolution of Thermal Fluids in the Lushan Geothermal Field, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-2780, https://doi.org/10.5194/egusphere-egu26-2780, 2026.

EGU26-3529 | ECS | Orals | ERE5.7

Mapping the geothermal potential of Italy’s shallow subsoil: a streamlined MILS model approach for extensive datasets 

Gabriela Squarzoni, Francesca Colucci, and Martina Aiello

Shallow geothermal energy represents a significant opportunity to reduce energy waste in the heating and cooling sectors. Geothermal maps are a valuable tool for enhancing the exploitation of geothermal resources on a national scale. In this work, we produced maps of Italian shallow geothermal potential for different saturation and groundwater velocity scenarios. For this purpose, we developed a method for quickly computing geothermal potential, based on lithological data and a simplified application of the Moving Infinite Line Source (MILS) model for heat dispersion. Our approach follows the G.POT methodology proposed by Casasso and Sethi in 2016, but it also incorporates the contribution of groundwater flow, which was not considered in the original G.POT computation. This method enables the computation of geothermal potential from a large amount of input data, given the geological setting, the thermal and hydrogeological properties of the subsoil materials, the initial underground temperature, and the required thermal load. Using this approach, we estimated the geothermal potential at more than 28,000 sites for which stratigraphic data are available. We gathered the stratigraphic log of each site and computed the geothermal potential for every lithological layer encountered in each log. The derived values were averaged to obtain the mean potential of the shallow subsoil at a reference depth of approximately 70 m, and the final maps are the result of interpolating these point estimates. The different scenarios explore the variability of the geothermal field as it is intrinsically linked to the geological and hydrogeological context. From completely unsaturated to completely saturated conditions, the geothermal potential can increase by a factor of 4 to 10, depending on the groundwater flow velocity. The regions showing larger increments related to groundwater action are those characterized by sandy or gravelly subsoils, such as Emilia-Romagna, Piedmont, Lombardy, Friuli-Venezia Giulia, and Veneto; the high permeability of these sediments strongly influences their geothermal potential. On the other hand, areas where consolidated rock prevails are less susceptible to variation due to the presence of groundwater, as observed in parts of Sardinia, Sicily, and Apulia. Both the final maps and selected intermediate results have been published on open-access data platforms managed by Ricerca sul Sistema Energetico - RSE S.p.A., which also host a wide range of other energy-related information to support territorial energy planning.
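The per-site averaging step lends itself to a short illustration. The sketch below computes a thickness-weighted mean of layer-wise potentials from a stratigraphic log down to the ~70 m reference depth; the layer boundaries and potential values are invented stand-ins, and in the actual workflow each layer's potential would come from the MILS-based computation:

    def mean_shallow_potential(layers, ref_depth=70.0):
        """Thickness-weighted mean of per-layer geothermal potential down
        to ref_depth (m). `layers` holds (top, bottom, potential) tuples
        taken from one stratigraphic log."""
        total_thickness, weighted_sum = 0.0, 0.0
        for top, bottom, g_pot in layers:
            thickness = max(0.0, min(bottom, ref_depth) - top)
            if thickness > 0.0:
                weighted_sum += g_pot * thickness
                total_thickness += thickness
        return weighted_sum / total_thickness if total_thickness else float("nan")

    # Illustrative log: three lithological layers with made-up potentials
    print(mean_shallow_potential([(0, 12, 3.1), (12, 45, 5.4), (45, 80, 6.0)]))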

How to cite: Squarzoni, G., Colucci, F., and Aiello, M.: Mapping the geothermal potential of Italy’s shallow subsoil: a streamlined MILS model approach for extensive datasets, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-3529, https://doi.org/10.5194/egusphere-egu26-3529, 2026.

This study evaluates the field applicability of a multi-resolution Convolutional Long Short-Term Memory (ConvLSTM) framework for predicting time-series retaining wall deformations during staged excavation using field measurements from various excavation sites in South Korea. The proposed framework integrates three ConvLSTM models trained with different temporal input resolutions to capture deformation characteristics at multiple time scales. Their multi-step predictions are subsequently refined using a stacking ensemble strategy with a neural network–based meta-learner, which mitigates error accumulation commonly observed in recursive long-horizon forecasting and enhances overall prediction stability and accuracy.

To generate a comprehensive training database, numerical analyses were conducted on a wide range of excavation cross-sections with varying final excavation depths, wall tip restraint conditions, and initial groundwater levels, reflecting diverse geotechnical and structural configurations. The geotechnical and structural properties were defined probabilistically to account for inherent uncertainties in ground conditions and structural stiffness. In total, 4,000 numerical analysis cases were generated and further augmented to 16,000 training samples through Gaussian noise injection to improve model generalization.

For field validation, 34 time-series displacement measurements collected from 11 excavation sites in South Korea were employed to assess the predictive performance of the proposed framework under real construction conditions. When lateral displacement data obtained from earlier excavation stages were provided as inputs, the model predicted retaining wall deformation induced by an additional excavation depth of 5.0 m, achieving an average coefficient of determination (R²) of 0.85 and a mean absolute error (MAE) of 5 mm. Furthermore, the framework demonstrated an average inference time of 0.92 s, confirming its suitability for near–real-time prediction and potential integration with field monitoring systems. These results indicate that the proposed multi-resolution ensemble framework is practically applicable to real-world excavation projects and offers a robust tool for predictive decision-making in excavation safety management.
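To make the stacking step concrete, the sketch below fuses multi-step forecasts from three base models with a small neural-network meta-learner, standing in for the three ConvLSTM branches; the array shapes and random stand-in data are illustrative assumptions only:

    import numpy as np
    from sklearn.neural_network import MLPRegressor

    n_samples, horizon = 500, 10
    # Stand-ins for the coarse-, medium-, and fine-resolution forecasts
    base_preds = [np.random.rand(n_samples, horizon) for _ in range(3)]
    y_true = np.random.rand(n_samples, horizon)   # measured displacements

    X_meta = np.concatenate(base_preds, axis=1)   # (n_samples, 3 * horizon)
    meta = MLPRegressor(hidden_layer_sizes=(64,), max_iter=2000)
    meta.fit(X_meta, y_true)                      # train the meta-learner
    refined_forecast = meta.predict(X_meta)       # (n_samples, horizon)

Because the meta-learner sees all horizons from all resolutions at once, it can correct systematic drift that accumulates in any single recursive forecast.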

 

This work was supported by the National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIT) (No. 2023R1A2C1007635).

How to cite: Kim, J. and Youn, H.: AI-Driven Time-Series Prediction of Retaining Wall Deformation: A Case Study in Korea, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-3675, https://doi.org/10.5194/egusphere-egu26-3675, 2026.

Understanding and predicting complex environmental and hydrosystem processes is a central challenge in Earth system science. These systems are governed by interacting physical mechanisms across scales, are only partially observed, and are often characterized by limited data and substantial uncertainty. As a result, machine learning (ML) has emerged along two complementary development paths for environmental modeling.

In a first branch, physical process models remain the backbone of simulation, while ML is employed as a surrogate to approximate expensive numerical solvers. Surrogate modeling approaches based on Gaussian process emulators, polynomial chaos expansions, support vector regression, and related probabilistic representations are particularly well suited for data-poor settings. Neural networks are used more selectively in this context, as uncertainty-aware and sample-efficient methods are often preferred. In surrogate modeling, considerable effort is devoted to optimal sampling strategies, including active learning, which adaptively select informative simulations and help preserve scarce computational resources. These surrogate models enable efficient uncertainty quantification, sensitivity analysis, and Bayesian inference, while preserving physical interpretability.

A second, increasingly important branch emerges when physical models are incomplete, unavailable, or deliberately omitted, and ML models replace the governing equations altogether. This branch is most commonly based on neural network representations, but has recently also been explored using Gaussian processes and polynomial chaos–based learning concepts. In this setting, purely data-driven learning is insufficient, as unconstrained models tend to violate physical principles and extrapolate poorly. In this second branch, physical principles such as conservation laws or balance relations are embedded directly into learning architectures. Complementarily, constraint-driven learning strategies enforce physical laws, admissibility conditions, and structural consistency during training. By restricting the hypothesis space, these methods stabilize learning and support robust inference under incomplete physical knowledge.

Taken together, surrogate modeling for physics-based simulations and physics-aware ML for equation-free learning represent two coherent and complementary branches of modern environmental machine learning. We observe a growing convergence between these two branches, as physics-based surrogate modeling and equation-free machine learning increasingly borrow concepts from each other. This convergence is not accidental, but a direct response to fundamental model limitations and the challenge of making reliable predictions under scarce data and knowledge constraints. By integrating physics, probabilistic reasoning and constraints, emerging approaches increasingly focus on robustness and interpretability rather than unconstrained expressive power.

How to cite: Oladyshkin, S.: Physics-Aware Learning for Environmental Systems: Surrogate Modeling and Constraint-Driven Machine Intelligence, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-3804, https://doi.org/10.5194/egusphere-egu26-3804, 2026.

EGU26-4056 | ECS | Posters on site | ERE5.7

Information-Theoretic Bayesian Active Learning for Surrogate Training and Inverse Modeling in Subsurface Transport Applications 

Maria Fernanda Morales Oreamuno, Tim Brünnette, Stefania Scheurer, Sergey Oladyshkin, and Wolfgang Nowak

Running detailed, physics-based numerical simulations of subsurface transport is often computationally expensive. This becomes a challenge when calibrating models against observed data using methods that require a large number of model runs, such as Bayesian inference. To address this challenge, surrogate models are frequently used to approximate simulation outputs. Surrogates are trained using input-output pairs generated by the physics-based model. Traditional approaches typically rely on space-filling designs that uniformly cover the entire parameter space. However, for high-dimensional problems, this becomes impractical and tends to waste computational resources on parameter regions that are either physically irrelevant or contradict available measurement data.

To overcome these limitations, we utilize a Bayesian Active Learning (BAL) framework that iteratively selects training points most informative for Bayesian inference given available measurements. We employ Gaussian Processes and Bayesian-Polynomial Chaos Expansions as surrogates, which provide probability distributions for their predictions. Our approach takes advantage of these predictive distributions to evaluate candidate training points using information-theoretic criteria. To account for measurement uncertainty and prevent the algorithm from over-sampling local likelihood maxima, we investigate different strategies for representing observations within the selection process. These criteria are integrated into a multi-objective scoring function that balances global exploration (reducing surrogate uncertainty) with targeted exploitation (refining high-likelihood regions). Additionally, we demonstrate how observations from early time steps can iteratively guide the selection of training points to improve predictive accuracy for later, critical periods of the transport process.
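One plausible instantiation of such a scoring function is sketched below: each candidate is ranked by a weighted sum of an exploration term (predictive uncertainty) and an exploitation term (expected likelihood of the observation under measurement noise). The specific terms and weighting are illustrative, not the exact criteria used in the study:

    import numpy as np
    from scipy.stats import norm

    def bal_score(mu, sd, y_obs, sd_obs, w_explore=0.5):
        """Score candidates from a surrogate's predictive distribution
        (mu, sd): high predictive uncertainty promotes exploration,
        high expected data likelihood promotes exploitation."""
        total_sd = np.sqrt(sd**2 + sd_obs**2)   # model + measurement noise
        exploit = norm.logpdf(y_obs, loc=mu, scale=total_sd)
        explore = np.log(sd + 1e-12)
        return w_explore * explore + (1.0 - w_explore) * exploit

    # Pick the next expensive simulation at the best-scoring candidate:
    # next_idx = np.argmax(bal_score(mu_cand, sd_cand, y_obs, sd_obs))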

We test this method on analytical benchmarks and on subsurface transport models. The framework is evaluated in terms of convergence speed and posterior accuracy relative to existing active learning strategies and reference solutions derived from the full physics-based model. Overall, the proposed goal-oriented strategy aims to reduce the number of expensive model evaluations required for surrogate training, improving the efficiency of subsurface characterization, model calibration and predictive modeling.

How to cite: Morales Oreamuno, M. F., Brünnette, T., Scheurer, S., Oladyshkin, S., and Nowak, W.: Information-Theoretic Bayesian Active Learning for Surrogate Training and Inverse Modeling in Subsurface Transport Applications, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-4056, https://doi.org/10.5194/egusphere-egu26-4056, 2026.

EGU26-4411 | Posters on site | ERE5.7

From Pore to Core: Multi-Scale Evidence of Underground Hydrogen Storage Stability After Three Months of Hydrogen Exposure Under Reservoir Conditions 

Lin Ma, Heather Braid, Kevin Taylor, Edward Hough, and Chris Rochelle

Underground hydrogen storage (UHS) is a cornerstone technology for net-zero energy systems, offering terawatt-hour capacity to buffer renewable intermittency. Although many experiments on hydrogen flow in porous rocks have been reported, robust evidence for long-duration reactions and their impact on transport under combined high temperature and pressure remains limited, leaving a critical uncertainty around reservoir stability during seasonal storage.

Here we provide firm, multi-scale pre/post experimental constraints on two major onshore UK candidate aquifers—the Triassic Sherwood Sandstone Group and the Cretaceous Lower Greensand Group—after ~3 months of exposure to H₂ at simulated in-situ conditions deep underground, 50 °C and 150 bar. We integrate X-ray computed tomography (3D pore–grain architecture and bulk phase fractions), optical petrography (fabric/facies), SEM imaging (micro-textures and fines), and XRD (mineralogy) to resolve hydrogen impacts across scales. We also acquired dynamic synchrotron images of hydrogen flow in the porous rocks to investigate the impact of reaction on transport, and systematically investigated the pore networks, grain framework, mineralogy, porosity, and permeability. The results show that pore-network changes varied by <5%, consistent with measurement uncertainty. Only a single localised fines-migration feature (likely pyrite grain displacement) was detected, without associated dissolution/precipitation signatures. Quartz-dominated frameworks (>~65 wt%) appear inert under these conditions, while facies-scale heterogeneity governs pore connectivity and is expected to dominate injectivity and withdrawal behaviour. These results reduce a key uncertainty for UHS in silicate-rich sandstones, support prioritising connected macro-porous facies in site screening and well placement, and provide a transferable workflow for rapid hydrogen–rock interaction assessment and monitoring. Future work should extend to potentially more reactive lithologies, cyclic operation, longer exposure, and bio-active systems, in order to complete the risk evaluation for large-scale seasonal storage.

How to cite: Ma, L., Braid, H., Taylor, K., Hough, E., and Rochelle, C.: From Pore to Core: Multi-Scale Evidence of Underground Hydrogen Storage Stability After Three Months of Hydrogen Exposure Under Reservoir Conditions, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-4411, https://doi.org/10.5194/egusphere-egu26-4411, 2026.

EGU26-5372 | ECS | Orals | ERE5.7

Physics-based and data-driven machine learning modeling of saturated and unsaturated hydraulic conductivity of bentonite 

Muntasir Shehab, Reza Taherdangkoo, and Christoph Butscher

Accurate prediction of the hydraulic conductivity of compacted bentonite is critical for assessing the long-term safety of high-level radioactive waste repositories, where barrier efficiency depends on coupled processes. This study develops a data-driven machine learning model to predict saturated hydraulic conductivity and a physics-based machine learning model to predict unsaturated hydraulic conductivity of compacted bentonite. For the saturated hydraulic conductivity prediction, a dataset of 215 experimental measurements was compiled, incorporating key soil properties such as montmorillonite content, specific gravity, plasticity index, initial water content, dry density, and temperature as input. To predict unsaturated hydraulic conductivity, the study considers experimental data, synthetic data generated using the Van Genuchten model, and outputs from the machine learning model developed for saturated hydraulic conductivity. The input dataset includes specific gravity, montmorillonite content, initial dry density, initial water content, initial void ratio, plasticity index, and suction. The AdaBoost, CatBoost, and XGBoost algorithms were used to train the machine learning models, and the whale optimization algorithm was used for hyperparameter tuning. The trained machine learning models demonstrate good predictive performance for both saturated and unsaturated hydraulic conductivity of compacted bentonite, showing close agreement with experimental measurements.
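As a rough sketch of the saturated-conductivity model, the snippet below trains a gradient-boosted regressor on the listed inputs; plain random search stands in here for the whale optimization algorithm, and the file and column names are assumptions for illustration:

    import numpy as np
    import pandas as pd
    from sklearn.model_selection import RandomizedSearchCV
    from xgboost import XGBRegressor

    features = ["montmorillonite", "specific_gravity", "plasticity_index",
                "water_content", "dry_density", "temperature"]
    df = pd.read_csv("bentonite_ksat.csv")        # hypothetical data file
    X, y = df[features], np.log10(df["k_sat"])    # regress on log-conductivity

    # Random search over a small grid (a stand-in for whale optimization)
    search = RandomizedSearchCV(
        XGBRegressor(objective="reg:squarederror"),
        {"n_estimators": [200, 500, 800],
         "max_depth": [3, 5, 7],
         "learning_rate": [0.01, 0.05, 0.1]},
        n_iter=20, cv=5, scoring="neg_mean_absolute_error")
    search.fit(X, y)
    print(search.best_params_, search.best_score_)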

How to cite: Shehab, M., Taherdangkoo, R., and Butscher, C.: Physics-based and data-driven machine learning modeling of saturated and unsaturated hydraulic conductivity of bentonite, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-5372, https://doi.org/10.5194/egusphere-egu26-5372, 2026.

EGU26-6206 | ECS | Orals | ERE5.7

CO2 Saturation Prediction from Historical Time-Lapse Seismic Data Using Physics-Constrained VideoMAE 

Man Tang, Zhaoyun Zong, and Diqiong Jiang

Geological carbon storage is a key strategy for mitigating global CO2 emissions, and reliable monitoring of subsurface CO2 migration is critical for storage safety. Time-lapse seismic provides valuable insights into CO2 plume evolution. However, accurately predicting high-resolution CO2 saturation from seismic data remains a major challenge. In this study, we propose a novel physics-constrained deep learning framework that treats time-lapse seismic data as video sequences and leverages the Video Masked Autoencoder (VideoMAE) architecture to capture spatial and temporal dependencies. The approach consists of two stages: self-supervised pretraining on seismic data and supervised fine-tuning for CO2 saturation prediction. During pretraining, masked reconstruction enables the model to extract rich spatiotemporal feature representations from seismic videos. In fine-tuning, the pretrained model is adapted to predict future CO2 saturation from historical time-lapse seismic data without requiring seismic data from the target year. A physical constraint based on Fick’s law of diffusion is incorporated into the loss function to regularize the temporal evolution of CO2 saturation during fine-tuning. Results on the Kimberlina synthetic multiphysics dataset demonstrate that the physics-constrained VideoMAE framework consistently outperforms baseline models in both prediction accuracy and spatial consistency. These findings highlight the effectiveness of combining video-based self-supervised learning with physical constraints for time-lapse seismic monitoring and provide a promising physics-informed approach for CO2 storage surveillance.
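The Fick's-law regularizer can be sketched with finite differences on the predicted saturation sequence; the grid spacings, diffusion coefficient, and loss weighting below are illustrative placeholders rather than the study's tuned values:

    import torch

    def fick_residual_loss(sat, dt=1.0, dx=1.0, diff_coef=1.0):
        """Penalise deviation of a predicted saturation video S(t, y, x),
        shaped (T, H, W), from Fick's second law dS/dt = D * laplacian(S)."""
        ds_dt = (sat[1:] - sat[:-1]) / dt                       # (T-1, H, W)
        lap = (sat[:, 2:, 1:-1] + sat[:, :-2, 1:-1]
               + sat[:, 1:-1, 2:] + sat[:, 1:-1, :-2]
               - 4.0 * sat[:, 1:-1, 1:-1]) / dx**2              # (T, H-2, W-2)
        residual = ds_dt[:, 1:-1, 1:-1] - diff_coef * lap[:-1]  # interior points
        return (residual ** 2).mean()

    # total_loss = data_loss + lambda_phys * fick_residual_loss(pred_sat)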

How to cite: Tang, M., Zong, Z., and Jiang, D.: CO2 Saturation Prediction from Historical Time-Lapse Seismic Data Using Physics-Constrained VideoMAE, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-6206, https://doi.org/10.5194/egusphere-egu26-6206, 2026.

EGU26-6239 | Posters on site | ERE5.7

CO2 Exsolution and Nanobubble Evolution in Sandstone under Cyclic Depressurisation 

Andreas Busch, Amirsaman Rezaeyan, Gernot Rother, Zaid Jangda, Hannah P. Menke, and Kamaljit Singh

Understanding the nucleation, growth, and persistence of CO2 gas phases in water-saturated porous media is critical for predicting fluid transport, trapping efficiency, and integrity in geological CO2 storage systems. Gas exsolution under depressurisation remains poorly constrained at the nano- to micro-scale, where capillarity, confinement, and surface chemistry strongly influence phase behaviour. In this study, we investigate CO2 exsolution from water saturating a clay-rich sandstone using small-angle neutron scattering (SANS) under realistic reservoir conditions, providing direct, in situ insights into gas phase evolution within the pore space.

SANS experiments were conducted at the EQ-SANS instrument at Oak Ridge National Laboratory using a pressure cell allowing for exsolution testing at 50 °C under cyclic depressurisation from 12 MPa to 0.7 MPa. The pore fluid consisted of a contrast-matched H2O–D2O mixture (68:32 vol.%), yielding a stable scattering length density of 4.17 × 10¹⁰ cm⁻², similar to that of the matrix. The H2O:D2O mixture was saturated with CO2 at 12 MPa and room temperature (~22 °C) prior to controlled pressure reduction. Under these conditions, the scattering signal arises from exsolved CO2 nanobubbles. SANS profiles were obtained continuously during pressure decrease.

The scattering data reveal the emergence and evolution of nanoscale heterogeneities consistent with CO2 gas clusters and nanobubbles forming within pores between 5 and 200 nm. Although phase diagrams predict CO2 exsolution at about 8 MPa and 50 °C, this is only observed at ~2.4 MPa. Changes in scattering intensity and slope indicate pressure-dependent growth and coalescence processes, influenced by pore confinement and clay mineral surfaces. Notably, a progressive loss of scattering signatures associated with pores smaller than ~15 nm during pressure reduction suggests the preferential disappearance of CO2 nanobubbles in the smallest pores. This is potentially driven by Ostwald ripening, whereby gas diffuses from high-curvature, unstable nanobubbles toward larger, more stable gas clusters. Repeated pressure cycling highlights the partial reversibility of exsolution and the persistence of gas features, suggesting potential hysteresis effects relevant for cyclic injection and pressure management strategies.

These findings demonstrate the capability of SANS to resolve nanoscale CO2 exsolution processes in complex geomaterials and provide critical constraints for pore-scale and continuum models of multiphase flow and transport. The results have direct implications for assessing CO2 mobility, trapping mechanisms, and leakage risk in clay-rich storage formations and caprocks under dynamic pressure conditions.

How to cite: Busch, A., Rezaeyan, A., Rother, G., Jangda, Z., Menke, H. P., and Singh, K.: CO2 Exsolution and Nanobubble Evolution in Sandstone under Cyclic Depressurisation, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-6239, https://doi.org/10.5194/egusphere-egu26-6239, 2026.

EGU26-6683 | ECS | Orals | ERE5.7

Influence of texture on anion-accessible porosity fraction explored by µCT, SEM & N2 adsorption data  

Carmen Zwahlen, Thomas Gimmi, Andreas Jenni, and Raphael Wüst

The anion-accessible porosity fraction (fa) is an important parameter controlling solute transport in claystones. A safe disposal of nuclear waste in such rocks relies on a comprehensive understanding of transport in clays. In stratigraphic sequences, established Cl or Br profiles provide insights into the paleo-hydrogeological evolution. Anion concentrations in the accessible porewater can be calculated from measured bulk porewater concentrations and the anion-accessible porosity fraction fa. Since experimental data for fa are scarce, and data extrapolation within a heterogeneous stratigraphy is challenging due to the fa dependency on multiple parameters (e.g., pore/grain shapes and size distributions), detailed understanding of texture and its influence on macroscopic transport parameters is paramount.

In this study, imaging techniques (µCT and SEM) and other methods (e.g., N2 adsorption) were combined to characterise the texture and pore network of rock samples from the Opalinus Clay and its confining units. The techniques unravel pore characteristics at different scales: N2 adsorption from nanometres to a micrometre, SEM larger than 50 nm, and µCT larger than a few µm. Samples with different mineralogical compositions, lithologies, and experimentally determined fa for Cl (fCl) [1] were analysed.

Two sand/siltstone samples with different fCl but similar clay content show identical ratios of grains to porous clay regions, but different pore sizes in high-resolution SEM images. This can qualitatively explain the different fCl values for these samples. However, SEM cannot resolve small pores (<50 nm), and a structural model is additionally required to derive quantitative results.

The gained textural insights make clear that fCl does not necessarily correlate with the clay fraction. Moreover, extended correlations of fCl with quantified textural information allow a better prediction of fCl for formations where this parameter was not measured. The outcome of this study encourages further investigations for verifications such as transmission electron microscopy (TEM) imagery to explore the nanometric pore space within and around clay minerals.

 

[1] C. Zwahlen, T. Gimmi, A. Jenni, M. Kiczka, M. Mazurek, L. R. Van Loon, et al., "Chloride accessible porosity fractions across the Jurassic sedimentary rocks of northern Switzerland," Appl. Geochem., vol. 162, p. 105841, 2024. DOI: 10.1016/j.apgeochem.2023.105841

How to cite: Zwahlen, C., Gimmi, T., Jenni, A., and Wüst, R.: Influence of texture on anion-accessible porosity fraction explored by µCT, SEM & N2 adsorption data , EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-6683, https://doi.org/10.5194/egusphere-egu26-6683, 2026.

EGU26-6919 | Posters on site | ERE5.7

Experimental study on pore variation and meso-damage of saturated sandstone under unloading condition 

Chi Zhang, Jie Wang, Wenchao Chen, Jianxin Fu, and Weidong Song

The degradation of rock mass strength is macroscopically manifested as a reduction in cohesion and an increase in the internal friction angle. Microscopically, it manifests as the propagation of internal fractures, which is also the fundamental cause of rock mass damage and deterioration. The complex mesoscopic fracture structure within the rock mass directly influences its macroscopic mechanical properties and failure characteristics. To more accurately understand the mechanical behavior of rock masses under unloading conditions, it is essential to investigate the internal mesoscopic fracture structure of the rock and its impact on the overall mechanical properties.

To study the crack propagation and meso-damage evolution of saturated sandstone under unloading (unloading confining pressure), triaxial unloading confining pressure tests were designed and conducted on sandstone samples under different initial axial pressures (70%, 80%, and 90% of the triaxial compressive strength, TCS). The results indicate that samples with higher initial axial pressure exhibit larger axial strain and smaller radial strain at unloading failure. As the unloading confining pressure ratio increases, the elastic modulus gradually decreases, while Poisson's ratio and strain gradually increase.

Using ¹H Nuclear Magnetic Resonance (NMR) technology, the variations in rock porosity and T2 spectrum curves were analyzed. The T2 spectral peaks show that pore size increases with the unloading confining pressure ratio, and a higher initial axial pressure leads to more significant pore size growth. Porosity increases exponentially with the unloading confining pressure ratio. Within this trend, the number of micropores continuously increases, whereas the numbers of mesopores and macropores first decrease and then increase. The initial axial pressure promotes the development and expansion of pores.
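As a small illustration of that exponential trend, porosity can be fitted against the unloading confining pressure ratio as follows; the data points below are invented stand-ins, not the experimental measurements:

    import numpy as np
    from scipy.optimize import curve_fit

    R = np.array([0.0, 0.2, 0.4, 0.6, 0.8])        # unloading ratio (stand-in)
    phi = np.array([8.1, 8.4, 9.0, 10.1, 12.0])    # porosity, % (stand-in)

    # Exponential model phi(R) = a*exp(b*R) + c, fitted by least squares
    exp_model = lambda r, a, b, c: a * np.exp(b * r) + c
    (a, b, c), _ = curve_fit(exp_model, R, phi, p0=(1.0, 2.0, 7.0))
    print(f"phi(R) = {a:.2f}*exp({b:.2f}*R) + {c:.2f}")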

The fractal characteristics of the T2 spectrum were analyzed, and the relationship between the degree of damage and the unloading confining pressure ratio was established. The variation trends of rock pore characteristics, energy, and damage degree are generally consistent. Finally, based on damage mechanics theory, a damage constitutive model for rock under loading and unloading conditions was developed. The overall correspondence between the theoretical model predictions and the experimental curves is satisfactory.

How to cite: Zhang, C., Wang, J., Chen, W., Fu, J., and Song, W.: Experimental study on pore variation and meso-damage of saturated sandstone under unloading condition, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-6919, https://doi.org/10.5194/egusphere-egu26-6919, 2026.

EGU26-7708 | Orals | ERE5.7

Integrated Geophysical Characterization of Hot Dry Rock Resources in the Gonghe Basin, Qinghai, China 

Sheng Lian, Zhengpu Cheng, Qiang Wei, and Linyou Zhang

Hot dry rock (HDR) represents a promising form of clean and renewable geothermal energy, with substantial global potential to support the transition to low-carbon energy systems. Among the most prospective regions for HDR development in China is the Gonghe Basin in Qinghai Province. However, the basin's complex subsurface geological conditions present significant challenges for the accurate assessment of HDR resources.

This study proposes a multi-scale integrated geophysical framework for HDR characterization, combining gravity, magnetic, magnetotelluric (MT), ambient noise tomography, and time-frequency electromagnetic methods. Multi-source geophysical datasets were systematically processed, calibrated with available borehole data, and interpreted through inversion modeling to construct a three-dimensional geological-geophysical model of the study area.

The results reveal the spatial distribution, burial depth, and thermal-structural properties of HDR reservoirs, identifying a high-potential zone with reservoir temperatures exceeding 200 °C. The integrated approach effectively addresses the limitations of individual geophysical methods, significantly enhancing the accuracy of HDR reservoir identification and parameter estimation. This study demonstrates the feasibility and effectiveness of integrated geophysical techniques in HDR exploration, offering a robust technical basis for future development in the Gonghe Basin and similar geothermal environments worldwide.

How to cite: Lian, S., Cheng, Z., Wei, Q., and Zhang, L.: Integrated Geophysical Characterization of Hot Dry Rock Resources in the Gonghe Basin, Qinghai, China, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-7708, https://doi.org/10.5194/egusphere-egu26-7708, 2026.

EGU26-7749 | ECS | Orals | ERE5.7

Lithology Segmentation from Well Logs for Geothermal Exploration using Vision Foundation Models 

Ning Qian, Felix Jagert, and Monica Sester

Former oil and gas fields offer a repository of historical geophysical well logs that can help support geothermal exploration across large areas. Lithology classification from logging data is a fundamental task in subsurface geological interpretation. Existing deep learning approaches typically formulate this problem as a point-wise or sequence-wise classification task, where logging curves are treated as one-dimensional depth-dependent signals. Although such methods have demonstrated promising performance, they usually rely on large-scale labelled datasets for training. Moreover, logging datasets commonly exhibit severe class imbalance due to complex geological environments and strong heterogeneity, which further degrades the performance and robustness of data-hungry deep learning models.

To address these challenges, we reformulate lithology classification as a semantic segmentation task, in which lithological units are characterized as continuous intervals separated by distinct boundaries along the depth dimension. Based on this formulation, we develop a lithology segmentation framework that leverages large-scale vision foundation models, enabling effective learning under data-scarce and class-imbalanced conditions. Our core motivation is to transfer the strong image representation and generalization capabilities learned by large pretrained models on massive image data to the geological logging domain.

Specifically, well logging curves are transformed into two-dimensional pseudo-images by a structured multi-scale channel combination along the depth dimension. The repetition factor k controls how many times each logging curve is duplicated in the pseudo-image, enabling a Vision Transformer (ViT) with fixed-size patches to encode logging patterns at multiple effective scales. For each scale k, a composite representation X(k) ∈ R^(H×W_k) is formed by repeating selected logging curves with scale-dependent repetition factors, where H is the number of depth samples. Accordingly, the width of the pseudo-image at scale k is defined as W_k = k·N, where N is the number of logging curves. The final input representation X is obtained by concatenating all scale-specific representations: X = Concat(X(1), X(2), ..., X(K)).
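A minimal sketch of this construction, assuming the curves arrive as a depth-by-curve array (the shapes and uniform per-scale repetition are our simplifications, not the authors' exact recipe):

    import numpy as np

    def build_pseudo_image(curves, max_scale=3):
        """Turn N logging curves sampled at H depths into a 2-D
        pseudo-image: at scale k every curve is repeated k times
        (W_k = k*N), and all scales are concatenated along the width."""
        blocks = [np.repeat(curves, k, axis=1) for k in range(1, max_scale + 1)]
        return np.concatenate(blocks, axis=1)

    logs = np.random.rand(512, 4)             # stand-ins for e.g. GR, RHOB, NPHI, DT
    print(build_pseudo_image(logs).shape)     # (512, 24) for K = 3, N = 4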

Building upon the pretrained Segment Anything Model (SAM), we retain the image encoder to extract high-level visual features, while a task-specific decoder is initialized and trained from scratch for lithology segmentation. The encoder weights are initially frozen, then gradually unfrozen during training and fine-tuned jointly with the decoder to adapt the feature space to the geological patterns of the domain. This staged training strategy stabilizes the optimization process, reduces overfitting with limited data, and effectively transfers knowledge from natural images to well logging images. Furthermore, a weighted loss function at the segmentation level addresses class imbalance, ensuring that minority lithological classes contribute sufficiently to model updates.

Overall, the proposed framework demonstrates a new workflow for lithology interpretation by integrating foundation models with geological data analysis. It provides a data-efficient solution for lithology segmentation under realistic constraints of limited and imbalanced well logging datasets.

How to cite: Qian, N., Jagert, F., and Sester, M.: Lithology Segmentation from Well Logs for Geothermal Exploration using Vision Foundation Models, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-7749, https://doi.org/10.5194/egusphere-egu26-7749, 2026.

Deep carbonate gas reservoirs represent a crucial frontier in natural gas exploration. However, their strong heterogeneity and complex pore structures often lead to technical challenges and low recovery rates. The Dengying Formation in the Penglai gas area, Sichuan Basin, characterized by typical vuggy, fracture-vuggy, and porous reservoir types, serves as an ideal focus for addressing these issues. Nevertheless, conventional core flooding experiments lack in-situ visualization and real-time monitoring capabilities, making it difficult to characterize dynamic fluid migration at the microscopic scale. Therefore, establishing a new experimental methodology is urgently needed. Moreover, this experiment employed CO2 as the displacement gas: injecting CO2 into deep carbonate formations enables the underground storage of greenhouse gases, realizing carbon sequestration with substantial environmental benefits.

In this study, typical carbonate samples from the Dengying Formation were selected to conduct high-temperature and high-pressure (HTHP) physical simulation experiments of SCCO2 displacing methane (CH4) using X-ray Computed Tomography (X-CT). A complete experimental workflow covering "formation water saturation, gas charging, and SCCO2 displacement" was established, along with a quantitative parameter system. Through real-time online monitoring, fluid migration patterns and displacement characteristics were quantitatively analyzed based on CT images and CT number variations.

The results indicate that: (1) Fracture-vuggy reservoirs exhibit the best displacement performance under high pressure, with the sweep volume of SCCO2 expanding progressively over time. (2) In fracture-dominated reservoirs, SCCO2 tends to migrate along preferential "fracture-vug" pathways under high pressure, leading to gas channeling (fingering) and low sweep efficiency; optimizing the pressure differential (reducing displacement rate) can effectively mitigate channeling and improve matrix mobilization. (3) Vuggy reservoirs have a high mobilization threshold, requiring a higher pressure gradient and longer displacement duration, with the sweep zone expanding gradually. (4) Porous (tight-matrix) reservoirs show the poorest performance; due to narrow throats and poor connectivity, high seepage resistance prevents significant saturation changes or displacement fronts from being observed in CT images.

This study reveals the microscopic mechanisms of SCCO2 displacing gas under different carbonate pore structures and clarifies the control of heterogeneity on displacement efficiency, providing theoretical support for Enhanced Gas Recovery (EGR) and CO2 sequestration in deep carbonate reservoirs.

Keywords: Deep carbonate reservoir; X-CT scanning; SCCO2-EGR; Physical simulation; Dengying Formation

How to cite: Zhang, Z., Liu, J., and Liu, K.: Microscopic Mechanism of SCCO2 Displacing CH4 in Deep Carbonate Gas Reservoirs Based on X-CT Scanning: A Case Study of the Dengying Formation, Penglai Gas Area, Sichuan Basin, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-8605, https://doi.org/10.5194/egusphere-egu26-8605, 2026.

EGU26-11018 | ECS | Posters on site | ERE5.7

A Python Multi-Algorithm Optimization Framework for Automated Parameter Calibration in OpenGeoSys 

Reza Mahmoudi Kouhi, Reza Taherdangkoo, Thomas Nagel, and Christoph Butscher

Parameter calibration remains a critical bottleneck in coupled thermo–hydro–mechanical–chemical simulations, particularly when parameters are strongly coupled and non-unique solutions exist. In OpenGeoSys (OGS), calibration is frequently performed by manual trial-and-error, resulting in workflows that are subjective, difficult to reproduce, and unsuitable for systematic comparison of calibration strategies. These limitations become especially pronounced in multiphysics settings, where equifinality can mask parameter sensitivity and bias interpretation.

This study presents a non-intrusive, reusable Python framework for automated parameter calibration in OGS that treats the simulator as a black-box forward model. The framework controls the complete calibration workflow externally, including parameter sampling within defined bounds, automated execution of OGS simulations, extraction of user-defined parameters from output files, and quantitative misfit evaluation using different metrics. A total of twelve optimization algorithms are integrated, spanning local deterministic methods, surrogate optimization, population- and swarm-based approaches, and hybrid strategies. All algorithms are accessed through a unified configuration interface, enabling direct and fair benchmarking under the same evaluation metrics.
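A stripped-down sketch of such a black-box loop is given below, pairing SciPy's differential evolution (one of the population-based options) with a subprocess call to the ogs executable; the project file name and the two helper functions for writing parameters and reading the mass-flow series are hypothetical placeholders, not the framework's actual API:

    import subprocess
    import numpy as np
    from scipy.optimize import differential_evolution

    def ogs_misfit(theta, ref_series):
        """Write trial parameters into the OGS project file, run the
        simulation, extract the mass-flow series, and return an RMSE
        misfit. write_parameters/read_mass_flow are hypothetical helpers."""
        log10_k, youngs_modulus = theta
        write_parameters("borehole.prj", 10.0**log10_k, youngs_modulus)
        subprocess.run(["ogs", "borehole.prj", "-o", "out"], check=True)
        sim_series = np.asarray(read_mass_flow("out"))
        return float(np.sqrt(np.mean((sim_series - ref_series) ** 2)))

    # Bounds are illustrative: log10 permeability (m^2), Young's modulus (Pa)
    # bounds = [(-20.0, -14.0), (1e9, 5e10)]
    # result = differential_evolution(ogs_misfit, bounds, args=(ref_series,),
    #                                 maxiter=10, popsize=8, polish=False)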

The framework is evaluated using an axisymmetric hydro-mechanical borehole benchmark with prescribed pressure and stress histories. Intrinsic permeability and Young’s modulus are jointly calibrated against a reference mass-flow time series, with each optimization method limited to approximately 100 forward simulations. The results demonstrate that calibration performance is governed primarily by misfit reduction efficiency per simulation rather than algorithmic overhead. Population-based methods robustly identify favorable regions of the parameter space, local search methods exhibit rapid convergence near optimal solutions, and hybrid strategies consistently combine both strengths. The proposed framework provides a reproducible and objective basis for parameter calibration in OpenGeoSys, enabling the development of more reliable models for coupled multiphysics applications.

How to cite: Mahmoudi Kouhi, R., Taherdangkoo, R., Nagel, T., and Butscher, C.: A Python Multi-Algorithm Optimization Framework for Automated Parameter Calibration in OpenGeoSys, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-11018, https://doi.org/10.5194/egusphere-egu26-11018, 2026.

EGU26-11359 | ECS | Orals | ERE5.7

From Data-Driven to Physics-Informed Surrogate Models for Reactive Nitrate Transport 

Alireza Arab, Traugott Scheytt, Thomas Nagel, and Reza Taherdangkoo

Reactive nitrate transport in groundwater is governed by coupled advection–dispersion–reaction (ADR) dynamics and kinetically limited redox processes, including donor limitation and competition among electron acceptors. We compare two surrogate modeling approaches for reactive nitrate transport. The first is a physics-audited, data-driven approach based on a categorical boosting algorithm, with physical admissibility (e.g., non-negativity and ADR-consistent behavior) assessed via post-hoc diagnostics. The second is a physics-informed neural network (PINN) surrogate that embeds the ADR equation, boundary conditions, non-negativity, and a redox-ordering constraint directly into the training objective to promote mechanistic consistency. Both surrogates are trained and tested on the same one-dimensional PHREEQC benchmark suite spanning increasing hydrogeochemical complexity: linear denitrification, dual-linear nitrate–Fe(III) competition, dual-substrate Monod kinetics, and fully coupled dual-Monod redox systems. Predictive uncertainty is quantified to provide calibrated confidence bounds and identify regions of elevated sensitivity.
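To make the embedded-physics idea concrete, the sketch below evaluates the residual of a 1-D ADR equation with linear decay (the simplest of the benchmark cases) by automatic differentiation; the velocity, dispersion, and rate constants are illustrative, and boundary and non-negativity terms are omitted for brevity:

    import torch

    def adr_residual(model, x, t, v=1.0, D=0.1, k=0.05):
        """Residual of C_t + v*C_x - D*C_xx + k*C = 0 at collocation
        points (x, t), for a network C = model([x, t])."""
        x = x.clone().requires_grad_(True)
        t = t.clone().requires_grad_(True)
        c = model(torch.stack([x, t], dim=-1)).squeeze(-1)
        grad = lambda out, inp: torch.autograd.grad(out.sum(), inp,
                                                    create_graph=True)[0]
        c_t, c_x = grad(c, t), grad(c, x)
        c_xx = grad(c_x, x)
        return c_t + v * c_x - D * c_xx + k * c

    # physics_loss = (adr_residual(net, x_col, t_col) ** 2).mean()
    # total_loss = data_loss + w_phys * physics_loss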

Results show that while both surrogates can interpolate reactive nitrate dynamics within the training domain, the PINN surrogate consistently provides superior physical consistency and robustness under increasing kinetic nonlinearity. Uncertainty estimates from the PINN are well calibrated, with prediction-interval widths increasing systematically near migrating reactive fronts where nonlinear redox competition amplifies model sensitivity. The results demonstrate that embedding governing physics directly into the learning process yields a more reliable and interpretable surrogate for uncertainty-aware reactive transport modeling, particularly in regimes dominated by nonlinear kinetics and competing redox pathways.

How to cite: Arab, A., Scheytt, T., Nagel, T., and Taherdangkoo, R.: From Data-Driven to Physics-Informed Surrogate Models for Reactive Nitrate Transport, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-11359, https://doi.org/10.5194/egusphere-egu26-11359, 2026.

EGU26-11701 | ECS | Posters on site | ERE5.7

Utilising geodata for enhancing success rate of geothermal projects in Finland 

Tuija Luhta, Annu Martinkauppi, Aino Karjalainen, and Viveka Laakso

Shallow geothermal wells for heating individual houses have been utilised successfully in Finland for decades. Recently, deep geothermal wells (down to six kilometres) and medium deep wells (500–3,000 metres) have been piloted for district heating, regional-scale applications in urban areas, and industrial building heating, with varying degrees of success.

Bedrock in Finland consists mostly of Precambrian crystalline rocks. This ancient bedrock is cold and fractured, and several geothermal projects have been delayed, shortened or even cancelled due to challenges in drilling or insufficient heat production. In the Geoenergialoikka (Geoenergy Leap) project, the Geological Survey of Finland (GTK) has been developing a workflow that integrates geodata for the site selection of medium deep geothermal wells and, consequently, for estimating heat production, aiming to improve the success rate of geothermal projects.

The workflow has been developed while planning and implementing three medium deep geothermal wells (600–800 m) in Kotka, Oulu and Kokkola. Existing geodata have been utilised to determine the locations of the proposed geothermal wells and to assess drilling risks. The datasets include geodata available from GTK's Hakku service, e.g. geological and aerogeophysical maps, Lidar and lineament data, as well as site-specific geophysical survey data. The cost-effectiveness of different data analyses and survey methods has been evaluated, and best practices for utilising geodata in medium deep geothermal projects will be proposed.

Geoenergialoikka is co-funded by the European Union’s Just Transition Fund (JTF), the councils of Central Ostrobothnia, North Ostrobothnia, and Kymenlaakso, and the project partners: Geological Survey of Finland GTK (the coordinator), Centria University of Applied Sciences, Oulu University of Applied Sciences OAMK, University of Oulu, and South-Eastern Finland University of Applied Sciences XAMK. The project aims to speed up the comprehensive use of geothermal energy, strengthening national energy self-sufficiency and supply security, and impacting regional employment positively.

How to cite: Luhta, T., Martinkauppi, A., Karjalainen, A., and Laakso, V.: Utilising geodata for enhancing success rate of geothermal projects in Finland, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-11701, https://doi.org/10.5194/egusphere-egu26-11701, 2026.

EGU26-11715 | Orals | ERE5.7

EGRISE 2.0: the knowledge at the fingertips to power Research and Innovation in Geothermal Energy 

Eugenio Trumpy, Alessia Bardi, and Adele Manzella

The geothermal sector generates a vast amount of knowledge—from research project data to scientific publications, technical reports, patents, and open datasets—produced by scientists, operators, consultants, public authorities, and funding agencies. However, this wealth of information is often scattered across multiple repositories and platforms, which hampers effective access, integration, and utilization. EGRISE 2.0, developed within the EU-funded Geotherm-FORA project, addresses this challenge as the largest thematic repository for geothermal research and innovation in Europe. The platform aggregates information from EU-funded projects, open access publications, scientific journals, and public datasets hosted on repositories such as Zenodo and Pangaea. Each research product is indexed with detailed metadata, enabling users to search, filter, and explore thousands of documents—currently over 11,000—by criteria such as publication type, funder, country, year, language, or resource access.

By consolidating this vast body of knowledge and facilitating its exploration, EGRISE 2.0 allows stakeholders to precisely map the state of R&D in the geothermal industry. Researchers can spot emerging trends, identify gaps, and recognize key contributors, while funding agencies and policymakers can evaluate technological maturity and set priorities for future research and investment. Additionally, the platform facilitates the preparation of innovative project proposals by offering instant access to scientific publications, datasets, and project deliverables.

A set of integrated charts further enhances the platform’s value, offering insights such as publication trends, openness over time, and data FAIRness.

EGRISE is an open tool available at https://egrise.openaire.eu/. It is powered by OpenAIRE CONNECT, a service for building customizable search portals on top of the OpenAIRE Graph, one of the largest open scientific knowledge graphs.

In this way, EGRISE 2.0 not only consolidates knowledge but actively empowers innovation, collaboration, and strategic decision-making by leveraging open research information, establishing itself as an indispensable tool for the European geothermal community.

How to cite: Trumpy, E., Bardi, A., and Manzella, A.: EGRISE 2.0: the knowledge at the fingertips to power Research and Innovation in Geothermal Energy, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-11715, https://doi.org/10.5194/egusphere-egu26-11715, 2026.

EGU26-11794 | Orals | ERE5.7

Predicting Hydraulic Conductivity of Rock Masses using Machine Learning and Hydro-Jacking Tests – A Case Study of Mohmand Dam 

Zargham Zarrar, Zulfiqar Ali, Zohair Vaseer, Sajid Saeed, and Irfan Khan

Hydraulic conductivity of fractured rock masses is a controlling parameter in dam engineering, governing seepage and grouting performance. In practice, it is commonly evaluated using in-situ packer (Lugeon) tests or hydro-jacking tests. However, these tests are costly, time-consuming, and cumbersome, requiring skilled technical staff. Empirical models are therefore often used to estimate hydraulic conductivity, but they generally rely on a limited number of input variables and thus inadequately represent the nonlinear permeability behavior of rock masses. To address these limitations, this study proposes a machine-learning-based modeling framework for predicting the hydraulic conductivity of fractured rock masses using published data comprising packer test results, hydro-jacking reopening pressures, and geological parameters including depth, rock quality designation (RQD), and fracture characteristics. Hydro-jacking tests were performed at the Mohmand dam site, and model performance was evaluated against the test data. The results indicate that the machine-learning-based model is reliable and can accurately capture hydraulic conductivity in fractured rock masses. The proposed approach offers a reliable alternative to traditional empirical methods and has practical implications for seepage assessment, grouting design, and dam foundation permeability evaluation in complex geological settings.

How to cite: Zarrar, Z., Ali, Z., Vaseer, Z., Saeed, S., and Khan, I.: Predicting Hydraulic Conductivity of Rock Masses using Machine Learning and Hydro-Jacking Tests – A Case Study of Mohmand Dam, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-11794, https://doi.org/10.5194/egusphere-egu26-11794, 2026.

EGU26-15580 | Orals | ERE5.7

Microscopic pore structure and macroscopic fluid flow-chemical transport with the coupled thermal-hydraulic-mechanical-chemical processes 

Qinhong Hu, Fang Hao, Yuefeng Xiao, Keyu Liu, Tao Zhang, Yubin Ke, He Cheng, Xiuhong Li, Qiming Wang, Chen Zhao, and Shengyu Yang

Various types of porous media (unconsolidated and consolidated geological bodies, engineering materials, etc.) and fluids (water, gas, oil, supercritical carbon dioxide, etc.) are closely intertwined with multiple fields such as the environment, geology, and geotechnical engineering, involving soil contamination and groundwater remediation, high-level nuclear waste disposal, carbon dioxide storage, shale oil and gas extraction, hydrogen energy storage, and geothermal utilization. Nano-petrophysical studies focus on rock properties, fluid properties, and rock–fluid interactions, especially for low-permeability geological and engineering media with abundant nano-scale pores, as their microscopic pore structure (pore size distribution, pore shape, and connectivity) controls macroscopic fluid–rock interaction and the efficient development or preservation of various energy fluids. Such a subsurface system involves a wide range of nm–µm pore sizes and varied pore connectivity and wettability, in addition to the coupled thermal-hydraulic-mechanical-chemical (THMC) processes of deep earth environments. This work showcases the development and application of an integrated and complementary suite of nano-petrophysical characterization approaches, including pycnometry (liquid and gas), porosimetry (mercury intrusion, gas physisorption), imaging (Wood's metal impregnation followed by field emission scanning electron microscopy), scattering (ultra- and small-angle neutron and X-ray), and the use of both hydrophilic and hydrophobic fluids, as well as fluid invasion tests (imbibition, diffusion, vacuum saturation) followed by laser ablation-inductively coupled plasma-mass spectrometry imaging of different nm-sized tracers in porous materials. These methodologies have been extended to coupled THMC processes under reservoir-relevant settings, such as the small-angle scattering (SAS) method developed and utilized for direct observation of rock deformation behavior at a spatial resolution of 1 nm with stresses up to 164 MPa, using a self-developed high-pressure cell for mechanistic studies of fluid–solid coupling.

Acknowledgement: This work was supported by the Basic Science Center Program of the National Natural Science Foundation of China (NSFC) (Type A; No. 42302145) and the International Cooperation Project of PetroChina (2023DQ0422).

How to cite: Hu, Q., Hao, F., Xiao, Y., Liu, K., Zhang, T., Ke, Y., Cheng, H., Li, X., Wang, Q., Zhao, C., and Yang, S.: Microscopic pore structure and macroscopic fluid flow-chemical transport with the coupled thermal-hydraulic-mechanical-chemical processes, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-15580, https://doi.org/10.5194/egusphere-egu26-15580, 2026.

Carbonate terrains in central Saudi Arabia are prone to subsurface hazards due to karstification, fracturing, and differential weathering, posing significant risks for large infrastructure developments. This research presents an integrated geophysical and geotechnical investigation carried out for a proposed construction site in Riyadh, located along the Wadi Hanifah escarpment. The site is underlain by highly to moderately weathered Jurassic limestone of the Shaqra Group, characterized by vugs, fractures, and solution-filled discontinuities.

Multichannel Analysis of Surface Waves (MASW) was employed to map subsurface stiffness variations and identify potential cavities and weak zones. A total of 1400 linear meters of MASW profiles were acquired using a 24-channel system with 2.5 m geophone spacing, achieving an investigation depth of up to approximately 25 m. Shear-wave velocity (Vs) sections were generated through dispersion analysis and inversion of surface-wave data. The interpreted Vs values range from about 200 m/s to 3500 m/s, where higher velocities (>1500 m/s) represent competent limestone, while lower velocities (<1000–1500 m/s) indicate fractured, weathered, or solution-affected zones.

MASW results delineated several localized low-Vs anomalies corresponding to solution-filled vugs and cavities at depths ranging from approximately 1 m to 13.5 m. These geophysical findings were correlated with borehole data from fifteen geotechnical boreholes, including rock coring, RQD measurements, pressuremeter testing, and laboratory strength testing. Borehole logs confirm the presence of highly fractured limestone with variable RQD (0-100%) and unconfined compressive strength values between about 13 and 65 MPa. Zones identified as weak in MASW sections coincide with intervals of low RQD, poor core recovery, and solution-filled fractures observed in the boreholes.

The integrated interpretation demonstrates that MASW is an effective tool for rapid detection and lateral mapping of karst-related weak zones in limestone terrains when calibrated with geotechnical data. The results provided critical input for foundation design, ground improvement planning, and risk mitigation at the site. This study highlights the value of combining surface-wave geophysics with conventional geotechnical investigations for sustainable and safe development in karst-prone regions.

How to cite: Jadoon, M. A. and Jadoon, K. Z.: Integrated Geophysical and Geotechnical Investigation for Detection of Karstic Weak Zones in Limestone Terrain, Riyadh, Saudi Arabia, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-15860, https://doi.org/10.5194/egusphere-egu26-15860, 2026.

EGU26-17391 | ECS | Posters on site | ERE5.7

Rapid simulation of Aquifer Thermal Energy Storage using transformer-based Machine Learning 

Hadrian Fung, Issac Ju, Carl Jacquemyn, Meissam Bahlali, Matthew Jackson, and Gege Wen

Aquifer Thermal Energy Storage (ATES) offers sustainable, low-carbon heating and cooling to the built environment. Optimising the design and operation of ATES installations requires numerical simulation of groundwater flow and heat transport in heterogeneous aquifers. These simulations are typically computationally expensive: high spatial resolution is required to accurately resolve pressure, flow and temperature fields; moreover, high temporal resolution may be necessary to control numerical diffusion and/or resolve frequent changes in injection flowrate and temperature. Simulations that (1) include multiple well doublets or (2) capture interactions between neighbouring systems are particularly challenging. Multiple simulations may be required to quantify the impact of uncertain aquifer heterogeneity. Yet the time available for aquifer modelling in many commercial projects is very limited. Rapid but accurate approaches to simulating subsurface flow and heat transport in ATES and other shallow geothermal deployments are therefore urgently required.

Machine Learning (ML) offers a rapid alternative to conventional numerical simulation of complex subsurface flow and transport processes. Here we introduce a transformer-based, purely data-driven ML approach that significantly increases simulation efficiency whilst retaining accuracy. The ML proxy is trained using ATES simulation outputs from the open-source Imperial College Finite Element Reservoir Simulator (IC-FERST), which uses dynamic mesh optimization to provide high solution accuracy at lower computational cost. The practical consequence is that the mesh changes across the solution snapshots recorded at successive time steps used for training, whereas conventional Convolutional Neural Network (CNN)-based models require a fixed mesh. To provide a fast proxy, we therefore implement a transformer-based model that operates on adaptive unstructured meshes, giving it a stronger capability to capture long-range changes in predictions. The model takes the initial state of the reservoir on an arbitrary input mesh, performs a one-step prediction in a non-physical latent space, and recovers the latent representation of the prediction back to physical space on any given query mesh, allowing the integration of adaptive mesh refinement adjusted to fit the predicted solution on unstructured graphs.
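
A hedged sketch of this encode, one-step-predict, decode pattern for arbitrary point sets, in the spirit of cross-attention (Perceiver-style) models; all module names, dimensions, and the latent update below are illustrative assumptions, not the authors' architecture:

```python
# Illustrative encode -> latent one-step prediction -> decode-on-query-mesh
# proxy for mesh-free field prediction. All sizes and names are hypothetical.
import torch
import torch.nn as nn

class MeshFreeProxy(nn.Module):
    def __init__(self, in_dim=4, d=64, n_latent=32, heads=4):
        super().__init__()
        self.embed = nn.Linear(in_dim, d)            # node features + coords
        self.latent = nn.Parameter(torch.randn(1, n_latent, d))
        self.enc_attn = nn.MultiheadAttention(d, heads, batch_first=True)
        self.step = nn.TransformerEncoder(           # one-step latent dynamics
            nn.TransformerEncoderLayer(d, heads, batch_first=True),
            num_layers=2)
        self.q_embed = nn.Linear(2, d)               # query-mesh coordinates
        self.dec_attn = nn.MultiheadAttention(d, heads, batch_first=True)
        self.out = nn.Linear(d, 1)                   # e.g. temperature

    def forward(self, x_in, q_xy):
        # x_in: (B, N_in, in_dim) state on an arbitrary input mesh
        # q_xy: (B, N_q, 2) coordinates of an arbitrary query mesh
        tokens = self.embed(x_in)
        lat = self.latent.expand(x_in.shape[0], -1, -1)
        lat, _ = self.enc_attn(lat, tokens, tokens)  # encode to fixed latent
        lat = self.step(lat)                         # advance one time step
        q = self.q_embed(q_xy)
        dec, _ = self.dec_attn(q, lat, lat)          # decode at query points
        return self.out(dec)

proxy = MeshFreeProxy()
pred = proxy(torch.randn(2, 500, 4), torch.rand(2, 800, 2))
print(pred.shape)  # torch.Size([2, 800, 1])
```
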
Our results suggest a promising approach to the rapid simulation of ATES: simulation time can be reduced significantly, with a speed-up factor of over 6600.

How to cite: Fung, H., Ju, I., Jacquemyn, C., Bahlali, M., Jackson, M., and Wen, G.: Rapid simulation of Aquifer Thermal Energy Storage using transformer-based Machine Learning, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-17391, https://doi.org/10.5194/egusphere-egu26-17391, 2026.

EGU26-17559 | Posters on site | ERE5.7

heatflow.world: A FAIR, Quality-Controlled Global Platform for Heat-Flow Data and Geothermal Applications 

Ben Norden, Samah Elbarbary, Elif Balkan-Pazvantoğlu, Alexey Petrunin, Marios Karagiorgas, Florian Neumann, Renée Bernhard, Achim Kopf, Kirsten Elger, Sam Jennings, Nikolas Ott, Stephan Mäs, and Sven Fuchs

Heat-flow data are a critical input for geothermal exploration, lithospheric studies, and assessments of the global heat budget. Despite decades of measurements, their reuse has been hampered by heterogeneous or incomplete metadata, inconsistent quality assessment and documentation, and limited interoperability between regional and global compilations. To address these limitations, we present the new European heat-flow compilation as part of the World Heat Flow Database, now served through the www.heatflow.world platform as its new digital home. The European dataset comprises more than 14,000 heat-flow determinations from approximately 8,000 locations, including complementary data (e.g., underlying rock properties, measured temperature gradients, site-specific effects), and covering measurements acquired between 1939 and 2025. The dataset strictly follows a unified metadata schema and quality evaluation framework developed by the International Heat Flow Commission. This framework evaluates heat-flow determinations along three independent dimensions: methodological robustness, numerical uncertainty, and environmental or site-specific perturbations. These dimensions are combined into a transparent, reproducible quality score that supports objective comparison, automated filtering, and informed data reuse. Our analysis demonstrates that high-quality heat-flow data are available across most European regions, although the spatial density of data remains uneven. Importantly, data quality shows no systematic dependence on the year of measurement, underlining the long-term value of well-documented legacy data when embedded in a modern, quality-controlled framework. By integrating the European compilation into the World Heat Flow Database and publishing it via heatflow.world, regional datasets become interoperable components of a continuously expanding, standardised global resource. The heatflow.world platform is designed to follow the FAIR data principles, providing findable, accessible, interoperable, and reusable heat-flow data, grids, and maps for both academic and applied users. The online interface of the World Heat Flow Portal supports transparent data citation, community-driven updates, and long-term sustainability, thereby establishing a robust foundation for future geothermal exploration and global thermal studies.

How to cite: Norden, B., Elbarbary, S., Balkan-Pazvantoğlu, E., Petrunin, A., Karagiorgas, M., Neumann, F., Bernhard, R., Kopf, A., Elger, K., Jennings, S., Ott, N., Mäs, S., and Fuchs, S.: heatflow.world: A FAIR, Quality-Controlled Global Platform for Heat-Flow Data and Geothermal Applications, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-17559, https://doi.org/10.5194/egusphere-egu26-17559, 2026.

EGU26-19147 | Orals | ERE5.7

Pore space architecture and water binding state in clay-rich rocks 

Thomas Gimmi, Martin Mazurek, and Katja Emmerich

Clays and clay rocks are relevant materials in many natural and engineered systems. Particles and pores in these materials are very small, which results in very low permeabilities. Accordingly, clays and clay rocks are considered as sealing materials or as host rocks for the safe underground disposal of hazardous waste.

The architecture of the pore space, i.e., the pore size distribution and pore connectivity, is a fundamental characteristic that defines macroscopic properties of these materials, such as the water retention function, hydraulic conductivity, diffusion coefficients, and mechanical behavior. Unfortunately, the resolution of imaging techniques is often insufficient for direct visualization of all pores in clays, and mostly indirect methods have to be used. Moreover, porewater close to charged clay surfaces may be partly bound, which can also affect hydraulic conductivities.

We applied a range of different methods (Hg injection, N2 and H2O ad-/desorption, simultaneous thermal analysis coupled with evolved gas analysis STA-EGA) to characterize the pore space architecture and physical properties of porewater of a set of twelve very different rocks. We addressed the following questions: (1) Is porewater close to solid surfaces more strongly bound compared to porewater far from surfaces? (2) Are physical porewater properties related to basic properties of the clay rocks, such as clay-mineral content, cation exchange capacity, or the pore solution composition?

When comparing pore size distributions derived from the above methods and from NMR cryoporometry (Fleury et al., 2022), we see that mostly similar size ranges are obtained, but specific peaks should not be overinterpreted. Only NMR cryoporometry allows measurements at the original saturation state (except for high salinity solutions), which minimizes potential artefacts from drying. During STA, mainly water was released up to ~200°C (heating rate 10°C/min) in all samples. Vaporization enthalpy distributions derived from the STA data – indicators of water binding states – are unimodal in many cases, meaning that no clearly distinct water populations exist. However, the width of the distributions varied considerably among the samples. Comparatively narrow distributions with a main peak in the region of bulk water vaporization enthalpies were seen for samples with relatively large pores, and wider or very wide distributions for samples with small pores, complex pore networks, higher surface charge concentration per volume of pore water, or high salinity pore solutions. The latter demonstrates that the derived vaporization enthalpies do not only reflect surface interactions, but also interactions with solutes. Finally, the sometimes substantial differences in the energetic state of the porewater should be considered a relevant pore-scale feature when trying to derive macroscopic hydraulic parameters.

Fleury, M., T. Gimmi, M. Mazurek (2022). Porewater content, pore structure and water mobility in clays and shales from NMR methods. Clays Clay Miner. 70, 417–437, https://doi.org/10.1007/s42860-022-00195-4

How to cite: Gimmi, T., Mazurek, M., and Emmerich, K.: Pore space architecture and water binding state in clay-rich rocks, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-19147, https://doi.org/10.5194/egusphere-egu26-19147, 2026.

EGU26-20127 | ECS | Posters on site | ERE5.7

Shallow Geothermal Potential Mapping Incorporating Groundwater Effects Based on a 3D Geological Model: Hsinchu, Taiwan 

Hua-ting Tseng, Sheng-Che Hsu, Wei-Chin Huang, Chia-Lun Wang, and Hwa-Lung Yu

Hsinchu hosts one of the densest concentrations of high-tech industry in Taiwan, including the headquarters of Taiwan Semiconductor Manufacturing Company (TSMC). With the rapid development of artificial intelligence technologies, energy demand has increased sharply, highlighting the need for reliable and sustainable energy resources to meet escalating power consumption. Shallow geothermal energy is a renewable resource that remains underutilized in Taiwan. Hsinchu City is located on Quaternary alluvial deposits characterized by a shallow groundwater table and relatively high groundwater flow velocities, which may provide favorable hydrogeological conditions for the utilization of shallow geothermal energy. This study aims to evaluate the shallow geothermal energy potential of Hsinchu City. The research begins with the construction of a three-dimensional geological model using an advanced geostatistical approach, namely the Bayesian Maximum Entropy (BME) method. The developed model provides spatially distributed information on subsurface thermal properties and groundwater dynamics. Subsequently, an analytical heat-transfer model, the Moving Infinite Line Source (MILS) model, is employed to back-calculate the maximum accessible heat extraction rate under two constraints: environmental impact limits and engineering design limits. The evaluation scenarios consider a 20-year operational period for vertical borehole heat exchanger systems under seasonal variations in groundwater depth and flow velocity. This preliminary assessment provides valuable insights into the feasibility and potential of shallow geothermal energy development and offers a scientific basis for future energy-saving strategies in the Hsinchu Science Park and surrounding industrial areas.
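
The MILS model mentioned above has a well-known closed-form steady-state solution for a line source in a uniform groundwater flow field; a minimal sketch of that steady-state form, with purely illustrative parameter values:

```python
# Steady-state moving infinite line source (MILS) temperature response:
#   dT(x, y) = q / (2*pi*lam) * exp(U*x / (2*a)) * K0(U*r / (2*a))
# q: heat rate per metre [W/m] (negative = extraction), lam: thermal
# conductivity [W/m/K], a: thermal diffusivity [m^2/s], U: effective heat
# transport velocity [m/s]. Parameter values below are illustrative only.
import numpy as np
from scipy.special import k0

def mils_steady(x, y, q=-30.0, lam=2.0, a=1e-6, U=1e-7):
    r = np.hypot(x, y)
    return q / (2 * np.pi * lam) * np.exp(U * x / (2 * a)) * k0(U * r / (2 * a))

# Temperature change 1 m downstream vs. 1 m upstream of the borehole:
print(mils_steady(1.0, 0.0), mils_steady(-1.0, 0.0))
```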

How to cite: Tseng, H., Hsu, S.-C., Huang, W.-C., Wang, C.-L., and Yu, H.-L.: Shallow Geothermal Potential Mapping Incorporating Groundwater Effects Based on a 3D Geological Model: Hsinchu, Taiwan, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-20127, https://doi.org/10.5194/egusphere-egu26-20127, 2026.

EGU26-21331 | Posters on site | ERE5.7

Towards Sustainable Energy in Svalbard: Geothermal Heat-Flow Insights from Wireline Logs 

Matthijs Nuus, Kim Senger, Sven Fuchs, Aleksandra Smyrak-Sikora, and Tabea Kubutat

Reliable estimates of thermal conductivity and radiogenic heat production are essential for robust heat-flow calculations and geothermal assessments. In the high Arctic archipelago of Svalbard, geothermal energy is increasingly considered as an alternative to the present diesel-based energy supply. However, direct measurements of thermal properties are limited to shallow, fully cored research boreholes, while the deeper subsurface—where temperatures suitable for geothermal district heating (~80 °C) are reached at depths of ~2 km beneath the settlement of Longyearbyen—remains poorly constrained. In this study, we derive thermal properties for the Silurian (?) to Paleogene sedimentary succession of onshore Svalbard using wireline logs from eight petroleum exploration boreholes drilled to depths of up to 3.3 km. In addition, we include data from two fully cored research boreholes. Lithology logs were digitized and used as the basis for thermal modeling. Two approaches were applied: (1) assigning generalized thermal properties based on lithology classes, and (2) calculating thermal properties directly from wireline logs, incorporating lithological information. The resulting thermal conductivity estimates range from 0.4 to 4.2 W m⁻¹ K⁻¹ and show strong lithological control. In the uppermost kilometer, calculated thermal conductivities were compared with laboratory measurements from two fully cored boreholes, revealing consistent lithology-dependent trends, although calculated values are generally slightly lower than measured ones. The derived thermal properties were subsequently used as input for 1D heat-flow modeling of the ten boreholes and a hypothetical deep geothermal borehole beneath Longyearbyen. Calculated heat-flow values range between 60 and 147 mW m⁻², with the highest values obtained for the Raddedalen borehole on Edgeøya. Our results demonstrate that wireline-log-derived thermal properties provide a valuable basis for improving heat-flow estimates and enable a more spatially resolved assessment of the thermal state and geothermal potential of Svalbard.
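
As a toy illustration of the interval relation that such 1D conductive heat-flow modelling rests on (Fourier's law over a layered column, with layers combined through their thermal resistance), a minimal sketch with invented layer values:

```python
# Toy 1D conductive heat-flow estimate over a layered column using the
# thermal-resistance (Bullard-type) formulation. Layer values are invented.
import numpy as np

thickness = np.array([400.0, 900.0, 700.0])   # layer thickness [m]
lam = np.array([1.8, 2.6, 3.9])               # conductivity [W m-1 K-1]
T_top, T_bottom = 0.0, 60.0                   # borehole temperatures [degC]

resistance = np.sum(thickness / lam)          # thermal resistance [m2 K W-1]
q = (T_bottom - T_top) / resistance           # conductive heat flow [W m-2]
print(f"heat flow ~ {1e3 * q:.0f} mW m-2")    # ~80 mW m-2 for these values
```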

How to cite: Nuus, M., Senger, K., Fuchs, S., Smyrak-Sikora, A., and Kubutat, T.: Towards Sustainable Energy in Svalbard: Geothermal Heat-Flow Insights from Wireline Logs, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-21331, https://doi.org/10.5194/egusphere-egu26-21331, 2026.

EGU26-23124 | Posters on site | ERE5.7

Thermal Response Test (TRT) Database Development and Integration for Shallow Geothermal Applications: The FIS-GP Example 

Mehrdad Sardar Abadi, Sven Rumohr, Holger Jensen, Jens Gramenz, Katharina-Maria Kuper, and Thorsten Agemar

Germany’s transition to renewable energy sources places increasing importance on the efficient use of shallow and medium-depth geothermal systems. The WärmeGut project, which supports this energy transition, is funded by the Federal Ministry for Economic Affairs and Energy (Bundesministerium für Wirtschaft und Energie). A central element of this initiative is the development of a comprehensive database to systematically record and evaluate geothermal data obtained from Thermal Response Tests (TRTs) and temperature-depth profile measurements.

These measurements are essential for obtaining thermal properties for subsurface characterization, yet historically, the data have been fragmented, inconsistently stored, and often inaccessible to practitioners and researchers. The creation of a dedicated TRT Database addresses these gaps by enabling standardized data collection, quality control, and long-term storage, thereby supporting more reliable planning and simulation of geothermal systems.

To maximize its impact and usability, a key solution proposed is the integration of this TRT data module into the existing Geophysics Information System (https://www.fis-geophysik.de), a platform managed by the LIAG – Institute for Applied Geophysics. The Geophysics Information System currently provides structured access to a wide range of geophysical measurements and preliminary subsurface evaluations, such as underground temperature profiles. Incorporating TRT data will enhance the system’s value by linking thermal performance insights with broader geological and geophysical contexts.

Ultimately, this effort supports more informed decision-making in geothermal energy development across Germany, fosters research synergies, and contributes to the national goals of energy efficiency and climate resilience.

How to cite: Sardar Abadi, M., Rumohr, S., Jensen, H., Gramenz, J., Kuper, K.-M., and Agemar, T.: Thermal Response Test (TRT) Database Development and Integration for Shallow Geothermal Applications: The FIS-GP Example, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-23124, https://doi.org/10.5194/egusphere-egu26-23124, 2026.

EGU26-1739 | ECS | Posters on site | HS3.8

Missing data imputation in epidemiology: a comparison between MICE and Machine Learning methods 

Mahmoud Hashoush, Emmanuelle Cadot, and Franco Alberto Cardillo

Missing data represents a challenge in large-scale epidemiological studies, as it can introduce substantial bias into the final estimates when not handled appropriately. This issue is particularly relevant in environmental health research due to the complex relationships between exposure to risk factors and delayed outcomes. In this work, we evaluate the effectiveness of statistical and Machine Learning (ML) approaches to fill in missing values in data we collected to assess the potential impact of gold mining activities on public health in the Ecuadorian Amazon.

There is growing concern regarding the adverse effects on human health in the Ecuadorian Amazon caused by the environmental impact of gold mining activities in the area. To investigate potential associations with adverse birth outcomes, we collected data published by the Ecuadorian National Institute of Statistics and Census (INEC) on annual live births and fetal deaths for the years 2014 to 2023. As is typical in large-scale epidemiological studies, the data contain a proportion of missing values, likely related to the registration and data entry process.

Addressing missing values is considered important both for the correct assignment of cases and for the characterisation of risk factors. Furthermore, it enables the modelling process when searching for associations between exposure and outcome without erroneous under- or over-reporting of odds ratios (Type I and Type II errors). Currently, the most common approach in epidemiology is to use statistical methods and, specifically, Multivariate Imputation by Chained Equations (MICE), normally instantiated with parametric conditional models. MICE imputes missing values by repeatedly predicting each incomplete variable from the others using standard regression models. In most applications, these predictions rely on linear or generalised linear relationships between variables, which can reduce their effectiveness in the presence of complex, non-linear interactions among variables. Machine Learning represents an interesting alternative, as it captures complex, non-linear relationships beyond the linear models typically assumed in MICE, is more flexible with respect to departures from missing-at-random patterns, and reduces the risk of model misspecification by relying on data-driven, implicit model selection rather than requiring the analyst to pre-specify an imputation model.
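
A minimal sketch of such a comparison using scikit-learn's IterativeImputer, which implements a MICE-style chained-equations scheme; replacing the default linear estimator with a tree ensemble gives an ML-based variant. The data here are synthetic, not the INEC birth records:

```python
# MICE-style chained-equations imputation vs. an ML-based variant.
# Synthetic data with one non-linear relationship; illustrative only.
import numpy as np
from sklearn.experimental import enable_iterative_imputer  # noqa: F401
from sklearn.impute import IterativeImputer
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(1)
X = rng.normal(size=(1000, 5))
X[:, 4] = np.sin(X[:, 0]) * X[:, 1] + 0.1 * rng.normal(size=1000)  # non-linear
X_true = X.copy()
mask = rng.random(X.shape) < 0.15          # 15% of values missing at random
X[mask] = np.nan

linear = IterativeImputer(random_state=0).fit_transform(X)
forest = IterativeImputer(estimator=RandomForestRegressor(n_estimators=50),
                          random_state=0).fit_transform(X)

for name, Xi in [("linear (MICE-like)", linear), ("random forest", forest)]:
    rmse = np.sqrt(np.mean((Xi[mask] - X_true[mask]) ** 2))
    print(f"{name}: RMSE on imputed cells = {rmse:.3f}")
```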

In this study, we present a robust experimental comparison between MICE and several ML-based imputation approaches applied to the Ecuadorian birth data. We assess their performance and discuss the respective strengths and limitations within an epidemiological context.

How to cite: Hashoush, M., Cadot, E., and Alberto Cardillo, F.: Missing data imputation in epidemiology: a comparison between MICE and Machine Learning methods, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-1739, https://doi.org/10.5194/egusphere-egu26-1739, 2026.

The lack of extensive and functional ground observation networks makes satellite-based rainfall products an attractive alternative. However, these datasets require prior evaluation. This study investigates the performance of four satellite- and gauge-based rainfall products: the Climate Hazards Group Infrared Precipitation with Station data version 2.0 (CHIRPS); Precipitation Estimation from Remotely Sensed Information using Artificial Neural Networks Climate Data Record (PERSIANN); Tropical Applications of Meteorology using Satellite data and ground-based observations (TAMSAT); and the Global Precipitation Climatology Centre full daily data (GPCC).

The assessment was conducted using grid-to-point comparisons at different time scales, and hydrological modelling over the Mono River Basin, located in the Republics of Benin and Togo. To assess the suitability of the four products for flood purposes, a two-part approach was applied: (1) a satellite-only approach in which each product was used as input to the HBV-light hydrological model for runoff simulation, and (2) an observation-satellite approach in which gaps in the observation data were filled using each product prior to the hydrological modelling. In all simulations, areal precipitation was derived with kriging before being input into HBV-light. On the one hand, the simulation with CHIRPS-only showed poor performance (NSE = -0.08 during calibration and -0.22 during validation), while the simulations with PERSIANN-only, TAMSAT-only, and GPCC-only yielded moderate performance, with NSE values ranging from 0.5 to 0.67. On the other hand, simulations with the observation-satellite combinations also showed moderate performance, with NSE values between 0.55 and 0.69, including for the observation-CHIRPS case.
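
For reference, the Nash–Sutcliffe efficiency (NSE) quoted above measures how much better the simulated discharge is than simply predicting the observed mean; a minimal sketch:

```python
# Nash-Sutcliffe efficiency: NSE = 1 - sum((sim-obs)^2) / sum((obs-mean)^2).
# NSE = 1 is a perfect fit; NSE <= 0 means no better than the observed mean.
import numpy as np

def nse(sim, obs):
    sim, obs = np.asarray(sim, float), np.asarray(obs, float)
    return 1.0 - np.sum((sim - obs) ** 2) / np.sum((obs - obs.mean()) ** 2)

print(nse([12.0, 30.0, 55.0], [10.0, 35.0, 50.0]))  # ~0.93
```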

The poor performance of the CHIRPS-only simulation, combined with the similar performance of all observation-satellite combinations, indicates that the quality of the satellite product used for gap filling plays a limited role. Moreover, the absence of significant improvement when using observation-satellite combinations compared to their satellite-only counterparts (except for CHIRPS) suggests that gap filling with satellite products does not necessarily enhance data quality. These results indicate that, in the Mono River Basin, gap filling may not be necessary when spatial interpolation methods such as kriging are applied.

How to cite: Houngue, N.: When More Data Is Not Better: Evaluating Satellite Rainfall Products in a Data-Scarce River Basin, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-3102, https://doi.org/10.5194/egusphere-egu26-3102, 2026.

EGU26-3221 | Orals | HS3.8

Disentangling Sources of Uncertainty in Hydrologic Projections Using Multiple Climate Forcings, Bias-Correction Techniques, and Shared Socioeconomic Pathways 

Rocky Talchabhadel, Sunil Bista, Saurav Bhattarai, Subash Poudel, Amisha Bhandari, Sandhya Khanal, Aashish Gautam, Yogesh Bhattarai, Sanjib Sharma, and Nawa Raj Pradhan

Meteorological forcings under different climate scenarios exert substantial control over hydrological processes in watersheds and river systems. This study presents a comprehensive assessment of uncertainty in hydrologic projections by integrating a wide range of climate forcings, multiple bias-correction approaches, and several Shared Socioeconomic Pathways (SSPs). Specifically, we (i) quantify the total uncertainty in projected hydrologic responses, (ii) attribute uncertainty to individual sources, and (iii) examine how uncertainty propagates along the hydroclimatic modeling chain. The analysis is demonstrated for a range of watersheds using a fully calibrated Soil and Water Assessment Tool (SWAT) model. The hydrologic simulations are forced by outputs from thirty global climate models (GCMs) participating in the Coupled Model Intercomparison Project Phase 6 (CMIP6), obtained from the NASA Earth Exchange Global Daily Downscaled Projections (NEX-GDDP-CMIP6) dataset at a spatial resolution of 0.25° (~25 km) under two SSPs. To further refine the climate inputs, a linear bias-correction method is applied to daily temperature and precipitation time series to align long-term mean monthly values during the reference period (1985-2014) with PRISM observations. A total of four bias-correction scenarios are considered: (1) original NEX-GDDP-CMIP6 data, (2) precipitation-corrected data, (3) temperature-corrected data, and (4) jointly corrected temperature and precipitation data. This framework yields four forcing scenarios for each GCM–SSP combination, resulting in a total of 240 simulations (4 × 30 GCMs × 2 SSPs) for each watershed. Streamflow changes are evaluated for the near-future period (2031-2060) and the far-future period (2061-2090), relative to the historical baseline (1985-2014). Changes in probability distributions and cumulative distribution functions are analyzed across climate models, bias-correction methods, and SSPs. In addition, the relative contributions of individual uncertainty sources are quantified at monthly, seasonal, and annual time scales. By systematically accounting for uncertainties arising from climate forcings, bias-correction techniques, and socioeconomic pathways, this study provides a robust characterization of the range of plausible hydrologic futures. Such uncertainty-informed streamflow projections are essential for water-resources planning, flood and drought risk management, and the development of effective long-term water-management strategies.
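
A minimal sketch of the linear monthly bias correction described above, here multiplicative for precipitation (an additive shift would be used for temperature); the random series are placeholders standing in for NEX-GDDP-CMIP6 output and PRISM observations:

```python
# Linear monthly bias correction: scale each day's precipitation so that
# reference-period monthly means match observations. Data are placeholders.
import numpy as np
import pandas as pd

idx = pd.date_range("1985-01-01", "2014-12-31", freq="D")
rng = np.random.default_rng(2)
obs_p = pd.Series(rng.gamma(0.8, 4.0, len(idx)), idx)   # "PRISM" precip
mod_p = pd.Series(rng.gamma(0.8, 5.0, len(idx)), idx)   # "GCM" precip (wet bias)

scale = obs_p.groupby(obs_p.index.month).mean() / \
        mod_p.groupby(mod_p.index.month).mean()         # per-month factors
corrected = mod_p * mod_p.index.month.map(scale)        # apply to each day

print(corrected.groupby(corrected.index.month).mean().round(2))
```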

How to cite: Talchabhadel, R., Bista, S., Bhattarai, S., Poudel, S., Bhandari, A., Khanal, S., Gautam, A., Bhattarai, Y., Sharma, S., and Pradhan, N. R.: Disentangling Sources of Uncertainty in Hydrologic Projections Using Multiple Climate Forcings, Bias-Correction Techniques, and Shared Socioeconomic Pathways, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-3221, https://doi.org/10.5194/egusphere-egu26-3221, 2026.

EGU26-6637 | Posters on site | HS3.8

GeoAI-based augmentation of multi-source urban GIS 

Salem Benferhat, Nanée Chahinian, Carole Delenne, Ines Couso Blanco, Luciano Sanchez Ramos, and Zoltan Kato

This presentation addresses a major challenge: fully leveraging the potential of geospatial data to improve Geographic Information Systems (GIS). Using urban flooding as a case study, it aims to integrate heterogeneous data sources of varying nature and quality in order to enhance both the expressiveness and reliability of GIS.
 
This work presents ongoing and planned research activities within the ATLAS CHIST-ERA project, which is entirely dedicated to this objective through a multidisciplinary approach. The project mobilizes complementary expertise in GIS, artificial intelligence, machine learning, computer vision, 2D/3D image analysis and object detection, statistics, urban network mapping, and geoalignment techniques.
 
The presentation is structured around two main objectives, both oriented toward GIS enrichment, with direct applications for flood risk management.
 
The first objective consists of combining and integrating external data within GIS. This approach enables seamless data integration and facilitates the revision, completion, and enrichment of existing datasets, while improving their expressiveness, particularly through the introduction of 3D representations. Such enriched representations are essential for accurately modeling surface runoff, flow paths, and hydraulic connectivity in urban environments subject to flooding.
 
The second objective focuses on integrating imperfect or uncertain data, such as amateur videos, crowdsourced observations, or data lacking precise georeferencing. To address these limitations, the project relies notably on the use of variational autoencoders for processing imprecise data, and proposes uncertainty and imprecision management mechanisms aimed at improving data quality by reducing inaccuracies and explicitly modeling confidence levels.
 
Acknowledgments:
This work was supported by the CHIST-ERA project ATLAS "GeoAI-based augmentation of multi-source urban GIS" under grant numbers CHIST-ERA-23-MultiGIS-02 and ANR-24-CHR4-0005 (French National Research Agency).

How to cite: Benferhat, S., Chahinian, N., Delenne, C., Couso Blanco, I., Sanchez Ramos, L., and Kato, Z.: GeoAI-based augmentation of multi-source urban GIS, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-6637, https://doi.org/10.5194/egusphere-egu26-6637, 2026.

EGU26-6933 | ECS | Orals | HS3.8

Integration and Alignment of Multiple Water Network Data Sources 

Omar Et-targuy, Carole Delenne, Salem Benferhat, and Ahlame Begdouri

Wastewater network management relies on geographic data from multiple sources, which creates significant integration challenges: spatial inconsistencies, incomplete coverage, and varying levels of precision.

Although different data sources may cover the same portion of the network, they are generally produced in different contexts or at different times. This can result in discrepancies in the descriptions of the physical infrastructure of the wastewater network: some elements may be accurately represented in one source but absent in another, while other objects may be described slightly differently across sources. Furthermore, for certain parts of the network, the structure itself may vary depending on the source. Consequently, any operation to merge datasets or build a global network representation requires matching the objects described by each source in order to identify those corresponding to the same physical element, to recognize objects present in multiple sources, and to distinguish those with no correspondence in other datasets.

In this work, we propose a data integration methodology to address disparities among these data sources and to match the various elements of wastewater networks. This approach establishes correspondences between multiple datasets representing the same infrastructure from different sources. By combining spatial and structural information, the method identifies matching components across datasets and produces a unified representation that leverages the complementary information from each source while resolving conflicts and inconsistencies.

The approach has been validated on real-world wastewater network data from multiple sources and covering different time periods. The results demonstrate high integration accuracy. This methodology enables a complete and consistent representation of wastewater networks, addressing the challenges of data heterogeneity inherent in multi-source infrastructure management.

How to cite: Et-targuy, O., Delenne, C., Benferhat, S., and Begdouri, A.: Integration and Alignment of Multiple Water Network Data Sources, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-6933, https://doi.org/10.5194/egusphere-egu26-6933, 2026.

EGU26-7251 | Orals | HS3.8

 Cross-analysis of Multisource Data for Geolocation of Non-georeferenced Urban Infrastructure Data 

Thanh Ma, Salem Benferhat, Minh Thu Tran Nguyen, Nanée Chahinian, Carole Delenne, and Thanh-Nghi Do

Geographic Information Systems (GIS) are reference tools for representing, storing, analyzing, and visualizing geolocated data, particularly those related to urban infrastructures such as water networks. In addition to GIS reference data, there exists a significant amount of complementary data, referred to here as external data, generally produced in specific contexts such as urban network maintenance. When properly exploited, these external data sources, which are rich in information, can enhance GIS and help address the issue of missing data. However, these external data are often not geolocated, which makes their integration into GIS particularly complex.

The main objective of this work is to propose artificial intelligence–based methodologies to geolocate non-georeferenced external data, particularly maps related to urban water networks, by leveraging multisource data cross-analysis. The proposed approach relies on the joint exploitation of geolocated GIS data and external data lacking geolocation. It consists in analyzing maps using object detection techniques to extract characteristic elements, such as buildings or specific structures, which are then matched with corresponding entities available in the relevant GIS. By exploring different geographic areas of the same spatial extent as the maps and assessing the degree of similarity between the extracted elements and those referenced in the GIS, the method enables the identification of the most plausible area of correspondence and, ultimately, the geolocation of the maps in question.
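
One plausible concretisation of the matching step, hedged as an assumption rather than the project's actual algorithm: slide a candidate placement over the GIS extent, and score it by how well detected object positions align with referenced GIS entities (here via nearest-neighbour distances with a KD-tree):

```python
# Toy area-matching score: mean nearest-neighbour distance between objects
# detected on a (non-georeferenced) map and GIS features, for each candidate
# placement. Coordinates are invented for illustration.
import numpy as np
from scipy.spatial import cKDTree

gis_xy = np.random.default_rng(3).uniform(0, 1000, (400, 2))   # GIS features
detected = np.array([[10.0, 12.0], [35.0, 8.0], [22.0, 40.0]]) # map-local

tree = cKDTree(gis_xy)

def score(offset):
    """Lower is better: mean distance of placed detections to nearest GIS object."""
    d, _ = tree.query(detected + offset)
    return d.mean()

candidates = [np.array([x, y]) for x in range(0, 1000, 50)
              for y in range(0, 1000, 50)]
best = min(candidates, key=score)
print("most plausible placement offset:", best, "score:", round(score(best), 1))
```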

This work addresses several major challenges in the context of geolocating external data using GIS data. The first challenge concerns the identification and selection of relevant elements capable of effectively guiding the search within available GIS. The second challenge lies in accounting for the sometimes limited reliability of object detection systems during the matching process. The third challenge involves defining appropriate similarity measures and selecting sufficiently discriminative elements for the matching process. Finally, the fourth challenge is algorithmic in nature, given that a map generally represents only a limited portion of a GIS, which raises issues similar to those encountered in large-scale matching approaches.

Acknowledgments:
This work was supported by the CHIST-ERA project ATLAS "GeoAI-based augmentation of multi-source urban GIS" under grant numbers CHIST-ERA-23-MultiGIS-02 and ANR-24-CHR4-0005 (French National Research Agency).

How to cite: Ma, T., Benferhat, S., Tran Nguyen, M. T., Chahinian, N., Delenne, C., and Do, T.-N.:  Cross-analysis of Multisource Data for Geolocation of Non-georeferenced Urban Infrastructure Data, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-7251, https://doi.org/10.5194/egusphere-egu26-7251, 2026.

EGU26-7685 | ECS | Posters on site | HS3.8

Data assimilation to retrieve unknown bathymetry in shallow water model 

Flavien Baudu, Carole Delenne, Thibault Catry, Sophie Ricci, Ludovic Cassan, Vincent Herbreteau, and Renaud Hostache

Floods are among the most destructive and costly natural disasters. While risk assessment and management have helped mitigate their impact in recent decades, climate change is expected to increase both their frequency and severity. This underscores the urgent need for predictive tools to better anticipate and prevent the adverse effects of flooding. Two-dimensional Shallow-Water (SW) hydraulic models offer a reliable solution for flood prediction, providing critical information such as floodplain extent, water levels and flow velocities. However, these models require boundary conditions (such as input flows), precise topography and bathymetry (i.e. riverbed geometry), as well as parameters to be calibrated (such as terrain roughness). Unfortunately, such data are often sparse or entirely unavailable in many regions due to the high cost and logistical challenges of in situ measurements. In particular, while the topography can be obtained from LiDAR-derived digital terrain models, the bathymetry remains inaccessible because the LiDAR signal does not penetrate the water surface.

In this context, Data Assimilation (DA)—a method that optimally combines uncertain models with observations—becomes particularly valuable for estimating such missing data or parameters. Our study proposes an innovative approach to reconstruct riverbed geometry by assimilating flood extent information derived from satellite imagery, specifically Synthetic Aperture Radar (SAR) data, which can reliably detect floodwater extents.

To account for observational uncertainty, we generate a probabilistic flood map from SAR images, where each pixel’s value represents its probability of being water, based on observed backscatter. Using a tempered particle filter (TPF), we assimilate multiple SAR-derived probabilistic flood maps into an ensemble of hydraulic simulations (referred to as "particles"). These simulations share the same model architecture but incorporate randomly sampled riverbed geometries. 
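
A hedged sketch of the weighting step at the core of such a particle filter, not the authors' TPF implementation: each particle's simulated wet/dry map is scored against the SAR-derived probability-of-water map, and a tempering exponent flattens the likelihood:

```python
# Importance weights for flood-map particles: likelihood of each particle's
# binary wet/dry prediction under the per-pixel probability-of-water map.
# Maps and the tempering exponent are illustrative.
import numpy as np

rng = np.random.default_rng(4)
p_water = rng.uniform(0, 1, (50, 50))            # SAR-derived probabilistic map
particles = rng.random((30, 50, 50)) > 0.5       # 30 simulated wet/dry maps

def log_likelihood(wet, p, eps=1e-6):
    p = np.clip(p, eps, 1 - eps)
    return np.where(wet, np.log(p), np.log1p(-p)).sum()

logw = np.array([log_likelihood(w, p_water) for w in particles])
logw *= 0.2                                      # tempering: flatten likelihood
w = np.exp(logw - logw.max())
w /= w.sum()                                     # normalised particle weights
print("effective sample size:", 1.0 / np.sum(w ** 2))
```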

To evaluate our methodology, we conducted a synthetic twin experiment based on a real-world case study of the River Severn near Tewkesbury, UK, a region prone to frequent flooding. We first performed a hydraulic simulation (the "control run") using a reference riverbed geometry and realistic boundary conditions. From this simulation, we generated several synthetic probabilistic flood maps, which were then assimilated into a second simulation to estimate the riverbed geometry using the TPF.

Our results demonstrate the effectiveness of this approach: the estimated riverbed geometry closely matches the reference. Additionally, contingency maps reveal strong agreement between the flood extents predicted by the control run and those obtained through the DA experiment.

How to cite: Baudu, F., Delenne, C., Catry, T., Ricci, S., Cassan, L., Herbreteau, V., and Hostache, R.: Data assimilation to retrieve unknown bathymetry in shallow water model, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-7685, https://doi.org/10.5194/egusphere-egu26-7685, 2026.

EGU26-7760 | Posters on site | HS3.8

Anomaly detection in wastewater pipeline videos using self-attention 

Carole Delenne, Ti-Hon Nguyen, Minh-Thu Tran-Nguyen, and Salem Benferhat

Data related to urban infrastructures often come from multiple sources and exist in a wide variety of formats, such as Geographic Information Systems (GIS), textual information, numerical databases, images, or videos, which can make their processing, querying, and analysis complex. This work falls within this context and aims to propose new approaches for the management of heterogeneous data in stormwater and wastewater networks.

More specifically, we focus on video data, particularly Closed-Circuit Television (CCTV) inspection videos of sewer pipelines. These videos are essential for the management and maintenance of urban networks. On the one hand, they enable the identification of anomalies that may affect the integrity of pipelines, such as blockages or structural degradation. On the other hand, they provide key information on the structural properties of pipelines and networks, including pipe diameter and the direction of wastewater flow.

We propose a classification algorithm for wastewater inspection videos aimed at detecting major anomalies in CCTV inspection sequences of sewer networks, with a particular emphasis on identifying variations in pipe diameter, internal cracks, chemical corrosion, and the presence of turbid water within the pipelines. This task is crucial for predictive maintenance and hydraulic modeling of sewer systems. Information related to the identification of variations in pipe diameter can also be leveraged to enrich and complete missing pipe diameter attributes in Geographic Information Systems.

Our approach is based on the Video Vision Transformer (ViViT) and TimeSformer architectures, which effectively capture both spatial and temporal relationships in video data. We also describe various methodologies for generating training datasets from a subset of manually annotated images. Experimental results obtained on real-world CCTV sewer inspection videos provided by Montpellier Méditerranée Métropole demonstrate promising performance in anomaly detection.

How to cite: Delenne, C., Nguyen, T.-H., Tran-Nguyen, M.-T., and Benferhat, S.: Anomaly detection in wastewater pipeline videos using self-attention, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-7760, https://doi.org/10.5194/egusphere-egu26-7760, 2026.

EGU26-7776 | ECS | Posters on site | HS3.8

CawSAR: an open-source framework for preprocessing hydroclimatic data in physically based hydrological modelling 

Tristan Bourgeois, Nicolas Flipo, Marie Pettenati, and Hervé Noel

Water resource management is a major challenge for the coming decades. Managing water effectively across diverse territories relies on an accurate representation of hydrological processes, generally achieved through physically based distributed hydrological models, which in turn depend on spatially consistent and representative hydroclimatic forcing. At regional scales, capturing local variability in hydroclimatic drivers (precipitation, temperature, evapotranspiration) often requires combining datasets with different spatial resolutions and methodological assumptions.

Within the Eau-SPRA project (ADEME, France 2030 Programme), the CaWaQS model (Flipo et al., 2022; Flipo et al., 2023) is applied to the Loire River basin to support socio-hydrological modelling from regional to local scales. CaWaQS is a coupled distributed surface–subsurface hydrological model simulating both river discharge and groundwater dynamics. It currently lacks an explicit snow representation, which can significantly affect hydrological dynamics across scales, particularly in large river basins such as the Loire and under climate change conditions (Valéry et al., 2014).

To address these challenges, we developed CawSAR (CaWaQS Snow Accounting Routine), an open-source Python-based preprocessing framework designed to harmonize multi-source climate data (e.g. reanalysis products, radar observations) over a target study area. Based on a 3D matrix representation (time, x, y) of climate fields, it integrates multiple functionalities within a single, reproducible workflow. Climate data are harmonized through systematic downscaling, upscaling and regridding performed on a grid-cell basis using physical external-drift adjustments (altimetric gradient). CawSAR also enables cross-comparison of climate data sources across different spatio-temporal scales and implements a degree-day snow model to compute snow accumulation and melt. Finally, it generates liquid input time series (sum of liquid rainfall and snowmelt) fully compatible with the CaWaQS core model, ensuring direct integration into hydrological simulations.
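
As an illustration of the degree-day scheme mentioned above (precipitation accumulates as snow below a temperature threshold and melts proportionally to air temperature above it), a minimal sketch with generic parameter values, not CawSAR's defaults:

```python
# Degree-day snow model: accumulate snowfall when T <= t_snow, melt at rate
# ddf * (T - t_melt) above t_melt, and output liquid input = rain + melt.
# Threshold and degree-day factor values are generic placeholders.
import numpy as np

def degree_day(precip, temp, t_snow=0.0, t_melt=0.0, ddf=3.0):
    swe, liquid = 0.0, []                 # snow water equivalent [mm]
    for p, t in zip(precip, temp):
        snow = p if t <= t_snow else 0.0
        rain = p - snow
        swe += snow
        melt = min(swe, max(0.0, ddf * (t - t_melt)))  # mm/day
        swe -= melt
        liquid.append(rain + melt)
    return np.array(liquid)

print(degree_day([5, 10, 0, 0], [-2.0, -1.0, 2.0, 4.0]))  # [0. 0. 6. 9.]
```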

Applied to the Loire basin, CawSAR illustrates how physically based preprocessing and multi-source harmonization enhance hydroclimatic forcing consistency for regional-scale hydrological modelling.

How to cite: Bourgeois, T., Flipo, N., Pettenati, M., and Noel, H.: CawSAR: an open-source framework for preprocessing hydroclimatic data in physically based hydrological modelling, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-7776, https://doi.org/10.5194/egusphere-egu26-7776, 2026.

EGU26-9507 | ECS | Posters on site | HS3.8

From Imperfect Sewer Data to Coherent Topology: A Graph-Based Approach  

Batoul Haydar, Nanée Chahinian, and Claude Pasquier

Urban sewer networks are critical infrastructures that support residents' everyday life and ensure the collection and transportation of wastewater and stormwater. Yet operational datasets describing these networks are frequently imperfect: pipes may be missing, connectivity may be fragmented, and flow direction may be inconsistent due to incomplete attributes (e.g., invert levels, slope) or digitizing errors. We present a topology-focused study that transforms sewer data into a directed network by combining (i) a graph-based representation and (ii) geometry-based consistency checks and rules. Starting from a directed (multi)graph built from the available pipe and node geometries, where pipes form the edges and nodes the vertices, we detect topological anomalies including disconnected components, missing connections, dead ends, and closed loops.

When two pipes converge at a manhole with no outgoing pipe, the manhole forms a non-outlet sink. To resolve this, we apply a two-stage methodology: edge orientation to reduce flow inconsistencies and resolve sink nodes, followed by targeted edge addition to reconnect remaining disconnected components when reversals alone are insufficient; a sketch of the detection step is given below. We test the feasibility of the approach on a large open-access urban sewer dataset. The results illustrate how topology-oriented methods can still be applied to establish a well-connected network when data attributes are missing or unreliable.
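
A minimal sketch of the detection step, assuming pipes as directed edges in a networkx graph; the toy network and the repair rule at the end are illustrative only:

```python
# Detect sewer-graph anomalies: non-outlet sinks (no outgoing pipe but not a
# designated outfall), disconnected components, and closed loops.
import networkx as nx

G = nx.DiGraph()
G.add_edges_from([("A", "B"), ("C", "B"), ("D", "E")])  # toy pipe segments
outfalls = {"E"}  # nodes where flow legitimately leaves the network

sinks = [n for n in G if G.out_degree(n) == 0 and n not in outfalls]
components = list(nx.weakly_connected_components(G))
loops = list(nx.simple_cycles(G))

print("non-outlet sinks:", sinks)              # ['B']
print("connected components:", components)     # [{'A', 'B', 'C'}, {'D', 'E'}]
print("closed loops:", loops)                  # []

# First-stage repair idea: reverse one incident edge so the sink can drain.
if sinks:
    u, v = next(iter(G.in_edges(sinks[0])))
    G.remove_edge(u, v)
    G.add_edge(v, u)
```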

How to cite: Haydar, B., Chahinian, N., and Pasquier, C.: From Imperfect Sewer Data to Coherent Topology: A Graph-Based Approach , EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-9507, https://doi.org/10.5194/egusphere-egu26-9507, 2026.

EGU26-11107 | Orals | HS3.8

The effects of droughts on pumping fields at the watershed scale: building a model from a heterogeneous dataset. 

Jordan Labbe, Hélène Celle, Julie Albaric, Pierre Nevers, Gilles Mailhot, Jean-Luc Devidal, and Nathalie Nicolau

Water management is becoming an increasingly complex task that must account not only for climate change but also for socio-economic pressures. This is particularly true for alluvial aquifers, which are often connected to surface waters and therefore require watershed-scale policies. Conflicts of use may emerge, especially during droughts, which are occurring more frequently. In this context, the alluvial aquifer of the Allier River (France) is an interesting case study. It is a major regional resource for drinking water, industry and irrigation, extending over 210 km between Langeac and the confluence with the Loire River. The Naussac dam keeps the Allier River at a minimum flow rate and secures water uses downstream, but the summer drought of 2023 was extreme and the dam was almost completely emptied. If this situation were to repeat itself over a longer period, the consequences for the productivity of pumping fields implanted on the alluvial aquifer are unknown. This work is part of the MODALL² project, in which we propose to build a transient model of the alluvial aquifer using MODFLOW (Groundwater Vistas 8). One of the main challenges is to gather and organize a set of often heterogeneous data (incomplete time series, sparsely distributed spatial data, etc.) from various sources. With the intention of improving the existing network, 50 additional water loggers have been deployed for groundwater level monitoring. 30 Electrical Resistivity Tomography (ERT) profiles were carried out to refine the thickness of alluvial deposits on the well-fields and thus the geometry of the model. Given the elongated shape of the alluvial aquifer, the study area is divided into 9 sub-models with which a ‘cascade modelling’ is performed. The purpose is to better understand how droughts spread across the whole hydrosystem and to what extent the pumping fields will be affected. ERT surveys have revealed that the thickness of alluvial deposits varies significantly from one site to another, ranging from 5 to 15 m downstream where the alluvial plain is more widespread. Hydrodynamic data show the influence of the river on groundwater level variations depending on the distance from the river. Lastly, the heterogeneity of the input datasets introduces uncertainty into the model that will need to be estimated. Beyond the potential to use modelling to anticipate future water crises, this work also proposes a methodology for handling large-scale heterogeneous datasets.

How to cite: Labbe, J., Celle, H., Albaric, J., Nevers, P., Mailhot, G., Devidal, J.-L., and Nicolau, N.: The effects of droughts on pumping fields at the watershed scale: building a model from a heterogeneous dataset., EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-11107, https://doi.org/10.5194/egusphere-egu26-11107, 2026.

EGU26-11579 | ECS | Orals | HS3.8

A comparative benchmark of tabular, sequential, and graph-based models for well-log imputation 

Wendinkonté Fabrice Cédric Sawadogo, Romain Chassagne, and Olivier Atteia

Well-log datasets commonly contain missing values due to acquisition issues, operational constraints, and economic limitations, which complicate quantitative subsurface analysis and the extraction of useful information for geothermal and, more broadly, subsurface characterisation. Imputation is therefore a key preprocessing step, yet many existing approaches primarily focus on within-well continuity and treat the problem as a depth-wise or time-series task, often overlooking the spatial redundancy between neighbouring wells.

In this contribution, we compare three complementary modeling paradigms for well-log imputation: tabular machine-learning methods, sequential deep-learning models, and spatially informed graph-based approaches. The comparison is conducted within a unified and reproducible experimental framework based on cross-well validation and realistic missingness scenarios, including isolated gaps as well as extended block-wise and complete log-wise gaps.
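
A small sketch of how such missingness scenarios can be imposed on a held-out well for evaluation; the array layout (depth samples by log curves) and sizes are illustrative:

```python
# Generate the three missingness scenarios on a (depth x logs) well array:
# isolated gaps, a contiguous block-wise gap, and a complete log-wise gap.
import numpy as np

rng = np.random.default_rng(5)
logs = rng.normal(size=(300, 4))                 # depth samples x log curves

mask = np.zeros_like(logs, dtype=bool)
mask |= rng.random(logs.shape) < 0.05            # isolated gaps (5%)
start = rng.integers(0, 250)
mask[start:start + 50, 1] = True                 # 50-sample block in one log
mask[:, 3] = True                                # log absent over the well

corrupted = logs.copy()
corrupted[mask] = np.nan                         # imputers see this version
print("missing fraction:", mask.mean().round(3))
```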

Results highlight clear differences in behaviour across modeling families. Tabular methods exhibit limited robustness when missing values become structured, while sequential models improve depth-wise continuity but remain sensitive to large gaps and absent logs. In contrast, spatially informed graph-based models show increased stability by exploiting inter-well relationships, leading to more coherent reconstructions at the field scale.

These results suggest that evaluating imputation quality solely through local error metrics is insufficient for realistic subsurface applications. By emphasizing the importance of spatial coherence and inter-well information, this study supports the use of spatially aware formulations as a valuable alternative to purely depth-wise approaches for geothermal and broader subsurface characterization workflows.

How to cite: Sawadogo, W. F. C., Chassagne, R., and Atteia, O.: A comparative benchmark of tabular, sequential, and graph-based models for well-log imputation, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-11579, https://doi.org/10.5194/egusphere-egu26-11579, 2026.

EGU26-11636 | ECS | Orals | HS3.8

Unsupervised pattern recognition for imperfect datasets: a visual workflow for plausibility checks and regime diagnosis in high-dimensional environmental time series 

Kenneth Gutiérrez, Gunnar Lischeid, Gökben Demir, Maren Dubbert, Alexander Knohl, and Christian Markwitz

Data imperfection, characterized by fragmentation, sensor failures, and high-dimensional noise, remains a persistent challenge in environmental monitoring. As observation networks expand to capture heterogeneous soil-atmosphere interactions, traditional quality control methods based on rigid statistical thresholds often struggle to distinguish between sensor errors and genuine, non-linear system dynamics. This study presents a methodological development for knowledge extraction from imperfect and fragmented data, employing a multivariate visualization workflow that combines Principal Component Analysis (PCA) and Self-Organizing Maps (SOM) with Sammon Mapping.

We applied this unsupervised learning approach to a high-dimensional dataset (~100 variables) from a field-scale agricultural system, including measurements of soil moisture and temperature, eddy covariance-derived CO2, energy fluxes, radiation, wind, precipitation, groundwater level and discharge.

This allowed us to compare a discontinuous period in 2024 against a continuous period in 2025. The results demonstrate the method's robustness in extracting coherent structural patterns despite data incompleteness. While PCA effectively isolated the dominant thermodynamic baselines from high-frequency hydrologic events, the topological SOM projection provided a rapid, visual plausibility check.

The visualization facilitated the identification of possible sensor irregularities as spatial outliers in the 2024 dataset, enabling instant anomaly detection without manual time-series inspection. Furthermore, the method successfully captured shifts in system dynamics, such as the decoupling of surface moisture from groundwater, validating its utility for identifying physical regimes in heterogeneous data. We conclude that this visual workflow offers a scalable, data-driven solution for moving from raw, imperfect observations toward actionable system diagnostics, bridging the gap between data acquisition and process understanding in complex environmental observatories.
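
A compact sketch of the PCA-plus-SOM core of such a workflow, using scikit-learn and the third-party MiniSom package as stand-ins for the authors' tooling; flagging records by SOM quantization error is one simple numerical proxy for the visual outlier check:

```python
# PCA to compress a high-dimensional record set, SOM to map it onto a 2D
# grid; records with high quantization error are candidate anomalies.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler
from minisom import MiniSom  # pip install minisom

rng = np.random.default_rng(6)
X = rng.normal(size=(2000, 100))                   # ~100 sensor variables
Z = PCA(n_components=10).fit_transform(StandardScaler().fit_transform(X))

som = MiniSom(12, 12, Z.shape[1], sigma=1.5, learning_rate=0.5, random_seed=0)
som.train_random(Z, 5000)

qe = np.array([np.linalg.norm(z - som.get_weights()[som.winner(z)]) for z in Z])
suspect = np.where(qe > np.quantile(qe, 0.99))[0]  # worst-mapped 1% of records
print("candidate anomalies:", suspect[:10])
```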

How to cite: Gutiérrez, K., Lischeid, G., Demir, G., Dubbert, M., Knohl, A., and Markwitz, C.: Unsupervised pattern recognition for imperfect datasets: a visual workflow for plausibility checks and regime diagnosis in high-dimensional environmental time series, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-11636, https://doi.org/10.5194/egusphere-egu26-11636, 2026.

EGU26-11928 | Posters on site | HS3.8

A Disjunctive Interpretation Approach to Missing Data Based on Clustering Quality 

Hamza Khyari and Salem Benferhat

Data completion is a major challenge in many applications, particularly in Geographic Information Systems (GIS) for water networks. Numerous approaches have been proposed to address this problem, ranging from classical statistical methods to artificial intelligence-based techniques.

In this presentation, we address the problem of missing or imprecise data in water network GIS by proposing a clustering-based data completion approach. For a given attribute with missing or uncertain values, each possible value in the attribute domain is considered as a candidate for completion. Each candidate is evaluated by analyzing its impact on the clustering of the entire dataset: inserting a candidate value induces a specific global clustering, whose quality is assessed using appropriate clustering validity criteria. The value that yields the highest-quality clustering, namely the one that best captures the intrinsic structure of the data, is selected as the final completion value.
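
A minimal sketch of this candidate-evaluation loop is given below; the abstract specifies neither the clustering algorithm nor the validity criterion, so k-means and the silhouette coefficient serve here purely as stand-ins.

    import numpy as np
    from sklearn.cluster import KMeans
    from sklearn.metrics import silhouette_score

    def complete_by_clustering(X, row, col, candidates, k=4):
        """Select the candidate value whose insertion yields the best clustering."""
        best_value, best_score = None, -np.inf
        for v in candidates:
            Xc = X.copy()
            Xc[row, col] = v
            labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(Xc)
            score = silhouette_score(Xc, labels)       # clustering validity
            if score > best_score:
                best_value, best_score = v, score
        return best_value

    X = np.random.rand(60, 4)                          # toy attribute table
    filled = complete_by_clustering(X, row=3, col=2, candidates=np.linspace(0, 1, 11))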

To cope with the combinatorial explosion resulting from multiple attributes with missing values and large domains, several strategies are employed to reduce the number of candidate completions, including aggregation mechanisms, while maintaining both the effectiveness and efficiency of the proposed approach.

How to cite: Khyari, H. and Benferhat, S.: A Disjunctive Interpretation Approach to Missing Data Based on Clustering Quality, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-11928, https://doi.org/10.5194/egusphere-egu26-11928, 2026.

EGU26-14633 | Posters on site | HS3.8

Composing Transparent Quality Control Pipelines from Basic Anomaly Descriptions 

Peter Lünenschloß, David Schaefer, and Jan Bumberger

Quality control (QC) and data cleaning remain major bottlenecks in geoscientific data analysis as data volumes, dimensionality, and heterogeneity continue to increase. While machine- and deep-learning-based approaches have demonstrated impressive performance in selected applications, their practical adoption is often constrained by the availability of sufficiently large labelled training datasets and by the effort required to calibrate and adapt model hyperparameters across datasets and domains, particularly in unsupervised flagging scenarios. Conversely, rule-based, deterministic, and statistical QC approaches offer greater transparency and interpretability, but are frequently tailored to specific data structures and lack the flexibility required to robustly generalise to varying observational contexts and non-ideal data distributions.

We present a software framework that addresses this gap by enabling the formulation of QC pipelines in terms of a small set of basic anomaly descriptions, such as outliers, noisy regimes, and data gaps. These anomaly notions are intuitively understood by domain experts, while their systematic combination allows the representation of a wide range of anomaly patterns encountered in geoscientific observations.

The parameters of these compositions are then automatically calibrated with the data at hand, resulting in an instantiated QC pipeline. By internally reducing the calibration problem to the fitting of individual anomaly descriptions defined by only a small number of well-understood parameters, the optimisation achieves robust convergence even with a limited number of supervised examples. Within the framework, such examples can be generated interactively during pipeline construction by domain specialists themselves or imported from existing sources. This design lowers the entry barrier for effective automated quality control while enabling the explicit integration of domain knowledge into the calibration process.

The framework is implemented as a new module within the open-source quality-control software SaQC, thereby integrating seamlessly with existing data import, preprocessing, and flag management workflows. Calibrated QC pipelines can be exported and stored as portable, human-readable configuration files in a tabular format. These configurations can subsequently be loaded and applied using the SaQC application to new and unseen datasets, enabling reproducible and automated quality control.
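
To give a flavour of the host framework, the sketch below hand-writes a small SaQC pipeline; flagRange and flagConstants follow the public SaQC documentation, while the new calibration module described above is not reproduced here, and the variable names and thresholds are our assumptions.

    import numpy as np
    import pandas as pd
    from saqc import SaQC

    idx = pd.date_range("2024-01-01", periods=1000, freq="15min")
    data = pd.DataFrame({"water_level": np.random.rand(1000) * 5}, index=idx)

    qc = SaQC(data=data)
    qc = qc.flagRange("water_level", min=0.0, max=4.5)               # physical limits
    qc = qc.flagConstants("water_level", thresh=0.001, window="6h")  # stuck sensor
    flags = qc.flags["water_level"]                                  # resulting flags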

In the poster, we present the conceptual design of the framework and demonstrate its application to a hydrological dataset, highlighting the transparent, combinatorial configuration interface and the integrated supervision workflow.

 

How to cite: Lünenschloß, P., Schaefer, D., and Bumberger, J.: Composing Transparent Quality Control Pipelines from Basic Anomaly Descriptions, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-14633, https://doi.org/10.5194/egusphere-egu26-14633, 2026.

EGU26-14708 | ECS | Orals | HS3.8

A Knowledge Graph–Based Approach for Reconciling Geological Information

Imadeddine Laouici and Fatma Chamekh

The understanding of the subsurface relies on integrating heterogeneous geological information originating from geological maps, geological models, and textual sources such as reports and scientific publications. In current practice, these sources are rarely homogenized and are reconciled manually by domain experts, mostly in the context of 3D geomodel construction projects. Even when information is reconciled, existing methods offer limited support for expert knowledge integration, traceability of interpretations, and automated holistic consistency checking.

We propose SemTrack, an ontology-based integration approach designed to formalize, reconcile, and exploit multi-source geological information within a unified knowledge graph. In this framework, SemTrack integrates structured information extracted from maps and numerical geological models with unstructured knowledge derived from textual documents, all aligned through a dedicated modeling ontology. The resulting knowledge graph supports logical reasoning and knowledge inference using SWRL rules to enforce the consistency of geological constraints, and it explicitly records expert interpretations. This enables automated detection of conceptual inconsistencies, transparent inference of implicit geological relationships, completion of missing information across multiple sources, and advanced querying of initially heterogeneous geological information.
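
The sketch below illustrates, with rdflib, the kind of consistency check such a knowledge graph enables; the geo: namespace, the overlies relation, and the example units are hypothetical, and the actual SemTrack ontology and SWRL rule base are not reproduced.

    from rdflib import Graph, Namespace

    GEO = Namespace("http://example.org/geo#")        # hypothetical namespace
    g = Graph()
    g.add((GEO.UnitA, GEO.overlies, GEO.UnitB))       # stated by the geological map
    g.add((GEO.UnitB, GEO.overlies, GEO.UnitA))       # stated by a report: conflict

    # flag mutually contradictory stratigraphic statements
    q = "SELECT ?a ?b WHERE { ?a geo:overlies ?b . ?b geo:overlies ?a . }"
    for a, b in g.query(q, initNs={"geo": GEO}):
        print("inconsistent overlies relation:", a, b)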

How to cite: laouici, I. and Chamekh, F.: A Knowledge Graph–Based Approach for reconciling Geological information, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-14708, https://doi.org/10.5194/egusphere-egu26-14708, 2026.

EGU26-15597 | Posters on site | HS3.8

Quality Control of Redundant Water Level Gauges in South Korea River Gauging Stations 

TaeWoong Ok, ChiYoung Kim, KiYong Kim, and ChanWoo Kim

In South Korea, river stage gauging stations operate redundant water level gauges to mitigate instrument malfunctions and anomalous measurements. Currently, redundant gauges are installed at over 60% of gauging stations, reflecting their widespread implementation; however, their quality management and practical utilization remain limited. In many cases, installation and operational conditions are not fully accounted for in observed water levels, leading to significant discrepancies between primary and redundant gauges. These discrepancies may arise from river characteristics, artificial configuration errors, or site-specific conditions.

 

This study investigates the causes of discrepancies between primary and redundant gauges and proposes appropriate correction methods. Anomaly detection was first conducted on redundant gauge measurements using limit tests, duration tests, and regression tests to ensure data reliability. Based on this, the relationships between primary and redundant gauge readings were analyzed using simple regression, multiple regression, and nonparametric LOESS (Locally Estimated Scatterplot Smoothing) regression. These procedures not only facilitated the derivation of site-specific correction methods but also supported the preliminary development of a real-time quality control program, moving beyond conventional manual, non-real-time quality management.
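
As an illustration of the nonparametric option, the sketch below fits a LOESS curve relating the two gauges and applies it as a correction; the synthetic data and the frac parameter are our assumptions.

    import numpy as np
    from statsmodels.nonparametric.smoothers_lowess import lowess

    rng = np.random.default_rng(0)
    redundant = rng.uniform(0.5, 3.0, 500)             # redundant-gauge stage (m)
    primary = redundant + 0.1 * np.sin(redundant) + rng.normal(0, 0.02, 500)

    # fit the primary~redundant relation and use it as a correction curve
    fit = lowess(primary, redundant, frac=0.3, return_sorted=True)
    corrected = np.interp(redundant, fit[:, 0], fit[:, 1])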

 

Nevertheless, because the causes of discrepancies and installation conditions vary by site, site-specific correction strategies are required, and ongoing monitoring and refinement of measurements and corrections remain necessary. Furthermore, real-time utilization of redundant gauges is challenging at newly established stations. Despite these limitations, the proposed correction strategies have the potential to go beyond simple substitution of primary gauge readings, enabling higher-quality hydrological data production and improved quality control. These strategies are expected to enhance real-time hydrological monitoring systems and strengthen the reliability of national hydrological data management frameworks.

Keywords: Redundant, Water Level Gauging, Uncertainty, Operational Monitoring

 

Acknowledgements

This work was supported by Korea Environment Industry & Technology Institute (KEITI) through Research and Development on the Technology for Securing the Water Resources Stability in Response to Future Change Project, funded by Korea Ministry of Climate, Energy, Environment (MCEE) (RS-2024-00332300).

How to cite: Ok, T., Kim, C., Kim, K., and Kim, C.: Quality Control of Redundant Water Level Gauges in South Korea River Gauging Stations, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-15597, https://doi.org/10.5194/egusphere-egu26-15597, 2026.

EGU26-15843 | Posters on site | HS3.8

Comparative Evaluation of Daily Streamflow Gap-Filling Using Paired Upstream–Downstream Gauges 

Chi Young Kim, Chanwoo Kim, and Taewoong Ok

Complete daily streamflow time series are essential for sustainable water resources management and reliable hydrological modelling; however, even short data gaps can substantially reduce the usability of streamflow records. Recurrent missing data may lead to inefficient model calibration, decreased reliability of peak and low-flow estimates, and biased hydrological statistics. Therefore, rather than leaving missing values unfilled, it can be beneficial to infill daily streamflow using appropriate methods and to provide flags indicating imputed periods.

In South Korea, streamflow monitoring prior to 2008 primarily focused on flood-related observations, resulting in relatively limited daily streamflow records; since then, the production of continuous daily streamflow data for water resources management has expanded. As of 2024, daily streamflow records from more than 420 gauging stations are managed and disseminated, yet a non-negligible number of stations still contain missing values due to various causes such as river works and uncertainties in stage–discharge relationships associated with the operation of hydraulic structures.

This study comparatively evaluates gap-filling techniques using paired upstream–downstream gauging stations located in basins with diverse rainfall regimes and hydrological characteristics. We assess conventional methods widely used in practice (scaling, linear regression, and equi-percentile/quantile-based approaches) under different missing-data conditions and benchmark them against an extended long short-term memory (extended LSTM) time-series model designed for streamflow infilling. Performance is evaluated using the Nash–Sutcliffe efficiency (NSE), root mean square error (RMSE), and percent bias (PBIAS). In addition, flow duration curves (FDCs) are compared to examine each method’s ability to reproduce the post-infilling flow regime distribution. The outcomes are expected to support condition-dependent selection of gap-filling strategies and to improve the reliability of daily streamflow datasets with explicit quality flags.
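
For concreteness, the sketch below implements one conventional baseline, empirical quantile mapping from a paired donor gauge to the target gauge, together with the NSE score used for evaluation; the function and variable names are ours.

    import numpy as np

    def quantile_map(donor_new, donor_obs, target_obs):
        """Map donor-gauge flows to the target gauge via empirical quantiles."""
        q = np.interp(donor_new, np.sort(donor_obs),
                      np.linspace(0.0, 1.0, donor_obs.size))
        return np.quantile(target_obs, q)

    def nse(obs, sim):
        """Nash-Sutcliffe efficiency."""
        return 1.0 - np.sum((obs - sim) ** 2) / np.sum((obs - obs.mean()) ** 2)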

How to cite: Kim, C. Y., Kim, C., and Ok, T.: Comparative Evaluation of Daily Streamflow Gap-Filling Using Paired Upstream–Downstream Gauges, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-15843, https://doi.org/10.5194/egusphere-egu26-15843, 2026.

EGU26-17357 | Posters on site | HS3.8

A Preliminary Analysis of High Water Events in Venice Based on Multi-Decadal Observations and Clustering 

Franco Alberto Cardillo, Angela Andrigo, Francesco De Biasio, Franca Debole, Marco Favaro, Alvise Papa, Umberto Straccia, and Stefano Vignudelli

High water events in Venice are a recurrent phenomenon, as the city is located only slightly above mean sea level and is directly influenced by water-level variations within the lagoon. Flooding occurs when several physical processes act in combination. The astronomical tide determines the baseline water level, which is subsequently modulated by seiche oscillations in the Adriatic Sea, meteorological forcing (e.g. wind stress and atmospheric pressure), and slower, low-frequency geophysical processes and sea level rise. When these factors co-occur, even if individually moderate, large portions of the city may experience flooding.

Repeated flooding has significant economic and social impacts, limits pedestrian and naval traffic and contributes to the degradation of buildings and cultural heritage. To mitigate these effects, a range of protective measures is implemented and coordinated by an early warning system. The effectiveness of these measures depends on their timely activation. However, mitigation actions are associated with substantial economic costs and may themselves generate negative impacts if deployed unnecessarily. For instance, interruptions to public transport services affect daily activities, while the operation of the MOSE barrier entails considerable financial costs. Accurate and reliable forecasts are therefore essential to balance flood protection with the economic and social costs of mitigation measures.

Current forecasting systems primarily estimate water levels and peak values, typically at a limited number of locations. These systems are based on sophisticated statistical and hydrodynamic models. Although they perform well in most situations, their accuracy can be affected by uncertainties in atmospheric forcing and by limitations in representing the full variability of high water events. This work explores the potential of complementary approaches based on the analysis of observational data rather than explicit physical modelling.

Data-driven approaches, in particular Machine Learning (ML) methods, analyze historical data without relying on predefined, human-designed model structures. ML models are able to capture recurring patterns and complex feature interactions that are difficult to incorporate into traditional numerical models. Among these approaches, clustering techniques aim to identify recurrent types of events based on similarities in their temporal evolution and associated meteorological conditions. This enables events characterized by similar water levels to be differentiated according to the combinations of underlying meteorological drivers, thereby providing additional information to support forecasting and response planning.

In this work, we present a preliminary analysis based on several clustering approaches, including k-means, DBSCAN, and deep learning–based methods, applied to a multi-decadal atmospheric dataset and to the longest available reconstructed hourly sea-level records for the northern Adriatic Sea, specifically developed for this study. We compare the resulting event classifications and discuss how cluster-derived information may complement existing forecasting systems in support of flood-mitigation strategies for the city of Venice.
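
A minimal sketch of the classical end of this comparison is shown below, clustering standardized event features with k-means and DBSCAN; the feature set and hyperparameters are illustrative assumptions.

    import numpy as np
    from sklearn.preprocessing import StandardScaler
    from sklearn.cluster import KMeans, DBSCAN

    # one row per high-water event: peak level, tide phase, wind speed/direction,
    # pressure anomaly, seiche amplitude (synthetic stand-in)
    X = np.random.default_rng(0).normal(size=(300, 6))
    Xs = StandardScaler().fit_transform(X)

    km_labels = KMeans(n_clusters=5, n_init=10, random_state=0).fit_predict(Xs)
    db_labels = DBSCAN(eps=0.8, min_samples=10).fit_predict(Xs)   # -1 marks noise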

How to cite: Cardillo, F. A., Andrigo, A., De Biasio, F., Debole, F., Favaro, M., Papa, A., Straccia, U., and Vignudelli, S.: A Preliminary Analysis of High Water Events in Venice Based on Multi-Decadal Observations and Clustering, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-17357, https://doi.org/10.5194/egusphere-egu26-17357, 2026.

EGU26-17687 | HS3.8

Dealing with imperfect knowledge in natural hazard assessments: beyond classical probabilities and challenges

J. Rohmer

Managing multi-source data requires flexible approaches and tools to model the many types of imperfection surrounding them and, more broadly, to deal with uncertainties of multiple origins, namely aleatory uncertainty (representing randomness) and epistemic uncertainty (related to imperfect knowledge). While the first can be adequately represented using classical probabilities, there is no simple, single answer for epistemic uncertainty. New theories of uncertainty based on "imprecise probabilities" have been developed in the literature to go beyond the systematic use of a single probabilistic law. In this communication, I analyze the application of these methods for quantifying uncertainty in various real-world cases of natural hazard assessment (earthquakes, floods, rockfalls) in terms of their advantages and disadvantages compared to the traditional probabilistic approach. On this basis, I draw lessons to support decision making under uncertainty and identify open questions and remaining challenges, in particular the integration of spatio-temporal geodata, the use of full-process high-fidelity numerical models, and interfacing with AI-based approaches.
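
One concrete imprecise-probability object is the probability box; the sketch below, added purely for illustration, builds empirical CDF bounds from interval-valued observations.

    import numpy as np

    def p_box(lo, hi, xs):
        """Empirical CDF bounds from interval observations [lo_i, hi_i]."""
        F_lower = np.array([(hi <= x).mean() for x in xs])  # pessimistic bound
        F_upper = np.array([(lo <= x).mean() for x in xs])  # optimistic bound
        return F_lower, F_upper

    lo = np.array([0.8, 1.1, 1.5, 2.0])      # e.g. lower bounds on peak intensity
    hi = np.array([1.2, 1.6, 2.4, 3.1])      # corresponding upper bounds
    xs = np.linspace(0.5, 3.5, 61)
    Fl, Fu = p_box(lo, hi, xs)               # Fl <= true CDF <= Fu pointwise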

I acknowledge financial support of the French National Research Agency within the HOUSES project (grant N°ANR-22-CE56-0006).

How to cite: Rohmer, J.: Dealing with imperfect knowledge in natural hazard assessments: beyond classical probabilities and challenges, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-17687, https://doi.org/10.5194/egusphere-egu26-17687, 2026.

EGU26-19993 | ECS | Orals | HS3.8

Uncertainty produced in a 15-minute gridded rainfall product for the UK.  

Tom Keel, Matt Fry, and Sam Counsell

Reliable rainfall datasets are an essential foundation for hydrological research. The most extensive rainfall information is collected from rain gauge networks, which provide high-frequency observations of rainfall intensity at point locations; their data can also be interpolated onto a regular grid to provide consistent region-wide estimates.

For the UK, there are two major daily gridded rainfall products: (1) CEH-GEAR developed by the UK Centre for Ecology & Hydrology, and (2) HadUK-Grid developed by the Met Office. In each case, they are built from a selection of rain gauges from a multi-nation rain gauge network spanning Great Britain. Decisions made at each stage of rainfall data preparation, about collection, formatting, quality control and then gridding, introduce uncertainty into the resulting gridded rainfall products.

In this talk, we discuss plans for CEH-GEAR 15 min, a new sub-daily 1 km product developed as part of the UK’s multi-year Flood & Drought Research Infrastructure (FDRI) project. We detail each step of its production, from raw rain gauge data to gridded rainfall estimates, and systematically discuss the sources of uncertainty introduced at each stage. 15-minute rainfall measurements tend to be highly variable in space and time, and intense storms or long dry periods create practical challenges for preparing gridded rainfall estimates. We therefore quantify the sensitivity of those estimates to decisions made about quality control and data blending during notable rain events across the UK. We also present the associated open-source tools developed as part of FDRI, including RainfallQC, that aim to support reproducible rainfall data processing and alleviate some of the challenges in sub-daily rainfall data preparation.
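
The sketch below shows the kind of elementary gauge-level check such a pipeline starts from, flagging implausible 15-minute totals and repeated stuck values in pandas; the thresholds and column names are assumptions, and this is not the RainfallQC API.

    import numpy as np
    import pandas as pd

    idx = pd.date_range("2024-06-01", periods=2000, freq="15min")
    rain = pd.Series(np.random.default_rng(0).gamma(0.2, 1.0, 2000),
                     index=idx, name="rain_mm")

    flag_range = (rain < 0) | (rain > 50)                     # implausible totals
    flag_stuck = rain.gt(0) & rain.rolling("6h").std().eq(0)  # stuck positive value
    suspect = flag_range | flag_stuck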

How to cite: Keel, T., Fry, M., and Counsell, S.: Uncertainty produced in a 15-minute gridded rainfall product for the UK. , EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-19993, https://doi.org/10.5194/egusphere-egu26-19993, 2026.

EGU26-23039 | Posters on site | HS3.8

Managing Incomplete Urban Reference Data for Risk-Oriented Geoscience Applications: Lessons from the CERES Project 

Gracianne Cécile, Youssef Fouzai, Mirga Bokidingo, Caterina Negulescu, Yves Lucas, Gilles Grandjean, and Fatima Chamekh

Assessing exposure and vulnerability to natural hazards increasingly relies on national geospatial reference datasets. However, these datasets are often incomplete, heterogeneous and inconsistent across spatial scales, which limits their direct usability for multi-hazard risk analysis. In France, the BD TOPO building database exemplifies these challenges, with a large share of buildings lacking key attributes such as usage type, despite their importance for vulnerability assessment.

This contribution presents the approach developed within the CERES project (Cartography and Characterization of Exposed Elements from Satellite Imagery) to address reference data incompleteness and multi-source integration challenges in a geoscience risk context. Focusing on a large study area in the Centre-Val de Loire region, we first quantify and analyze the spatial and semantic gaps of BD TOPO building attributes, showing that more than 40% of buildings are labelled with unknown usage. We then demonstrate how deep learning applied to very high-resolution aerial imagery can be used to probabilistically infer missing semantic information, significantly reducing uncertainty while explicitly accounting for classification ambiguities.

Beyond data completion, we highlight the difficulties encountered when jointly exploiting heterogeneous datasets originating from national mapping agencies, land cover products, socio-economic statistics and hazard layers. These include spatial misalignments, inconsistent scales of representation, varying levels of reliability, and the absence of a shared data model. To address these issues, CERES proposes a multi-scale data structuring framework combining data modelling and processing designed to preserve data provenance, uncertainty and semantic traceability across sources.

By articulating reference data analysis, machine-learning-based enrichment and database design, this work provides a concrete illustration of current practices and challenges in managing imperfect geospatial data for geoscience applications. The results underline the necessity of coupling data-driven approaches with explicit data governance and modelling strategies to produce robust, transparent and reusable datasets for territorial risk assessment.

How to cite: Cécile, G., Fouzai, Y., Bokidingo, M., Negulescu, C., Lucas, Y., Grandjean, G., and Chamekh, F.: Managing Incomplete Urban Reference Data for Risk-Oriented Geoscience Applications: Lessons from the CERES Project, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-23039, https://doi.org/10.5194/egusphere-egu26-23039, 2026.

EGU26-304 | ECS | Orals | GI2.1

Detecting Fin Whale Calls from Ocean-Bottom Seismometer Data with Deep Learning 

Jocelyn Japnanto, Alex Saoulis, Miriam Romagosa, Rita Leitão, Mónica A. Silva, Matt Graham, and Ana M. G. Ferreira

Fin whales (Balaenoptera physalus) produce low-frequency vocalisations that propagate efficiently through the ocean and seafloor, making them detectable on broadband ocean bottom seismometers (OBS). While primarily deployed for seismic studies, OBSs offer a unique and cost-effective opportunity for passive acoustic monitoring (PAM) of marine mammals in remote regions over extended periods. Traditional detection and classification of whale calls have relied on energy thresholding, cross-correlation, or matched filtering techniques. These approaches, however, can falter in the high-noise environments typical of OBS datasets and often require extensive manual post-processing, making them labour-intensive. These limitations motivate automated, noise-robust approaches capable of exploiting the growing volume of seismic data now available.

We present a deep learning framework for detecting fin whale calls from broadband OBSs surrounding the São Jorge Island in the Azores, as well as up to twenty stations of the wider UPFLOW array spanning the Azores–Madeira–Canaries region. Our method uses a semantic segmentation model that operates on spectrogram representations between 12–35 Hz, a frequency band encompassing the classic ‘20-Hz’ fin whale note and the lower frequency ‘backbeat’. The model architecture includes a ResNet-18 encoder pretrained on ImageNet with a U-Net decoder to identify calls in both time and frequency. Training was conducted on a dataset comprising ~6 days of manually annotated spectrograms and an additional ~6 days of background-only spectrograms. Performance was evaluated using mean Intersection-over-Union and F1-score, achieving 0.65 and 0.80 respectively.
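
A minimal sketch of this encoder-decoder pairing, using the segmentation_models_pytorch package as one way to combine a ResNet-18 ImageNet encoder with a U-Net decoder, is shown below; the tile size and single-channel input are our assumptions.

    import torch
    import segmentation_models_pytorch as smp

    model = smp.Unet(
        encoder_name="resnet18",
        encoder_weights="imagenet",
        in_channels=1,           # single-channel 12-35 Hz spectrogram
        classes=1,               # per-pixel call / no-call logits
    )
    spec = torch.randn(8, 1, 256, 256)   # batch of spectrogram tiles
    logits = model(spec)                 # (8, 1, 256, 256) segmentation map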

Once validated, the model was applied to months- to year-long OBS records across the region. Fin whale calls were detected at all stations, with clear seasonal patterns showing peak calling activity between October and February, consistent with known migratory patterns in the North Atlantic. Spatial differences in call characteristics and temporal patterns further revealed potential regional variations in vocal behaviour, offering insights into song plasticity and complexity.

By applying a deep learning-based detector on OBS data, we show that machine learning provides a powerful and efficient approach to automating fin whale call detection at scale. Our method processed hundreds of thousands of hours of OBS recordings and identified nearly a million calls across all stations. This large-scale detection unlocks detailed analyses of vocal behaviour, spatial distribution, and seasonal trends, deepening our understanding of their behaviour in the north-east Atlantic. Our findings not only highlight the interdisciplinary value of OBS datasets, but also the potential of machine learning in supporting PAM efforts for the conservation and management of wide-ranging marine species.

How to cite: Japnanto, J., Saoulis, A., Romagosa, M., Leitão, R., Silva, M. A., Graham, M., and Ferreira, A. M. G.: Detecting Fin Whale Calls from Ocean-Bottom Seismometer Data with Deep Learning, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-304, https://doi.org/10.5194/egusphere-egu26-304, 2026.

EGU26-716 * | ECS | Orals | GI2.1 | Highlight

An Integrated Digital Framework for Multi-Scale Water Security in Africa. 

Samuel Berchie Morfo and Nana Kwame Osei Bamfo

This presentation outlines a comprehensive framework of multi-scale digital solutions designed to address Africa's pressing water challenges. We explore the integration of advanced physical modelling with a diverse suite of next-generation hydrologic observations from remote sensing and in-situ networks to crowd-sourced data. The core of our approach lies in automated systems for data fusion, processing, and assimilation, leveraging machine learning and hybrid techniques to enhance model accuracy. Critically, we incorporate robust uncertainty quantification to ensure reliable outputs. These integrated components enable the development of actionable, real-time forecasting and decision support systems for water resources allocation and disaster management. We will demonstrate practical applications, including autonomous processes and embedded devices, showcasing a transformative pathway towards proactive, data-driven water governance across the African continent.

How to cite: Berchie Morfo, S. and Bamfo, N. K. O.: An Integrated Digital Framework for Multi-Scale Water Security in Africa., EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-716, https://doi.org/10.5194/egusphere-egu26-716, 2026.

EGU26-1075 | GI2.1

A novel method for rapid and reliable estimation of Gross Calorific Value (GCV) of Coal using mid-infrared FTIR Spectroscopy and a multi-model Machine Learning Approach

A. Vinod, A. K. Prasad, and A. K. Varma

The Gross Calorific Value (GCV) indicates coal quality by measuring the total heat released during the complete combustion of the coal. Accurate GCV estimation is crucial for efficient pricing, processing, and energy performance assessment in industry. Conventional oxygen bomb calorimetry, though precise, is relatively slow and expensive for large-scale analyses. Since coal’s organic and elemental composition strongly affects its heating value, understanding this relationship can support reliable GCV evaluation. In this study, we analyzed the mid-infrared FTIR spectra of coal and selected 56 absorption bands associated with the relevant organic and elemental constituents of coal. These were used as input features for various machine learning (ML) models to predict the GCV of coal from the Johilla coal basin in India. The ML models tested included piecewise linear regression (PLR), partial least squares regression (PLSR), support vector regression (SVR), random forest regression (RFR), artificial neural networks (ANN), and extreme gradient boosting regression (XGB). By combining the predictions from three of the models (PLSR, RFR, and XGB) through a simple average, we achieved the highest accuracy (R² = 0.951, RMSE = 19.05%, MBE = 1.42%, MAE = 4.053 cal/g), indicating strong agreement between predicted and measured values. Overall, the FTIR-based method yields results that match or surpass those of traditional laboratory techniques reported in earlier research. The GCV values predicted from the FTIR models were statistically tested using t-tests (test for mean) and F-tests (test for variance) at a 1% significance level and were found to be statistically similar to the results from the standard bomb calorimeter method. The study demonstrates that the FTIR-based approach is independent and reliable and can be used as a faster and more convenient alternative for determining GCV, making it highly useful for rapid coal quality analysis in industry.
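
A minimal sketch of the final averaging ensemble is given below; the hyperparameters and synthetic data are our assumptions, with only the three model families (PLSR, RFR, XGB) and the simple average taken from the abstract.

    import numpy as np
    from sklearn.cross_decomposition import PLSRegression
    from sklearn.ensemble import RandomForestRegressor
    from xgboost import XGBRegressor

    rng = np.random.default_rng(0)
    X = rng.random((114, 56))            # 56 FTIR band absorbances per sample
    y = 5000 + 2000 * X[:, :5].sum(axis=1) + rng.normal(0, 50, 114)  # GCV (cal/g)
    X_train, X_test, y_train, y_test = X[:90], X[90:], y[:90], y[90:]

    models = [PLSRegression(n_components=10),
              RandomForestRegressor(n_estimators=500, random_state=0),
              XGBRegressor(n_estimators=500, learning_rate=0.05)]
    preds = [np.ravel(m.fit(X_train, y_train).predict(X_test)) for m in models]
    gcv_pred = np.mean(preds, axis=0)    # simple average of the three models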

How to cite: Vinod, A., Prasad, A. K., and Varma, A. K.: A novel method for rapid and reliable estimation of Gross Calorific Value (GCV) of Coal using mid-infrared FTIR Spectroscopy and a multi-model Machine Learning Approach, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-1075, https://doi.org/10.5194/egusphere-egu26-1075, 2026.

EGU26-1760 | ECS | Orals | GI2.1

Toward Robust Three-Dimensional Magnetic and Gravity Inversion Using Deep Learning 

Shiva Tirdad, Gilles Bellefleur, Fidele Yrro, Mojtaba Bavand Savadkoohi, and Erwan Gloaguen

Magnetic and gravity surveys remain among the most cost-effective geophysical tools for investigating the subsurface. They provide information on rock geometry and bulk properties at regional to deposit scale, and they have long been used to guide mineral exploration. However, turning geophysical anomalies into reliable three-dimensional property models requires inversion, a process that is inherently non-unique: multiple subsurface distributions can explain the same anomaly. Conventional approaches, such as least-squares or Bayesian inversion, can produce valuable results; however, they remain computationally demanding for large 3D models and require strong regularization choices that may bias geological interpretation.

Over the last decade, geoscientists have explored machine learning as an alternative approach. Instead of repeatedly solving forward equations, machine learning methods learn a mapping between geophysical anomalies and subsurface properties using large training libraries of synthetic examples. Early work with convolutional neural networks (CNNs) and U-Net architectures showed the concept is viable for electromagnetic and seismic data. More recent studies have shown that deep neural networks can recover magnetic susceptibility distributions from magnetic data and, in some cases, perform joint inversion of gravity and magnetic observations. Nevertheless, purely convolutional architectures often struggle to preserve long-range spatial relationships in fully three-dimensional volumes, resulting in blurred boundaries and reduced geological interpretability.

Recent advances in deep learning offer new opportunities to address these limitations. Emerging models are designed to capture long-range dependencies and preserve sharper boundaries. They have been effective in other 3D volumetric fields, such as medical imaging and seismic interpretation, but have yet to be explored for potential-field inversion.

In this study, we develop a deep-learning-based inversion method for magnetic and gravity data aimed at critical mineral exploration. The approach targets mineral systems with distinct geophysical signatures, with a focus on volcanogenic massive sulfide (VMS) environments. By combining data-driven learning with physics-informed training, the method produces reproducible three-dimensional susceptibility and density models that reduce ambiguity in subsurface interpretation. The workflow is tested using data from the Flin Flon VMS district in Manitoba, Canada, demonstrating its potential to improve targeting of buried copper-zinc mineralization and to support the integration of advanced AI methods into geoscience workflows.

 

How to cite: Tirdad, S., Bellefleur, G., Yrro, F., Bavand Savadkoohi, M., and Gloaguen, E.: Toward Robust Three-Dimensional Magnetic and Gravity Inversion Using Deep Learning, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-1760, https://doi.org/10.5194/egusphere-egu26-1760, 2026.

EGU26-3405 | Orals | GI2.1

SatWellMCQ: A Vision–Language Satellite Dataset for MCQ-Based Image Grounding of Oil Wells

Ahmed Emam, Sultan Alrowili, Mathan K. Eswaran, Romeo Kinzler, and Younes Samih

Monitoring oil and gas wells is essential for assessing environmental degradation and long-term impacts such as methane emissions from abandoned and orphaned wells. Satellite imagery combined with machine learning offers scalable capabilities for detecting and characterizing oil and gas infrastructure, yet progress remains constrained by the lack of multimodal, multiple-choice (MCQ) vision-language datasets that enable structured evaluation and post-training of vision-language models (VLMs) for oil well scene grounding. Existing resources are predominantly visual-only and therefore provide limited support for image grounding from satellite imagery.

To address this gap, we introduce SatWellMCQ, a vision-language dataset of expert-verified satellite imagery paired with natural-language descriptions and multiple-choice supervision for image-grounded identification and localization of oil wells. SatWellMCQ uses high-resolution multispectral Planet imagery (RGB and infrared) and text annotations that describe well type and spatial context. Each sample includes one expert-verified correct description and three semantically plausible distractor descriptions drawn from other samples, enabling structured MCQ evaluation. All samples were manually verified by a senior domain expert with 100% intra-expert agreement, ensuring accurate alignment between images, labels, and text. The dataset covers four categories relevant to oil well monitoring: active wells, suspended wells, abandoned wells, and control samples without visible wells, yielding a balanced distribution for training and evaluation. We publicly release SatWellMCQ to support research on image grounding and vision-language adaptation in satellite imagery of oil wells.

We evaluate SatWellMCQ across state-of-the-art VLMs in zero-shot and supervised fine-tuning (SFT) settings. In the zero-shot setup, performance is moderate only for large-scale models, with the best result achieved by Qwen3-VL-235B at 0.670 accuracy. Compact models transfer poorly in zero-shot evaluation (e.g., Granite 3.3 2B at 0.422 and Phi-4-multimodal-instruct 6B at 0.376), highlighting the difficulty of domain-specific oil well analysis without targeted supervision. Supervised fine-tuning on SatWellMCQ yields substantial gains for compact models: Granite 3.3 2B improves to 0.722 and Phi-4-multimodal-instruct 6B reaches 0.730, surpassing all zero-shot baselines. These results show that SatWellMCQ poses a challenging benchmark for current VLMs while enabling effective domain adaptation through structured MCQ supervision.

Overall, SatWellMCQ provides a resource for post-training and benchmarking VLMs on image grounding of oil wells in satellite imagery and supports geoscientific monitoring tasks relevant to environmental impact assessment and methane mitigation.

How to cite: Emam, A., Alrowili, S., Eswaran, M. K., Kinzler, R., and Samih, Y.: SatWellMCQ: A Vision–Language Satellite Dataset for MCQ-Based Image Grounding of Oil Wells, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-3405, https://doi.org/10.5194/egusphere-egu26-3405, 2026.

EGU26-4011 | ECS | Orals | GI2.1

Identifying valuable forest habitats for conservation in north-western Germany using AI and citizen science 

Katharina Horn, Daniele Silvestro, Christine Wallis, Pedro J. Leitao, Ender Daldaban, and Annette Rudolph

Around the globe, we are experiencing significant biodiversity loss, mainly driven by direct anthropogenic exploitation, land-use change, and climate change. The most effective strategy to limit biodiversity loss is the designation and management of protected areas. Consequently, the European Union has adopted the EU Biodiversity Strategy for 2030, aiming to protect 30% of aquatic and terrestrial ecosystems by 2030. However, a consistent framework to designate protected areas across all EU member states is lacking. Additionally, the monitoring of biodiversity is challenged by the dynamic nature of the biological system, exacerbated by ongoing climate change, putting additional pressure on the member states in identifying suitable areas for conservation.

In contrast, the increasing amount of detailed geospatial and climatic data contains valuable information that can be used to optimise protected area designation. Recent developments in artificial intelligence and machine learning now provide us with powerful tools to best utilise these vast amounts of data. In this study, we develop a transparent and reproducible framework to prioritise protected areas in forests. Here we apply the CAPTAIN framework based on reinforcement learning (RL) to identify valuable forest habitats for conservation in the federal state of North Rhine-Westphalia (NRW), Germany. First, we model habitats of ten forest bird indicator species across the period of 2016-2024. Second, we use the changing habitat patterns to train a RL model that identifies 30% of the most valuable forest sites in the federal state. Finally, we model valuable forest sites under different policies (e.g., including or excluding opportunity costs for nature conservation) to illustrate how potential limitations of nature conservation management can be addressed. Our results indicate that forest sites in the south-east of NRW are most suitable for conservation. Furthermore, we find that including opportunity costs for nature conservation in the model predictions produces similarly strong outcomes for safeguarding the most endangered bird species. The framework makes use of open-source data and can be applied to any other region or country to support strategic nature conservation management.

How to cite: Horn, K., Silvestro, D., Wallis, C., Leitao, P. J., Daldaban, E., and Rudolph, A.: Identifying valuable forest habitats for conservation in north-western Germany using AI and citizen science, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-4011, https://doi.org/10.5194/egusphere-egu26-4011, 2026.

EGU26-4179 | GI2.1

Automated Mineral Cluster Detection in ASTER Data Using Topological Machine Learning: A Novel Data-Driven Approach for Geological Exploration in Ait Saoun, Anti Atlas, Morocco

M. A. Elomairi and A. El Garouani

Geological mapping in complex metallogenic provinces often relies on band ratios and thresholding techniques. While effective for simple targets, these traditional methods struggle to capture the non-linear spectral associations inherent in natural mineral mixtures and require significant prior knowledge of the target mineralogy. This study introduces a novel, data-driven unsupervised pipeline for mineral target generation, applied to the Aït Saoun region in the Moroccan Anti-Atlas, a strategic zone characterized by polymetallic occurrences (Cu, Co, Fe, Mn).

We leverage the full spectral topology of ASTER satellite imagery (VNIR-SWIR bands) rather than reduced indices. Our approach integrates topological manifold learning to reduce the high-dimensional spectral space, followed by density-based spatial clustering to delineate mineral clusters. This combination allows for the preservation of local data structure and the automated rejection of noise without human supervision.
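
The sketch below shows one way to realize this manifold-plus-density pipeline; the abstract does not name the algorithms, so UMAP and HDBSCAN, along with the hyperparameters, are our illustrative assumptions.

    import numpy as np
    import umap
    import hdbscan

    # stand-in for stacked ASTER VNIR-SWIR reflectances (rows = pixels, 9 bands)
    X = np.random.rand(5000, 9)

    embedding = umap.UMAP(n_components=3, n_neighbors=30,
                          min_dist=0.0, random_state=0).fit_transform(X)
    labels = hdbscan.HDBSCAN(min_cluster_size=200).fit_predict(embedding)
    # label -1 marks noise pixels, rejected without human supervision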

The pipeline successfully identified spatially coherent clusters corresponding to specific hydrothermal alteration zones. It autonomously distinguished between structural iron-manganese anomalies and lithology-controlled copper mineralization, a nuance often missed by standard linear ratios. The metallogenic relevance of these spectral clusters was rigorously validated through field mapping and geochemical analysis using Atomic Absorption Spectroscopy (AAS). Results confirmed economic grades in the predicted zones, yielding copper concentrations up to 2.60% in propylitic alteration zones and iron-manganese oxide grades (21.94% Fe, 1.80% Mn) in tectonic corridors. Furthermore, the detection of distal barite anomalies highlights the method’s capability to map complete hydrothermal zonations.

These findings demonstrate that topological machine learning offers a robust, superior alternative to conventional remote sensing techniques for vectoring exploration targets in arid environments. By converting raw spectral data into validated metallogenic maps, this pipeline provides a scalable tool for de-risking early-stage mineral exploration in the Anti-Atlas.

How to cite: Elomairi, M. A. and El GAROUANI, A.: Automated Mineral Cluster Detection in ASTER Data Using Topological Machine Learning: A Novel Data-Driven Approach for Geological Exploration in Ait Saoun, Anti Atlas, Morocco, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-4179, https://doi.org/10.5194/egusphere-egu26-4179, 2026.

EGU26-4956 | ECS | Posters on site | GI2.1

Electromagnetic & Cone Penetration Test Data Fusion on Soil Characterization 

Dimitrios Madelis, Marios Karaoulis, and Philippe De Smedt

Defining subsurface soil conditions in complex coastal settings requires both geophysical and geotechnical datasets, each with different resolution and sensitivity. This study combined helicopter-borne electromagnetic (HEM) data, which cover large areas but offer limited vertical resolution, with cone penetration test (CPT) data, which achieve high vertical resolution but are spatially sparse owing to drilling costs. These datasets were integrated to formulate a continuous three-dimensional model of subsurface soil properties for levee risk assessment: HEM provides extensive resistivity coverage, while CPT provides high-resolution, spatially limited measurements of mechanical soil behaviour.

Resistivity as a soil property depends on many parameters (mostly water quality and soil type), and there is no straightforward method to translate it directly into soil type, hence the use of machine learning. To deal with these complexities, we employed machine learning methods – Random Forests and neural networks – to merge the heterogeneous datasets and predict continuous soil behaviour indices and discrete lithological types. We use multiple features, such as spatial coordinates, depth, distance from the coast, soil types, and local geological conditions. After pre-processing, machine-learning models were trained to fuse the datasets and ensure spatial consistency in the coastal environment. The Soil Behaviour Type Index (SBT) (Robertson, 1990) was then calculated from the CPT measurements and discretized into lithological units.

A classical machine learning algorithm (Random Forest) and a PyTorch-based neural network were trained for regression (predicting the continuous SBT index Ic) and classification (predicting soil types), and their performance was evaluated using standard statistical and visual metrics. Final models were retrained on the full dataset to increase generalizability and robustness. The final product maps Ic values and lithological classes at every HEM point, ultimately yielding a 3D subsurface soil model. Each step was validated against an 80%/20% train-test split to ensure reasonable results.

While the regression models had similar RMSE scores, the classification models generally identified dominant soil types with greater accuracy but captured fewer underrepresented mixed lithologies. This work focuses on the interpretability of soil models through data integration (i.e., spatial rather than purely statistical output) and on continuity in the spatial domain, where engineers are most concerned. The goal of this study is to develop a framework in which continuous geophysical data, collected by helicopter or drone, can be combined with geological boreholes, CPTs, and other geotechnical information to image the subsurface beyond resistivity, ultimately providing a better product to those responsible for levee design and safety.
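
A minimal sketch of the regression branch of this fusion is given below; the feature set follows the abstract, while the synthetic data and Random Forest settings are our assumptions.

    import numpy as np
    from sklearn.ensemble import RandomForestRegressor
    from sklearn.model_selection import train_test_split

    rng = np.random.default_rng(0)
    # columns: easting, northing, depth, distance to coast, log10 HEM resistivity
    X = rng.random((2000, 5))
    ic = 1.3 + 2.0 * X[:, 4] + rng.normal(0, 0.1, 2000)   # synthetic SBT index

    X_tr, X_te, y_tr, y_te = train_test_split(X, ic, test_size=0.2, random_state=0)
    reg = RandomForestRegressor(n_estimators=300, random_state=0).fit(X_tr, y_tr)
    print("held-out R^2:", round(reg.score(X_te, y_te), 3))  # 20% test split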

How to cite: Madelis, D., Karaoulis, M., and De Smedt, P.: Electromagnetic & Cone Penetration Test Data Fusion on Soil Characterization, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-4956, https://doi.org/10.5194/egusphere-egu26-4956, 2026.

EGU26-5687 | ECS | Orals | GI2.1

Application of Gaussian Mixture Models for Geochemical Anomaly Detection 

Judith Jaeger, José I. Barquero, Julio A. López-Gómez, and Pablo Higueras

Geochemical prospecting is a fundamental tool in mineral exploration. Traditionally, the interpretation of geochemical data has relied on classical statistical methods, which in many cases are univariate or linear in nature and may fail to adequately capture the complex multivariate relationships among geochemical parameters. In this context, machine learning approaches offer an alternative framework for the integrated analysis of multivariate data and the identification of hidden patterns. 

This study evaluates the application of a Gaussian Mixture Model (GMM) as an unsupervised method for the identification of geochemical anomalies of potential geological interest. The analysis was conducted on a dataset of 114 soil samples collected from the southwestern sector of the province of Ciudad Real. Before applying the GMM, an exploratory statistical analysis was performed, including the Kaiser–Meyer–Olkin (KMO) test and the Measure of Sampling Adequacy (MSA), to assess the suitability of the variables for multivariate analysis.
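
A minimal sketch of this anomaly-detection idea is shown below; the number of mixture components, the 5% likelihood cut-off, and the synthetic data are our assumptions.

    import numpy as np
    from sklearn.preprocessing import StandardScaler
    from sklearn.mixture import GaussianMixture

    # stand-in for 114 soil samples x element concentrations
    X = np.random.default_rng(0).lognormal(size=(114, 8))
    Xs = StandardScaler().fit_transform(np.log(X))

    gmm = GaussianMixture(n_components=3, covariance_type="full",
                          random_state=0).fit(Xs)
    log_dens = gmm.score_samples(Xs)
    anomalies = log_dens < np.quantile(log_dens, 0.05)  # lowest-likelihood samples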

After several experiments, the results indicate that the Gaussian Mixture Model can identify zones with anomalous values consistent with geological interest, highlighting its potential as a supportive tool in geochemical prospecting.

How to cite: Jaeger, J., Barquero, J. I., López-Gómez, J. A., and Higueras, P.: Application of Gaussian Mixture Models for Geochemical Anomaly Detection, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-5687, https://doi.org/10.5194/egusphere-egu26-5687, 2026.

EGU26-6107 | ECS | Posters on site | GI2.1

Forecasting Offshore Caisson Tilt via Deep Learning: A Numerical Simulation-Based Approach Accounting for Geotechnical Uncertainty 

Saeyon Kim, Jingi Hong, Inyoung Huh, and Heejung Youn

This study presents a comparative analysis of time-series forecasting models to predict caisson tilt using early-stage monitoring data. To establish a training dataset that accounts for inherent geotechnical uncertainty, 1,000 2D numerical simulations were performed using PLAXIS2D, based on an actual design case in South Korea. To incorporate spatial variability, the subsurface was discretized into 61 independent zones: Deep Cement Mixing (33 zones), foundation rubble (6 zones), backfill rubble (10 zones), and underlying heaving soil (12 zones). Geotechnical parameters, including elastic modulus (E), undrained shear strength (Su), and interface strength reduction factor (Rinter), were varied by up to 50% of their design values. Latin Hypercube Sampling (LHS) was used to assign geotechnical properties to each zone. Each case simulated a 28-stage construction sequence, with caisson tilt extracted at each stage to generate time-series data.
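
The sketch below illustrates the sampling step with SciPy's quasi-Monte Carlo module; the design values are hypothetical placeholders, with only the 61 zones, the 1,000 cases, and the +/-50% range taken from the text.

    import numpy as np
    from scipy.stats import qmc

    n_zones, n_cases = 61, 1000
    sampler = qmc.LatinHypercube(d=n_zones, seed=0)
    u = sampler.random(n=n_cases)                 # LHS design in the unit cube

    E_design = np.full(n_zones, 50e3)             # hypothetical design moduli (kPa)
    E_cases = qmc.scale(u, 0.5 * E_design, 1.5 * E_design)  # +/-50% of design value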

Four forecasting models, ARIMA, LSTM, Temporal Convolutional Network (TCN), and an encoder-only Transformer, were evaluated. The dataset was split into 680 simulations for training, 170 for validation, and 150 for testing. Forecasting performance was assessed across varying initial observation lengths (cut = 3, 5, 10, 15, and 20 stages) to predict all remaining future stages. Results indicate that the statistical baseline (ARIMA) showed consistently high errors regardless of observation length, with RMSE values of approximately 0.09 at cut = 3 and 0.08 at cut = 10. In contrast, deep learning models exhibited clear error reductions as more initial observations became available. Among the tested models, the TCN achieved the highest accuracy, with RMSE values of approximately 0.006 at cut = 10 and 0.004 at cut = 15. The encoder-only Transformer also maintained stable performance for cut ≥ 10, with RMSE values below 0.01.

Acknowledgements

This work was supported by the National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIT) (No. 2023R1A2C1007635).

How to cite: Kim, S., Hong, J., Huh, I., and Youn, H.: Forecasting Offshore Caisson Tilt via Deep Learning: A Numerical Simulation-Based Approach Accounting for Geotechnical Uncertainty, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-6107, https://doi.org/10.5194/egusphere-egu26-6107, 2026.

EGU26-6766 | ECS | Posters on site | GI2.1

Rapid Bayesian Geophysical Inversion Using General Geophysical Neural Operator 

Heng Zhang and Yixian Xu

Bayesian inversion provides a rigorous framework for uncertainty quantification in geophysics, but is often computationally prohibitive due to the reliance on Markov Chain Monte Carlo (MCMC) sampling, which requires massive numbers of forward simulations. While deep learning surrogate models offer acceleration, existing architectures (e.g., CNNs, FNO, DeepONet) often struggle with fixed discretization constraints and cannot flexibly handle the irregular observation coordinates typical in field surveys.

To address these challenges, we propose the General Geophysical Neural Operator (GGNO), a novel Transformer-based architecture designed for mesh-independent operator learning. This design fulfills three fundamental requirements for forward solvers in the context of practical inversion: (1) Discretization-invariant, allowing the processing of input models with different mesh resolutions; (2) Prediction-free, enabling direct solution querying at arbitrary spatio-temporal coordinates; and (3) Domain-independent, decoupling input and output discretizations. 

We validate GGNO on Magnetotelluric (MT) forward modeling, demonstrating exceptional generalization while achieving accuracy two orders of magnitude higher than traditional methods. By integrating GGNO into a Bayesian framework, we achieve highly efficient MCMC sampling, reducing the computational time from tens of days to a few minutes, which allows for a comprehensive exploration of the posterior distribution. Applied to field data, this approach successfully recovers complex subsurface resistivity structures with rigorous uncertainty bounds. These results highlight GGNO's potential to enable high-precision subsurface imaging and robust probabilistic interpretation for complex geophysical exploration.

How to cite: Zhang, H. and Xu, Y.: Rapid Bayesian Geophysical Inversion Using General Geophysical Neural Operator, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-6766, https://doi.org/10.5194/egusphere-egu26-6766, 2026.

EGU26-7774 | ECS | Posters on site | GI2.1

Probabilistic Reconstruction of Sentinel-2 Satellite Image Time Series Using Multi-Sensor Gaussian Process Models 

Bastien Nespoulous, Alexandre Constantin, Dawa Derksen, and Veronique Defonte

Satellite Image Time Series (SITS) are a cornerstone of Earth observation, enabling long-term monitoring of environmental processes such as vegetation dynamics, land-use change, and natural hazards. However, optical satellite time series, including Sentinel-2, are frequently irregular and incomplete due to cloud cover, atmospheric effects, and acquisition constraints, which strongly limit their usability in operational monitoring systems. In contrast, Sentinel-1 Synthetic Aperture Radar (SAR) provides regular observations in all weather conditions and offers complementary information for mitigating optical sensor limitations. Generating dense and reliable Sentinel-2 time series from multi-sensor observations therefore remains a critical challenge.

This work investigates Gaussian Process (GP) based statistical models for the reconstruction and densification of Sentinel-2 image time series by jointly exploiting Sentinel-1 and Sentinel-2 data. Gaussian Processes offer a flexible Bayesian framework for pixel interpolation and extrapolation. We explore GP formulations capable of handling irregular temporal sampling, multi-output dependencies, and latent variable structures, enabling the fusion of heterogeneous optical and radar observations.
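
As a simplified, single-output illustration of this idea, the sketch below interpolates one pixel's NDVI series with a GP and returns predictive uncertainty; the kernel choice and synthetic data are our assumptions, and the multi-output, multi-sensor extensions discussed below are not shown.

    import numpy as np
    from sklearn.gaussian_process import GaussianProcessRegressor
    from sklearn.gaussian_process.kernels import RBF, WhiteKernel

    rng = np.random.default_rng(0)
    t = np.sort(rng.uniform(0, 365, 25))           # irregular cloud-free dates
    y = 0.5 + 0.3 * np.sin(2 * np.pi * t / 365) + rng.normal(0, 0.02, t.size)

    kernel = 1.0 * RBF(length_scale=30.0) + WhiteKernel(noise_level=1e-3)
    gp = GaussianProcessRegressor(kernel=kernel, normalize_y=True).fit(t[:, None], y)
    t_dense = np.linspace(0, 365, 200)[:, None]
    mean, std = gp.predict(t_dense, return_std=True)   # dense series + uncertainty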

An in-depth analysis of the state-of-the-art is conducted, covering multi-output Gaussian Processes, sparse and variational approximations for scalability, latent variable models (including hierarchical GP-LVMs), and inverse GP approaches based on shared latent spaces. These methods are evaluated with respect to three key challenges: ensuring spatio-temporal coherence of reconstructed images, fusing asynchronous multi-sensor observations, and maintaining computational tractability for large-scale satellite datasets.

To support experimental investigations, a representative multi-regional dataset is constructed over mainland France and overseas territories, capturing diverse climatic patterns, land-cover types, and cloud conditions, including extreme events such as flooding. 

This study establishes the methodological foundations for reconstructing dense Sentinel-2 time series conditioned on Sentinel-1 observations, with explicit uncertainty quantification. By leveraging Sentinel-1 data, the approach effectively imputes missing Sentinel-2 values while providing consistent average pixel estimates with associated uncertainty, which is critical for geoscience applications. The proposed framework contributes toward more robust Earth observation monitoring systems and the development of reliable geospatial digital twins.

How to cite: Nespoulous, B., Constantin, A., Derksen, D., and Defonte, V.: Probabilistic Reconstruction of Sentinel-2 Satellite Image Time Series Using Multi-Sensor Gaussian Process Models, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-7774, https://doi.org/10.5194/egusphere-egu26-7774, 2026.

EGU26-7860 | Posters on site | GI2.1

Densification and forecasting of Sentinel-2 time series from multimodal SAR and optical satellite data using deep generative models 

Véronique Defonte, Dawa Derksen, Alexandre Constantin, and Bastien Nespoulous

Sentinel-2 optical image time series are a key source of information for many Earth observation applications, including climate monitoring, agriculture, ecosystem dynamics, and land surface change analysis. Dense and regular observations are essential to accurately capture seasonal patterns, abrupt events, and long-term trends. However, in practice, Sentinel-2 time series are often sparse and irregular due to cloud cover and varying acquisition conditions. These limitations significantly complicate continuous monitoring and the analysis of surface dynamics. Moreover, beyond time series densification, there is a growing need to anticipate future optical observations to support scenario analysis, early warning systems, and predictive environmental monitoring.

To address these challenges, we propose a deep learning–based framework for densifying Sentinel-2 time series by generating plausible optical images at arbitrary past or future dates. The approach relies on multimodal satellite observations, jointly exploiting optical Sentinel-2 and radar Sentinel-1 data. Indeed, SAR measurements are insensitive to cloud cover and provide complementary structural and temporal information. This multimodal setting enables the reconstruction of missing observations and the prediction of future optical states while preserving realistic spatio-temporal dynamics.

From a methodological perspective, the model is explicitly designed to handle sparse, incomplete, and temporally misaligned multimodal time series. It operates on temporal sets of Sentinel-2 and Sentinel-1 images acquired at irregular dates around a target time. A cross-attention mechanism is used to explicitly model interactions across time and modalities, allowing the network to identify and weight the most relevant observations for generating a Sentinel-2 image at a given target date.
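
A minimal sketch of such a fusion step with PyTorch's built-in attention module is shown below; the embedding dimension, sequence lengths, and tensor shapes are our assumptions.

    import torch
    from torch import nn

    d_model, n_heads = 256, 8
    attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)

    # query: embedding of the target date (B, 1, d); keys/values: embeddings of
    # the irregular Sentinel-1/2 acquisitions around it (B, T, d)
    q = torch.randn(2, 1, d_model)
    kv = torch.randn(2, 12, d_model)
    fused, weights = attn(q, kv, kv)   # weights show which acquisitions dominate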

In addition, the proposed framework incorporates a probabilistic decoder that estimates not only the predicted Sentinel-2 image but also an associated uncertainty map. This uncertainty estimation provides valuable insight into the confidence of the generated pixels, which is particularly important for downstream applications such as anomaly detection, risk assessment, and decision-making support.

The model is evaluated across multiple geographical regions and land-cover types, demonstrating strong performance in both densification and forecasting tasks. Results show that the proposed approach successfully preserves the temporal dynamics of the scenes, notably by accurately reproducing vegetation phenology as reflected in NDVI time series. Forecasting experiments further highlight the importance of radar information: Sentinel-1 observations close to the target date allow the model to detect surface changes occurring after the last available optical image, thereby improving future predictions. Overall, the proposed method represents a step towards the densification and forecasting of Sentinel-2 time series, offering a promising direction for future methodologies aimed at continuous Earth surface monitoring and predictive analysis.

How to cite: Defonte, V., Derksen, D., Constantin, A., and Nespoulous, B.: Densification and forecasting of Sentinel-2 time series from multimodal SAR and optical satellite data using deep generative models, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-7860, https://doi.org/10.5194/egusphere-egu26-7860, 2026.

EGU26-8542 | Orals | GI2.1

A Big Data and Text Mining–Based Media Analysis Framework for Disaster Cause Investigation 

Jin Eun Kim, Heeyoung Shin, and Sengyong Choi

As similar disasters and accidents continue to occur, public concern about the limitations of existing disaster response systems and the need for institutional improvement is increasing. The National Disaster Management Research Institute of Korea conducts disaster cause investigations as part of its statutory responsibilities, examining problems observed before and after disasters, institutional weaknesses, and public demands for improvement. In this context, news data provide valuable unstructured information that reflects on-site conditions, response activities, policy debates, and public opinion, and thus complement official investigation records in understanding institutional and managerial factors related to disasters.


This study aims to develop a media analysis framework based on big data and text mining for use in disaster cause investigations. Disaster-related news articles were first collected, and a large language model (Gemini) was applied to identify and extract sentences that describe problems and suggested improvements in the stages of disaster occurrence and response. The extracted sentences were then processed using natural language processing techniques, including stopword removal and the merging of duplicate and semantically similar sentences. Based on semantic similarity, the remaining sentences were grouped to organize major issues. In addition, nouns were extracted and their frequencies were analyzed by year to identify key terms and to examine changes in topics emphasized in media coverage.

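One plausible realization of the similarity-based grouping step (the library and embedding model below are assumptions, not the authors' stated stack) is to embed the extracted sentences and cluster them by cosine distance:

```python
# Embed extracted sentences and group them by semantic similarity.
from sentence_transformers import SentenceTransformer
from sklearn.cluster import AgglomerativeClustering

sentences = [
    "The underpass was not closed before flooding began.",      # toy examples
    "Pre-closure criteria for the underpass were not applied.",
    "The river embankment had not been adequately maintained.",
]

model = SentenceTransformer("paraphrase-multilingual-MiniLM-L12-v2")  # assumed
emb = model.encode(sentences, normalize_embeddings=True)

labels = AgglomerativeClustering(
    n_clusters=None, distance_threshold=0.4, metric="cosine", linkage="average"
).fit_predict(emb)
print(labels)   # the two underpass sentences are expected to share a cluster
```
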
Applying the proposed framework to the disaster cause investigation of the 2023 Osong Underpass Flooding Disaster conducted in 2025, we identified 21 problem items grouped into seven categories, such as insufficient pre-closure of the underpass and inadequate maintenance of river embankments. In addition, 17 improvement measures in six categories were derived, systematically organized, and proposed, including improvements to underpass closure criteria and flood risk grading, as well as strengthened river management practices. The results indicate that combining news big data, text mining, and large language models can effectively structure key issues and institutional weaknesses, and can serve as a useful analytical tool for strengthening the evidence base and explanatory power of disaster cause investigations.

How to cite: Kim, J. E., Shin, H., and Choi, S.: A Big Data and Text Mining–Based Media Analysis Framework for Disaster Cause Investigation, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-8542, https://doi.org/10.5194/egusphere-egu26-8542, 2026.

EGU26-10174 | ECS | Orals | GI2.1

A global–hierarchical–categorical alignment framework to address sample scarcity and domain shift in crop mapping 

Jingya Yang, Qiong Hu, Mariana Belgiu, and Wenbin Wu

The scarcity and high acquisition cost of field crop samples remain a major bottleneck for applying Artificial Intelligence (AI)–driven supervised learning methods in large-scale geoscientific applications such as crop type mapping. Meanwhile, crop phenology and, consequently, the spectro-temporal characteristics of the same crop type present significant interannual and regional variations due to differences in local conditions and human activities, such as climate, soil properties, and farming practices. This causes the “domain shift” challenge. Therefore, directly applying a classification model trained in a specific region and year to a new region or year inevitably leads to poor prediction performance. The gap between the abundant availability of Earth observation imagery and the limited accessibility of training crop samples hinders efficient mapping of varied crop types across large regions.

To address training sample scarcity and cross-region/year domain shift in large-scale crop type mapping, we propose a transferable crop mapping method named Global-Hierarchical-Categorical feature Alignment (GHCA). GHCA integrates unsupervised domain adaptation, contrastive learning, and pseudo-labeling to achieve multi-dimensional alignment between the source domain and the target domain at the global, hierarchical, and categorical levels. The developed method enables accurate and transferable crop mapping across diverse agricultural landscapes with minimal field survey requirements. The main contributions of our study can be summarized as follows: (1) A global feature pre-alignment mechanism is introduced by calculating the Multi-Kernel Maximum Mean Discrepancy (MK-MMD) metric across different hierarchical features to align source and target domains in global and hierarchical feature spaces. This mechanism substantially improves the initial reliability of pseudo-labels generated for the target domain, providing a reliable foundation for subsequent fine-grained categorical-level feature alignment. (2) A robust pseudo-label generation strategy is developed by jointly considering prediction confidence, prediction certainty, and prediction stability. Reliable pseudo-labels for the target domain are selected by calculating model prediction probabilities and predictive uncertainty estimates through a teacher–student model. Moreover, the Exponential Moving Average (EMA) strategy is adopted to update model parameters in the teacher path, yielding more stable pseudo-labels. (3) Category-wise feature alignment is achieved by integrating pseudo-labeling with contrastive learning, which explicitly pulls intra-class features closer for the same crop types across source and target domains, while pushing inter-class features apart for different crop types.

The effectiveness of the proposed GHCA method for both cross-region and cross-year crop mapping was evaluated across five regions in China and the U.S. over a two-year timeframe. GHCA was compared with a machine learning method (RF), supervised deep learning models (DCM, Transformer, and PhenoCropNet), and transfer learning methods (DACCN, PAN, and CSTN) for cross-year and cross-region crop mapping. Experimental results showed that GHCA outperformed other models in most transfer cases, with overall accuracy (OA) ranging from 0.82 to 0.95 (cross-region) and 0.89 to 0.98 (cross-year), achieving an average OA increase of 6.2% and 3.5% in cross-region and cross-year experiments, respectively.
These results highlight the strong potential of advanced AI methodologies to deliver robust, quantitative, and transferable solutions for complex geoscientific problems using large Earth observation datasets.
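
For readers unfamiliar with the alignment metric, a minimal sketch of an MK-MMD loss between source- and target-domain feature batches follows (a standard multi-kernel Gaussian formulation; bandwidths are illustrative and not taken from the study):

```python
# Multi-kernel MMD between source- and target-domain feature batches.
import torch

def mk_mmd(source, target, sigmas=(1.0, 2.0, 4.0, 8.0)):
    x = torch.cat([source, target], dim=0)
    d2 = torch.cdist(x, x) ** 2                     # pairwise squared distances
    k = sum(torch.exp(-d2 / (2 * s**2)) for s in sigmas) / len(sigmas)
    n = source.size(0)
    k_ss, k_tt, k_st = k[:n, :n], k[n:, n:], k[:n, n:]
    return k_ss.mean() + k_tt.mean() - 2 * k_st.mean()

alignment_loss = mk_mmd(torch.randn(32, 128), torch.randn(32, 128))
```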

How to cite: Yang, J., Hu, Q., Belgiu, M., and Wu, W.: A global–hierarchical–categorical alignment framework to address sample scarcity and domain shift in crop mapping, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-10174, https://doi.org/10.5194/egusphere-egu26-10174, 2026.

This study introduces an innovative methodology for generating realistic soil prediction maps that visualise the spatial distribution of specific chemicals, achieved through the rigorous evaluation and comparison of advanced modelling techniques, including approaches based on neural networks and multilayer perceptrons (MLPs). The Drava River floodplain was selected as the primary case study based on stringent criteria: a) intensive historical metal ore mining and metallurgical processing activities, which have left a legacy of contamination; b) distinctive geomorphological features, such as dynamic floodplains and sediment deposition zones; and c) diverse geological settings that facilitate reliable model calibration across transboundary reaches. Soil measurements were integrated with diverse geospatial datasets—derived from Digital Elevation Models (DEMs), land cover classifications, and remote sensing imagery—to enable high-resolution mapping of contaminant distributions via predictive modelling powered by neural networks and MLPs. A novel, holistic approach was applied to simultaneously reconstruct multiple influencing processes, including erosion, sediment transport, and pollutant dispersion, across the entire study area. This comprehensive framework not only advances contamination mapping practices but also enables the developed models to trace primary distribution pathways, quantify the true extent of affected zones, enhance data interpretability, and inform evidence-based decisions on land-use planning, remediation strategies, and environmental management in mining-impacted regions.

How to cite: Alijagić, J. and Šajn, R.: Advanced AI Soil Mapping Techniques and Transboundary Risk Assessment for the Drava River Floodplain, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-10465, https://doi.org/10.5194/egusphere-egu26-10465, 2026.

EGU26-11012 | Posters on site | GI2.1

Deep-learning based large-scale automated observation of earthquake surface ruptures 

Xin Liu, Shirou Wang, Xuhua Shi, Cheng Su, Yann Klinger, Arthur Delorme, Haibing Li, Jiawei Pan, and Hanlin Chen

Rapid and objective mapping of co-seismic surface ruptures is essential for post-earthquake impact assessment and for improving our understanding of fault geometry, stress transfer, and rupture processes that inform longer-term seismic hazard analyses. However, rupture mapping has traditionally relied on manual interpretation of field observations or remote-sensing data, which is time-consuming and difficult to extend consistently to large spatial extents, multiple earthquakes, and diverse data sources. Here we present an automated deep-learning framework—the Deep Rupture Mapping Network (DRMNet)—a convolutional neural network designed for end-to-end, high-precision detection of co-seismic surface ruptures from multi-sensor imagery. DRMNet is applied to four large continental earthquakes: the 2021 Mw 7.4 Maduo, 2022 Mw 6.9 Menyuan, 2001 Mw 7.8 Kokoxili, and 1905 Mw ~8 Bulnay (Mongolia) events. The framework consistently delineates both primary and subsidiary rupture structures across centimetre-scale drone imagery and metre-scale satellite data. Across diverse tectonic settings, image resolutions, and preservation states, DRMNet achieves precisions approaching or exceeding 90%. By enabling consistent rupture recognition across multiple events, sensors, and timescales, the proposed framework overcomes the event-specific and local-scale limitations of previous approaches, supporting both rapid post-earthquake response and retrospective rupture reconstruction, and laying the groundwork for standardized global surface-rupture inventories.

How to cite: Liu, X., Wang, S., Shi, X., Su, C., Klinger, Y., Delorme, A., Li, H., Pan, J., and Chen, H.: Deep-learning based large-scale automated observation of earthquake surface ruptures, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-11012, https://doi.org/10.5194/egusphere-egu26-11012, 2026.

EGU26-12550 | ECS | Posters on site | GI2.1

Identifying zircon provenances using domain-adversarial neural network 

Mengwei Zhang, Guoxiong Chen, Timothy Kusky, Mark Harrison, Qiuming Cheng, and Lu Wang
Zircon trace element geochemistry is a pivotal tool for unraveling petrogenesis and the evolutionary history of the Earth’s crust. While two-dimensional (2D) discriminant diagrams are conventionally used to identify parent rock types, the emergence of machine learning (ML) has introduced a transformative research paradigm. ML not only enhances classification accuracy but also resolves the inherent ambiguities found in traditional geochemical diagrams. However, the reliability of current ML models typically depends on the vast archives of labeled samples from the Phanerozoic. When extending research to “deep-time” samples, such as Hadean zircons, the scarcity of labeled data often forces researchers to rely on models trained exclusively on Phanerozoic datasets. This approach is prone to misclassification due to “domain shift,” caused by systematic variations in zircon trace element distributions across different geological eons.

To address this challenge, we propose a Domain Adversarial Neural Network (DANN) framework tailored for zircon trace element analysis. By aligning the feature distributions of the source domain (Phanerozoic) and the target domain (Precambrian), the DANN extracts “domain-invariant yet geologically significant” high-dimensional feature representations, effectively mitigating the effects of temporal data bias. Our results demonstrate that DANN significantly outperforms traditional machine learning methods across multiple performance metrics. Furthermore, t-SNE visualization confirms that the source and target domains are effectively aligned within the feature space. When applied to ~4.3 Ga zircon samples from the Jack Hills, the model achieved a classification accuracy of 0.923. This high level of performance underscores the framework’s exceptional generalization capability for identifying unlabeled deep-time samples and its potential for broader applications in Precambrian geology. This study develops a transferable, data-driven framework for inferring deep-time geological processes, providing a novel methodology to address the limitations inherent in the traditional principle of uniformitarianism. Furthermore, the framework is extensible to other mineral systems (e.g., apatite, monazite), thereby opening new avenues for quantitatively reconstructing the dynamic evolution of the early Earth.
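
The core DANN ingredient is a gradient reversal layer; a minimal sketch follows (standard construction, with an illustrative adversarial weight, not the study's configuration):

```python
# Gradient reversal: identity forward, negated (scaled) gradient backward.
import torch

class GradReverse(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x, lam):
        ctx.lam = lam
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        # The feature extractor is pushed toward eon-invariant representations
        # while the domain head tries to tell Phanerozoic from Precambrian.
        return -ctx.lam * grad_output, None

features = torch.randn(16, 32, requires_grad=True)   # zircon feature batch
domain_logits = torch.nn.Linear(32, 2)(GradReverse.apply(features, 1.0))
```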

How to cite: Zhang, M., Chen, G., Kusky, T., Harrison, M., Cheng, Q., and Wang, L.: Identifying zircon provenances using domain-adversarial neural network, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-12550, https://doi.org/10.5194/egusphere-egu26-12550, 2026.

EGU26-13366 | ECS | Orals | GI2.1

Spatial Downscaling of Land Surface Temperature Using Sentinel-2 and Sentinel-3 Data Fusion for Agricultural Applications 

Bouchra Boufous, Fatima Ben zhair, and Salwa Belaqziz

Land surface temperature (LST) is a key variable for assessing crop thermal stress and supporting precision agriculture. However, thermal satellite products often involve a trade-off between spatial and temporal resolution. Sentinel-3 provides frequent LST observations, but its coarse spatial resolution limits its use for field-scale agricultural monitoring.

This study proposes a spatial downscaling approach for LST based on the fusion of Sentinel-3 thermal data with high-resolution multispectral information from Sentinel-2. The method exploits the inverse relationship between surface temperature and vegetation cover through the Normalized Difference Vegetation Index (NDVI). A linear regression model was developed to estimate LST at a spatial resolution of 10 m using Sentinel-2 NDVI as the primary predictor.
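
The downscaling logic can be summarized in a few lines (a sketch consistent with the description above; array names, shapes, and values are illustrative):

```python
# Fit LST = a + b * NDVI at Sentinel-3 resolution, apply at Sentinel-2 10 m.
import numpy as np

def downscale_lst(lst_coarse, ndvi_coarse, ndvi_fine):
    # np.polyfit with deg=1 returns [slope, intercept]; the slope is expected
    # to be negative, reflecting the inverse temperature-vegetation relation.
    b, a = np.polyfit(ndvi_coarse.ravel(), lst_coarse.ravel(), deg=1)
    return a + b * ndvi_fine

lst_10m = downscale_lst(
    lst_coarse=np.random.uniform(300, 320, (50, 50)),    # placeholder S3 LST [K]
    ndvi_coarse=np.random.uniform(0.1, 0.8, (50, 50)),   # NDVI aggregated to S3
    ndvi_fine=np.random.uniform(0.1, 0.8, (500, 500)),   # 10 m Sentinel-2 NDVI
)
```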

The approach was applied over the agricultural site of El Ghaba in the Marrakech–Safi region (Morocco), covering different crop types, including annual cereals (barley, wheat, and kerenza) and perennial olive orchards. Results show a clear negative correlation between NDVI and LST, confirming the regulatory role of vegetation on surface temperature. The downscaled LST maps reveal fine-scale spatial heterogeneity that is not detectable in the original Sentinel-3 product.

Quantitative evaluation indicates low absolute errors for annual crops (generally below 0.5 °C), demonstrating the robustness of the proposed method, while higher discrepancies observed for olive orchards highlight the complexity of perennial crop thermal behavior. This work enhances the spatial usability of satellite thermal data for agricultural monitoring and crop stress assessment.

How to cite: Boufous, B., Ben zhair, F., and Belaqziz, S.: Spatial Downscaling of Land Surface Temperature Using Sentinel-2 and Sentinel-3 Data Fusion for Agricultural Applications, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-13366, https://doi.org/10.5194/egusphere-egu26-13366, 2026.

EGU26-13941 | Orals | GI2.1

Why Automated Mineralogy needed an upgrade 

Rich Taylor

Automated Mineralogy – the past

The automated classification of mineral phases in rocks has been a mainstay of the Geoscience analytical community for over 40 years. While we have seen great leaps forward in AI in µCT and light microscopy/petrography, the automated capabilities of the SEM have progressed and changed very little in decades, relying heavily on methods that were available at the time but have since become outdated.

The technology comes with several significant problems moving forward, including excessive hardware-software dependencies, complex mineral libraries and classifications, an inconsistent user experience, and workflows that are difficult to adapt outside their intended use.

 

Recent technological advances

There are two broad shifts that are taking place across a number of microscopy and microanalysis techniques – the acquisition of more quantitative data, and the application of deep learning neural networks. As a general trend this can be thought of as building better datasets, and building bigger datasets.

EDS as a SEM-based technique is fertile territory for both of these shifts. As an analytical technique EDS is commonly applied qualitatively, or as an image based method for distinguishing regions based on chemical maps. In recent years it has become easier than ever before to calibrate systems and detectors for concentration data, meaning the SEM can generate more robust datasets without having to fall back on other techniques.

Deep Learning is a topic that covers a broad range of mathematical applications, from the acquisition of microscopy datasets through to data processing and interpretation across almost all sciences. There are many different flavours of deep learning neural network (DLNN), and each type lends itself to different applications, particularly in the varied, data-rich environments of microscopy. It is inherently hard to track exactly how DLNNs operate, but at their best they should be easy to use, and it should be easy to understand how they have been applied to a scientific problem.

 

Automated Mineralogy – the future

The introduction of both quantitative mineral chemistry and DLNNs to automated mineral classification is a huge leap forward, solving many of the problems of traditional software. Detaching data acquisition from processing removes software dependencies and frees users to build their ideal system. A DLNN-driven, unsupervised data processing approach can be data led rather than user led, making it more robust and consistent across instruments and facilities. Quantitative analysis can build on the DLNN approach by allowing a “best fit” classification, removing the need for constant modification of mineral libraries, and simply allowing “textbook” globally consistent mineral compositions to drive the labelling of segmented data.

How to cite: Taylor, R.: Why Automated Mineralogy needed an upgrade, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-13941, https://doi.org/10.5194/egusphere-egu26-13941, 2026.

EGU26-14415 | ECS | Orals | GI2.1

Multi-agent Geochemical Literature Data Mining System 

Tianyu Yang, Karim Elezabawy, Daniel Kurzawe, Leander Kallas, Marie Traun, Bärbel Sarbas, Adrian Sturm, Stefan Möller-McNett, Matthias Willbold, and Gerhard Wörner

The increasing volume and complexity of geochemical literature pose major challenges for the sustainable curation of domain-specific databases such as GEOROC (Geochemistry of Rocks of the Oceans and Continents), the world’s largest repository of geochemical and isotopic data from igneous and metamorphic rocks and minerals, aggregating more than 41 million values from over 23,000 publications. Although GEOROC underpins a wide range of geoscientific research, the extraction and harmonization of metadata from publications still relies heavily on manual effort, which significantly limits the scalability.

In this contribution, we present a novel information extraction architecture that moves beyond linear processing pipelines toward a Large Language Model (LLM)-based multi-agent system combining document layout analysis, schema-driven reasoning, and modality-aware extraction. Unlike generic LLM approaches that treat documents as continuous text streams, our architecture adopts a "Visual-First" strategy. We utilize a layout-aware backbone (MinerU, Niu et al., 2025) to decompose PDF manuscripts into a sequence of geometrically grounded primitive blocks, each representing a localized document region with associated visual and typographic features, preserving the geometric grounding essential for interpreting complex data tables. A routing agent subsequently validates and refines the initial layout classification, dynamically dispatching blocks to specialized downstream agents for text, table, or figure processing. This adaptive routing strategy improves robustness against layout variability across journals, publication years, and formatting styles.

Central to the framework is an active schema agent that operationalizes the GEOROC metadata model. Rather than treating the database schema as a static template, this agent continuously provides extraction targets, normalization rules, unit standards, and conflict-resolution policies that guide all subsequent processing steps. Text blocks are handled by an Optical Character Recognition (OCR)-driven information extraction agent, table blocks by a table parsing agent capable of reconstructing complex table structures, and figure blocks by a visual reasoning agent designed to interpret diagrams and digitize plotted values. Each agent produces structured candidate values enriched with confidence estimates and fine-grained provenance, including page-level and bounding-box references to the original document.

The outputs of these modality-specific agents are consolidated by a merge-and-judge agent, which goes beyond simple aggregation. This agent performs cross-modal arbitration, unit harmonization, and deduplication, resolving conflicts between heterogeneous sources according to schema-defined priorities and data-quality criteria. The final result is a machine-readable JSON representation that preserves both extracted values and their evidential context.
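
Schematically, the routing stage can be pictured as follows (names, fields, and agent interfaces are purely illustrative; the system's actual implementation is not published here):

```python
# Illustrative dispatch of layout blocks to modality-specific agents.
from dataclasses import dataclass

@dataclass
class Block:
    kind: str        # initial layout label: "text" | "table" | "figure"
    content: bytes   # pixels or extracted text of the region
    bbox: tuple      # (page, x0, y0, x1, y1), kept for provenance

def text_agent(b):   return {"values": [], "provenance": b.bbox}
def table_agent(b):  return {"values": [], "provenance": b.bbox}
def figure_agent(b): return {"values": [], "provenance": b.bbox}

AGENTS = {"text": text_agent, "table": table_agent, "figure": figure_agent}

def route(blocks):
    # A routing agent would validate/refine b.kind before dispatch; candidates
    # are afterwards consolidated by the merge-and-judge step.
    return [AGENTS[b.kind](b) for b in blocks]

candidates = route([Block("table", b"", (1, 10, 20, 110, 80))])
```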

By combining layout grounding, adaptive routing, schema-driven reasoning, and judgment-based integration, this system delivers a robust and extensible approach to large-scale metadata extraction. The framework substantially supports the curation process and strengthens GEOROC’s role as a FAIR-compliant reference infrastructure by enabling more efficient reuse of published geochemical data in future geochemical research.

References:

Niu, J., Liu, Z., Gu, Z., Wang, B., Ouyang, L., Zhao, Z., ... & He, C. (2025). MinerU2.5: A decoupled vision-language model for efficient high-resolution document parsing. arXiv preprint arXiv:2509.22186.

How to cite: Yang, T., Elezabawy, K., Kurzawe, D., Kallas, L., Traun, M., Sarbas, B., Sturm, A., Möller-McNett, S., Willbold, M., and Wörner, G.: Multi-agent Geochemical Literature Data Mining System, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-14415, https://doi.org/10.5194/egusphere-egu26-14415, 2026.

EGU26-14874 | ECS | Posters on site | GI2.1

Quartz grain microtexture analysis using Artificial Intelligence: application to tsunami and storm deposits provenance studies 

Natércia Marques, Pedro Costa, and Pedro Pina

Quartz grain surface microtextures observed by scanning electron microscopy (SEM) provide important information on sediment transport history, depositional processes and sediment provenance. Traditionally, the interpretation of these features has relied upon qualitative visual assessment—an approach deeply rooted in expert judgement and cumulative experience. While fundamental, this methodology is inherently susceptible to subjectivity and inter-analyst variability. To counterbalance this problem, we explore image-based classification approaches (utilizing Deep Learning frameworks) as a tool to support quartz microtextural analysis and assist in the identification of likely depositional environments, thus establishing sediment provenance relationships.

A dataset of 3,367 SEM images was compiled, spanning a diverse range of sedimentary contexts: aeolian dunes, beach faces, alluvial systems, basal sands, and nearshore environments, along with high-energy deposits from storm and tsunami events. Based on this dataset, five classification models were developed. Three were designed to discriminate between the full set of seven depositional classes, while two focused on a reduced classification scheme comprising four classes (alluvial, beach, dune and nearshore). All models were optimised using an increasing number of training epochs to assess the stability and evolution of classification performance. The results obtained were further examined in comparison with SandAI, an existing tool for microtexture classification, to evaluate its behaviour when applied to new sedimentary contexts and datasets acquired under different conditions.
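
For illustration, such a classifier could be set up as a fine-tuned convolutional backbone (the backbone choice below is an assumption; the abstract does not specify the QzTexNet architecture):

```python
# Fine-tune an ImageNet-pretrained backbone on SEM patches (illustrative).
import torch
import torchvision

NUM_CLASSES = 7  # dune, beach, alluvial, basal, nearshore, storm, tsunami

model = torchvision.models.resnet18(weights="IMAGENET1K_V1")
model.fc = torch.nn.Linear(model.fc.in_features, NUM_CLASSES)

optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
criterion = torch.nn.CrossEntropyLoss()

x = torch.randn(8, 3, 224, 224)                  # a batch of SEM image patches
loss = criterion(model(x), torch.randint(0, NUM_CLASSES, (8,)))
loss.backward()
optimizer.step()
```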

The most consistent classification results were obtained for environments characterised by well-preserved and distinctive mechanical microtextures (e.g. aeolian sediments). Conversely, while environments defined by overlapping processes occasionally yielded higher nominal accuracies in QzTexNet (CNN-based models developed within the scope of this work), this is potentially attributed to their over-representation in the dataset. Analysis of classification outcomes indicates that microtextural overprinting, dataset imbalance and variations in image quality reduced the visibility of diagnostic features, thereby complicating the differentiation of depositional settings. Nevertheless, the data suggests that our models successfully capture sedimentologically meaningful patterns when surface textures remain clear. While SandAI showed stable performance within its original scope, its accuracy was limited, peaking at 47% for its target environments and dropping significantly when faced with complex deposits like tsunami or nearshore grains. In contrast, the newly developed QzTexNet models showed slightly more encouraging results, reaching accuracies of around 55% and demonstrating a steady improvement through successive refinements.

Ultimately, these findings demonstrate that automated classification offers a powerful complement to traditional analysis, particularly in ensuring reproducibility across large-scale datasets. Solely based on our database, it was observed that challenges regarding dataset equilibrium and textural complexity persist; nevertheless, targeted methodological refinements and supervised training hold significant potential. Such advancements represent a promising frontier in sedimentary provenance studies, particularly for the rigorous identification of deposits linked to extreme geological events.

This work is supported by FCT, I.P./MCTES through national funds (PIDDAC): LA/P/0068/2020 (https://doi.org/10.54499/LA/P/0068/2020), UID/50019/2025 (https://doi.org/10.54499/UID/PRR/50019/2025), and UID/PRR2/50019/2025. Finally, this work is a contribution to project iCoast (project 14796 COMPETE2030-FEDER-00930000).

How to cite: Marques, N., Costa, P., and Pina, P.: Quartz grain microtexture analysis using Artificial Intelligence: application to tsunami and storm deposits provenance studies, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-14874, https://doi.org/10.5194/egusphere-egu26-14874, 2026.

Wildfires are increasingly reshaping landscapes across the U.S., disrupting hydrogeologic processes such as runoff, infiltration, and sediment transport—posing major challenges for streamflow prediction and water resource management. Traditional conceptual and physically based hydrologic models often struggle to capture these disturbance-driven dynamics. In this study, we explore the potential of long short-term memory (LSTM) networks, a type of recurrent neural network, to simulate post-fire streamflow across 1,082 fire-affected basins spanning the contiguous U.S.—representing the first near-continental-scale application of LSTMs for wildfire-related hydrologic prediction. 

Three LSTM models were trained on different temporal splits of fifteen-year datasets containing wildfire events: one using pre-fire data, one using post-fire data, and one using the full dataset. Models were evaluated on unseen basins in both pre- and post-fire windows. Results show that the model trained on the full dataset consistently outperformed the others, underscoring the importance of temporally diverse training data that include disturbance events. Importantly, LSTMs demonstrated strong generalization across disturbed and undisturbed environments, highlighting their ability to learn hydrologic patterns beyond the constraints of traditional process-based modeling frameworks. 
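
A typical architecture for this task looks like the following sketch (a standard rainfall-runoff LSTM setup; dimensions and inputs are illustrative, not the study's configuration):

```python
# Daily streamflow LSTM: static basin attributes repeated along the sequence.
import torch
import torch.nn as nn

class StreamflowLSTM(nn.Module):
    def __init__(self, n_forcings=5, n_static=20, hidden=128):
        super().__init__()
        self.lstm = nn.LSTM(n_forcings + n_static, hidden, batch_first=True)
        self.head = nn.Linear(hidden, 1)

    def forward(self, forcings, static):
        # forcings: (B, T, n_forcings) daily meteorology;
        # static: (B, n_static) topography, soils, vegetation, fire metrics.
        static_seq = static.unsqueeze(1).expand(-1, forcings.size(1), -1)
        out, _ = self.lstm(torch.cat([forcings, static_seq], dim=-1))
        return self.head(out).squeeze(-1)        # (B, T) simulated discharge

model = StreamflowLSTM()
q_sim = model(torch.randn(4, 365, 5), torch.randn(4, 20))
```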

Feature importance analysis revealed that topographic variables (e.g., elevation and slope) were most influential, followed by soil/geologic and vegetation characteristics, while fire-specific indicators (e.g., burn severity) ranked surprisingly low. This suggests that the LSTMs internalized key controls on streamflow response without heavy reliance on the explicit disturbance metrics included. To further isolate the model’s learned response to wildfire, simulations were performed with synthetic unburned conditions for each disturbed basin and compared against burned scenarios. Spatial analysis by EPA Level II ecoregion revealed that in the Southeastern U.S., Ozark/Appalachian Forests, and Mediterranean California, the model identified a persistent, multi-year increase in streamflow lasting up to three years after wildfire. These regions share ecological characteristics such as high vegetation biomass, seasonal climate regimes, and terrain-driven hydrologic gradients that collectively amplify post-fire reductions in evapotranspiration and enhance runoff generation. In contrast, no significant streamflow change was detected in the Western Cordillera, South Central Prairies, or Cold Desert ecoregions, where water-limited climates and lower fuel loads result in a dual-action response of hydrologic buffering and constrained post-fire increases in water yield.

Together, these findings demonstrate that LSTMs can detect regionally coherent hydrologic responses to wildfire even in the absence of strong dependence on explicit disturbance features, highlighting the promise of AI-driven, data-centric approaches for modeling hydrologic change in an era of increasing disturbances. As wildfires and other extreme events become more frequent, integrating machine learning into hydrologic prediction frameworks offers a powerful pathway toward adaptive water resource management and improved resilience across diverse ecohydrologic settings. 

How to cite: Hogue, T., Moon, C., and Corona, C.: Quantifying Post‑Wildfire Hydrologic Response Using LSTMs: Ecoregion Patterns Across the Contiguous United States, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-15295, https://doi.org/10.5194/egusphere-egu26-15295, 2026.

EGU26-15913 | ECS | Posters on site | GI2.1

AI-assisted Remote Sensing Screening of Potential Natural Hydrogen Seepage Features in Alta Guajira, Northern Colombia 

Miguel Angel Monterroza Montes, Stephanie San Martín Cañas, Boris Lora-Ariza, and Leonardo David Donado

Natural (geological) hydrogen refers to molecular hydrogen produced in the subsurface through abiotic and biogenic pathways, which may migrate, accumulate transiently, be consumed by secondary reactions, or escape to the surface. Increasing evidence indicates that such systems could be a strategic low-carbon energy source, but their exploration is limited as regional-scale, data-driven approaches to identify mechanisms of active or fossil migration in geologically complex environments are lacking. Surface expressions such as circular and sub-circular depressions associated with soil and vegetation anomalies have been reported worldwide as indirect indicators of hydrogen migration and leakage. However, their detection remains limited to either local field reconnaissance or manual interpretation of remote-sensing data. In this research, we present an AI-assisted remote sensing framework for regional screening of potential natural hydrogen seepage patterns, aiming to enhance early-stage exploration and improve the quantitative characterization of surface indicators linked to subsurface energy systems. Deep-learning–based computer vision models are used to study high-resolution satellite imagery and automatically identify and classify circular and sub-circular geomorphological features that could correspond to hydrogen exudation. The resulting detections are integrated into a GIS framework for the extraction of morphometric and spatial statistics, providing a formal analytical benchmark to relate surface structures to lithology, structural configuration, and the regional tectonic setting.

The workflow is applied to the Alta Guajira region in northern Colombia, a geologically complex segment of the Caribbean margin characterized by accreted oceanic crust, major fault systems, and sedimentary depocenters that may favor hydrogen generation and migration. Using an AI-based approach allows the construction of a regional inventory of candidate seepage-related structures while significantly reducing false positives associated with purely morphology-based analyses. The results support the prioritization of targets for future field verification, geochemical sampling, and subsurface investigations. Beyond its implications for natural hydrogen prospectivity, the proposed methodology demonstrates how artificial intelligence can translate qualitative geological observations into quantitative, reproducible screening tools. By providing a transparent and spatially explicit representation of subsurface energy systems, AI-assisted screening also facilitates communication with stakeholders and local communities, contributing to informed public perception of emerging sustainable subsurface energy resources in data-limited regions such as Alta Guajira.

The researchers thank the SHATKI Research Project (code 110563), Contingent Recovery Contract No. 112721-042-2025, funded by the Ministry of Science, Technology and Innovation (Minciencias) and the National Hydrocarbons Agency (ANH).

How to cite: Monterroza Montes, M. A., San Martín Cañas, S., Lora-Ariza, B., and Donado, L. D.: AI-assisted Remote Sensing Screening of Potential Natural Hydrogen Seepage Features in Alta Guajira, Northern Colombia, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-15913, https://doi.org/10.5194/egusphere-egu26-15913, 2026.

EGU26-16953 | ECS | Orals | GI2.1

Local Similarity-Driven Refinement for Model-Agnostic Ground-Based Cloud Detection 

Yangfan Hu, Pinglv Yang, Zeming Zhou, Ran Bo, Shuyuan Yang, and Guangyang Zhang

Cloud cover estimation is of crucial significance in meteorological observations and short-term/long-term weather forecasting, as it directly affects the accuracy of radiation balance assessment, precipitation prediction, and climate change modeling. Ground-based automated cloud quantification observation instruments enable continuous, high-resolution cloud monitoring with spatial-temporal continuity that satellite remote sensing cannot fully achieve, highlighting the immense value of ground-based cloud image processing for practical meteorological applications. However, existing cloud detection methods predominantly rely on supervised training with ground truth masks, which overlook the rich contextual information and inherent regularization constraints embedded in original cloud images. This oversight frequently results in mismatched cloud boundaries, inadequate model interpretability, and poor adaptability to complex cloud morphologies—particularly for thin clouds and cirrus clouds characterized by weak grayscale contrast, sparse texture, and irregular shapes. Consequently, these limitations lead to suboptimal detection performance, including under-segmentation or over-segmentation, and further induce inaccuracies in quantitative cloud cover estimation.

To address the aforementioned issues and achieve accurate cloud cover detection results, this study proposes a model-agnostic refinement method designed to optimize the coarse detection masks generated by any pre-trained cloud detection model. The framework is jointly optimized by three loss functions: a local similarity descriptor, total variation (TV) regularization, and a traditional detection loss (e.g., cross-entropy). Specifically, the local similarity descriptor is defined for each pixel as the difference between two terms: its average grayscale difference to cloud-region pixels and its average grayscale difference to background pixels within a local window. This descriptor effectively enhances the discriminability between cloud and non-cloud regions at the local level. The total variation regularization term is introduced to maintain the smoothness of the detection boundary and suppress spurious noise. The cross-entropy loss ensures the overall consistency between the refined result and the ground truth.
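
A rough sketch of the combined objective follows (our loose reading of the descriptor; the box-filter approximation, window size, and weights are illustrative assumptions):

```python
# Illustrative combined objective: cross-entropy + local similarity + TV.
import torch
import torch.nn.functional as F

def local_similarity(gray, prob, win=11):
    # Box filters approximate, per pixel, the local cloud / background means;
    # the loss pulls cloud pixels toward the local cloud mean and away from
    # the local background mean inside the window.
    box = lambda x: F.avg_pool2d(x, win, stride=1, padding=win // 2)
    cloud_mean = box(gray * prob) / (box(prob) + 1e-6)
    bg_mean = box(gray * (1 - prob)) / (box(1 - prob) + 1e-6)
    return (prob * (gray - cloud_mean).abs()
            - prob * (gray - bg_mean).abs()).mean()

def total_variation(prob):
    # Keeps the refined boundary smooth and suppresses spurious noise.
    return ((prob[..., 1:, :] - prob[..., :-1, :]).abs().mean()
            + (prob[..., :, 1:] - prob[..., :, :-1]).abs().mean())

def refine_loss(gray, prob, target, w_sim=1.0, w_tv=0.1):
    return (F.binary_cross_entropy(prob, target)
            + w_sim * local_similarity(gray, prob)
            + w_tv * total_variation(prob))

gray = torch.rand(1, 1, 64, 64)                       # grayscale sky image
logits = torch.randn(1, 1, 64, 64, requires_grad=True)
loss = refine_loss(gray, torch.sigmoid(logits), (gray > 0.5).float())
loss.backward()
```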

Minimizing the combined loss function drives the coarse detection result to evolve adaptively along the actual cloud boundary, thereby achieving more precise alignment with the true cloud contours. Notably, the proposed framework improves the detection of thin clouds and cirrus clouds, effectively mitigating missed detections in these tenuous cloud structures. Furthermore, the integrated loss function enhances model interpretability: the local similarity descriptor explicitly quantifies the differences within the local window, and minimizing this term inherently refines the detection by strengthening the distinction between cloud and background regions. Ultimately, the refined detection results substantially improve the accuracy of cloud cover estimation, laying a solid foundation for reliable meteorological observations and weather forecasting applications.

How to cite: Hu, Y., Yang, P., Zhou, Z., Bo, R., Yang, S., and Zhang, G.: Local Similarity-Driven Refinement for Model-Agnostic Ground-Based Cloud Detection, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-16953, https://doi.org/10.5194/egusphere-egu26-16953, 2026.

Earth Observation (EO) is an essential source of information for most geosciences. However, high costs, large data volumes, and difficult access constrained its use for decades. Open data programs like Copernicus have reduced costs, and cloud access via the Copernicus Data Space Ecosystem (CDSE) has made local processing largely obsolete. In fact, API (Application Programming Interface)-based cloud access, analysis-ready mosaics and calibrated Copernicus Land Monitoring Service data products have made Sentinel data AI-ready. But despite these advances, the requirement for complex programming skills remained a significant barrier until recently. Here, we demonstrate how cloud-native processing APIs and generative artificial intelligence (AI) are removing this obstacle by enabling the "vibe coding" paradigm shift. Vibe coding is an approach to software development where the researcher focuses on the high-level logic, the functional vision, and the end product, while the syntax and code are generated and refined by AI.
The Copernicus Data Space Ecosystem facilitates this transition through three key features: (1) the abstraction of EO analysis pipelines via RESTful APIs, which reduces tasks to a series of mathematical operations on pixel values; (2) the availability of intuitive web browser visualization for rapid prototyping and debugging; and (3) an extensive body of open documentation and code examples that serve as a robust training foundation for generative AI.
On CDSE, the Sentinel Hub API family utilizes "custom scripts" (or "evalscripts") — modular JavaScript files defining data inputs, outputs, calculations, and visualizations. The openEO API uses "process graphs", JSON representations of the processing steps in a unified structure as a series of nodes. Because the backend manages big data optimization and the browser handles rendering, these scripts are concise enough for AI assistants to generate, adapt, and debug effectively. The Sentinel Hub Custom Script Repository, containing over 200 community-contributed scripts, and the openEO community examples repository and CDSE "Algorithm Plaza" have laid the foundation for this approach. Neither of these advances was intentionally created to support AI, but rather to simplify programming for humans; however, combined, they enable a breakthrough in code development. We demonstrate how AI tools can efficiently adapt scripts across different satellite sensors, combine spectral indices into decision trees, and produce scalable quantitative outputs. This allows researchers not specialized in remote sensing to utilize existing code modules and natural language prompts to create meaningful results for their specific fields. Beyond the capabilities of Sentinel Hub, OpenEO supports joint analysis of data from multiple back-ends and the application of user-defined external code, such as biophysical models or pre-trained ONNX deep learning networks. While this added complexity presents a higher technical threshold, it also creates a massive opportunity for AI-driven automation. Ultimately, in combination with the public data space approach, generative AI further democratizes Earth Observation, transforming it from a specialist-only domain into an integrated component of all geoscience research workflows.
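
As an example of how concise such cloud-native requests are (and hence how amenable to AI generation), the following sketch uses the public openEO Python client against CDSE; the area, dates, and output are arbitrary:

```python
# NDVI summer maximum over an arbitrary area, via the openEO Python client.
import openeo

connection = openeo.connect("openeo.dataspace.copernicus.eu").authenticate_oidc()

cube = connection.load_collection(
    "SENTINEL2_L2A",
    spatial_extent={"west": 16.1, "south": 47.2, "east": 16.6, "north": 47.7},
    temporal_extent=["2024-06-01", "2024-08-31"],
    bands=["B04", "B08"],
)
red, nir = cube.band("B04"), cube.band("B08")
ndvi = (nir - red) / (nir + red)

ndvi.max_time().download("ndvi_summer_max.tiff")   # executed on the backend
```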

How to cite: Zlinszky, A.: From natural language to quantitative satellite imagery analysis: Copernicus Data Space Ecosystem and AI enable vibe coding of custom scripts, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-18394, https://doi.org/10.5194/egusphere-egu26-18394, 2026.

EGU26-18939 | ECS | Posters on site | GI2.1

Evaluating Fractional Vegetation Cover using Multimodal Large Language Models: A Comparative study with Human Observations 

Omar A. Lopez Camargo, Mariana Elias Lara, Marcel El Hajj, Hua Cheng, Dario Scilla, Victor Angulo, Areej Al wahas, Kasper Johansen, and Matthew F. McCabe

Fractional Vegetation Cover (FVC) is a key ecological variable for monitoring ecosystem health, land degradation, and vegetation dynamics in dryland environments. While satellite and UAV observations enable scalable FVC estimation over large spatial extents, the accuracy and robustness of these models remain strongly dependent on high-quality field-based reference data for calibration and validation. Traditional in-situ methods, including visual estimates using transect-based surveys, remain widely used but are labor-intensive and inherently subjective. Digital photography has emerged as a practical alternative, typically analyzed using index-based computer vision techniques or deep learning models. However, these methods are highly sensitive to background variability and therefore rely on massive labeled datasets. Recent advances in multimodal large language models (MLLMs) suggest a potential paradigm shift, as these models combine visual perception with high-level reasoning and benefit from diverse pre-training that enables conceptual knowledge transfer across tasks.

In this study, we evaluate the feasibility of using MLLMs for direct estimation of FVC from ground-level photographs without task-specific training. We collected and compiled a dataset of more than 1,100 quadrat pictures from across 26 dryland sites in Saudi Arabia, spanning a wide range of surface conditions from bare soil to sparsely vegetated rangelands. Each picture corresponded to a 1 m × 1 m quadrat with FVC estimated independently by two experts, whose average was used as reference data for assessment of model predictions.

Six state-of-the-art multimodal large language models, including Qwen2.5-VL, Mistral-Small-3.2, LLaMA-4-Maverick, LLaMA-4-Scout, and two Gemma-3 variants, were evaluated using four prompt designs that varied in length, ecological context, and methodological detail. Across all models and prompts, MLLMs achieved a mean absolute error of approximately 7.8%, demonstrating competitive performance relative to traditional image-based methods. The best-performing model-prompt combinations achieved mean absolute error values below 5%, with low systematic bias. Short and ecologically explicit prompts consistently outperformed more complex prompt designs, achieving an average reduction in mean absolute error (MAE) of approximately 1.3–1.4 percentage points compared to visually guided or highly structured prompts (MAE ≈ 6.9% versus 8.2–8.4%). Overall performance was more sensitive to model choice than to prompt structure, with mean MAE varying from approximately 5.6% to 10.0% across models, compared to a narrower range across prompts. The highest accuracy was obtained using the Qwen2.5-VL model with an ecologically detailed prompt, which achieved a mean absolute error of 4.9%, near-zero bias, and an RMSE of 8.4%. Across all prompt designs, Qwen2.5-VL and Mistral-Small-3.2 consistently delivered the best overall performance, both maintaining mean MAE values below 6% and exhibiting stable behavior across prompt variations, indicating robustness to prompt design.

These results demonstrate that MLLMs can provide accurate and scalable FVC estimates directly from field photographs, without requiring specialized training datasets. This approach offers a promising alternative for rapid field surveys and reference data generation, particularly in dryland ecosystems where background complexity and data scarcity limit the effectiveness of conventional methods.
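
A hedged sketch of such an evaluation call is shown below, assuming the model is served through an OpenAI-compatible endpoint (e.g., vLLM); the endpoint, file name, and prompt wording are our illustrations, not the authors' materials:

```python
# Query a vision-language model for FVC via an OpenAI-compatible endpoint.
import base64
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")  # e.g. vLLM

PROMPT = (  # a short, ecologically explicit prompt (illustrative wording)
    "This photo shows a 1 m x 1 m quadrat in a dryland rangeland. Estimate the "
    "fractional vegetation cover: the percentage of the quadrat covered by live "
    "or senescent plant material. Answer with a single number between 0 and 100."
)

with open("quadrat.jpg", "rb") as f:  # hypothetical field photograph
    img_b64 = base64.b64encode(f.read()).decode()

resp = client.chat.completions.create(
    model="Qwen/Qwen2.5-VL-7B-Instruct",
    messages=[{"role": "user", "content": [
        {"type": "text", "text": PROMPT},
        {"type": "image_url",
         "image_url": {"url": f"data:image/jpeg;base64,{img_b64}"}},
    ]}],
)
fvc = float(resp.choices[0].message.content.strip().rstrip("%"))  # assumes compliance
```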

How to cite: Lopez Camargo, O. A., Elias Lara, M., El Hajj, M., Cheng, H., Scilla, D., Angulo, V., Al wahas, A., Johansen, K., and McCabe, M. F.: Evaluating Fractional Vegetation Cover using Multimodal Large Language Models: A Comparative study with Human Observations, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-18939, https://doi.org/10.5194/egusphere-egu26-18939, 2026.

Earth system is characterized by intricate interactions between human activities and natural processes, where stochastic dynamics, nonlinear feedbacks, and emergent behaviors collectively determine system evolution and sustainability outcomes. Despite significant advances in Earth system science, two fundamental challenges persist: the insufficient integration of physical process models with observational data, and the lack of interpretable frameworks for simulating coupled human-Earth dynamics and optimizing governance strategies. These limitations critically impede our ability to conduct effective Earth system governance and guide human-environment interactions toward sustainable development pathways. To overcome these challenges, this study proposes an innovative framework that synergistically integrates data assimilation and reinforcement learning to enhance both predictability and decision-making capabilities in the complex Earth system. Data assimilation, as a well-established methodology in Earth system science, systematically combines dynamic models with multi-source observations to improve system observability and forecast accuracy. Reinforcement learning, grounded in the Bellman equation and Markov decision processes, provides a natural paradigm for modeling adaptive human-environment interactions and deriving optimal strategies through sequential decision-making under uncertainty. Building upon these complementary methodologies, we develop a Multi-Agent Deep Reinforcement Learning (MADRL) framework that employs the Markov decision process as the theoretical foundation, integrates agent-based modeling to represent heterogeneous stakeholder behaviors across multiple organizational levels, utilizes deep neural networks to handle high-dimensional state-action spaces, and incorporates data assimilation techniques to continuously update system states and reduce forecast uncertainties. This integrated framework is specifically designed to address fundamental Earth system governance challenges by capturing emergent phenomena arising from complex human-environment interactions, enabling the exploration of intervention mechanisms such as economic incentives, regulatory policies, and cooperative arrangements, and providing interpretable decision pathways that balance economic development with environmental sustainability. Through this integration, our framework offers a systematic approach to tackle classical problems in Earth system governance, from the tragedy of the commons to planetary boundaries, ultimately advancing our capacity to navigate toward sustainable development trajectories in an increasingly coupled human-Earth system.

How to cite: Yuan, S. and Li, X.: Generalizing human-Earth systems modeling and decision-making: A multi-agent deep reinforcement learning framework, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-19409, https://doi.org/10.5194/egusphere-egu26-19409, 2026.

EGU26-19427 | Posters on site | GI2.1

Improving the seismic catalogue completeness of Tenerife (Canary Islands, Spain) through deep learning 

Manuel Calderón-Delgado, Luca D’Auria, Aarón Álvarez-Hernández, Rubén García-Hernández, Víctor Ortega-Ramos, David M. van Dorth, Sergio de Armas-Rillo, Pablo López-Díaz, and Nemesio M. Pérez

The volcanic island of Tenerife (Canary Islands, Spain) is characterized by low-magnitude background seismicity associated with local hydrothermal and volcano-tectonic processes. The island has been experiencing, since 2016, a slight increase in seismic activity, with earthquakes generally having magnitudes below 2. For this reason, we are revising the seismic catalogue using deep learning tools to improve its completeness.

Over the last decade, machine learning methods—particularly deep learning approaches—have gained traction across multiple disciplines due to their increased computational efficiency, high accuracy, and reduced need for manual supervision. One such method, PhaseNet [1], is a deep convolutional neural network based on the U-Net architecture [2] that has shown strong performance in waveform-based seismic phase detection. Its ability to process large volumes of seismic data and automatically identify relevant signal features represents a significant opportunity to enhance the quality and completeness of seismic catalogs. Nevertheless, applying a neural network to data with a different nature from that used for its training phase can lead to a substantial decrease in performance. In particular, PhaseNet was primarily trained on tectonic seismicity, whereas seismic events in Tenerife are predominantly volcanic-hydrothermal. Consequently, retraining the network on waveforms representative of the target seismicity is essential to ensure a reliable inference.

Using PhaseNet as a baseline, we conducted an extensive comparative analysis of several training configurations to adapt the original network to the seismic data from the Canary Islands (Tenerife). Our study focused on four key aspects: model initialization, learning rate selection, data clustering strategies, and model partitioning. The model initialization strategies include fine-tuning from pre-trained weights and training from randomly initialized weights. Regarding model partitioning, we evaluated a global model (a single model trained on all data), local models (one model per station), and cluster-based models (trained on groups of stations with similar characteristics). The performance of each configuration was evaluated on an independent dataset using multiple metrics to provide a comprehensive assessment. Specifically, we analyzed precision, recall, and ROC curves to identify suitable trade-offs between detection sensitivity and specificity.
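
The two initialization strategies can be set up, for example, with the SeisBench library (a common way to obtain and retrain PhaseNet; the authors' exact tooling and learning rates are not stated in the abstract):

```python
# Two initialization strategies for retraining PhaseNet with SeisBench.
import torch
import seisbench.models as sbm

model_ft = sbm.PhaseNet.from_pretrained("original")   # fine-tune published weights
model_rand = sbm.PhaseNet()                           # train from random weights

# A lower learning rate is typical when fine-tuning than when training anew.
opt_ft = torch.optim.Adam(model_ft.parameters(), lr=1e-4)
opt_rand = torch.optim.Adam(model_rand.parameters(), lr=1e-3)
```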

These preliminary results will be beneficial for subsequent analysis aimed at a better characterization of the island's microseismicity and its relationship with the activity of its volcanic-hydrothermal system.

References:

  • [1] W. Zhu and G. C. Beroza, “PhaseNet: a Deep-Neural-Network-Based seismic arrival time picking method,” Geophysical Journal International, Oct. 2018, doi: 10.1093/gji/ggy423.
  • [2] O. Ronneberger, P. Fischer, and T. Brox, “U-Net: Convolutional Networks for Biomedical Image Segmentation,” in Lecture Notes in Computer Science, 2015, pp. 234–241. doi: 10.1007/978-3-319-24574-4_28.

 

How to cite: Calderón-Delgado, M., D’Auria, L., Álvarez-Hernández, A., García-Hernández, R., Ortega-Ramos, V., M. van Dorth, D., de Armas-Rillo, S., López-Díaz, P., and M. Pérez, N.: Improving the seismic catalogue completeness of Tenerife (Canary Islands, Spain) through deep learning, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-19427, https://doi.org/10.5194/egusphere-egu26-19427, 2026.

EGU26-19439 | ECS | Posters on site | GI2.1

Onboard Hybrid Orbit Prediction with Lightweight Machine-Learning Error Correction 

Benedikt Aigner, Fabian Dallinger, Thomas Andert, and Benjamin Haser

Autonomous spacecraft operations are increasingly important as missions grow more complex, ground contact opportunities remain limited, and the number of LEO satellites continues to rise. Reliable onboard orbit determination (OD) and orbit prediction (OP) are essential for mission planning, resource allocation, and communication scheduling. Operational OD/OP typically relies on physics-based models that estimate parameters (initial state, drag coefficient, etc.) from tracking data. However, environmental modeling is not perfect, and uncertainties in atmospheric density can cause prediction errors to grow rapidly. This limits OP reliability.

We present an onboard-oriented hybrid OD/OP concept that augments a classical physics-based OD/OP chain with a lightweight machine-learning (ML) correction module to compensate for systematic OP errors in real time. While data-driven correction of propagator errors has been explored previously, this work emphasizes the tight integration of a compact correction model into an operational workflow under onboard constraints. The implementation is based on the Python OD/OP toolbox Artificial Intelligence for Precise Orbit Determination (AI4POD) and targets deployment within the Autonomous Space Operations Planner and Scheduler (ASOPS) experiment, which is planned for validation on the ATHENE-1 satellite.

The approach is demonstrated using simulated GPS-like tracking data generated with a high-fidelity reference model, while OD/OP are performed with a reduced-complexity model representative of onboard settings. A compact artificial neural network (ANN) is trained to predict OP errors in the RSW frame from available onboard data, reducing the maximum three-day along-track error from ~5 km to ~1.2 km.
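
A lightweight correction model of this kind could look like the following sketch (features, sizes, and data are illustrative placeholders, not the AI4POD implementation):

```python
# Regress RSW-frame OP position errors on onboard-available quantities.
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(1)
X = rng.normal(size=(500, 6))   # e.g., horizon, drag estimate, OD residual stats
y = rng.normal(size=(500, 3))   # radial / along-track / cross-track errors [km]

ann = MLPRegressor(hidden_layer_sizes=(32, 32), max_iter=2000).fit(X, y)

predicted_error = ann.predict(X[:1])[0]   # correction applied to the OP state
```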

To assess operational robustness, we complement the baseline results with a statistical consistency check of the residuals across all prediction cases and outline planned tests with additional ML/DL correction models.

How to cite: Aigner, B., Dallinger, F., Andert, T., and Haser, B.: Onboard Hybrid Orbit Prediction with Lightweight Machine-Learning Error Correction, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-19439, https://doi.org/10.5194/egusphere-egu26-19439, 2026.

EGU26-19927 | ECS | Posters on site | GI2.1

Single- vs. Multilayer Physics-Informed Extreme Learning Machines for Orbit Determination 

Fabian Dallinger, Benedikt Aigner, Thomas Andert, and Benjamin Haser

Orbit Determination (OD) is commonly addressed with classical estimators such as Weighted Least Squares, which are statistically well founded but can be sensitive to poor initialization and may degrade when the initial state is weakly known. Physics-Informed Machine Learning offers an alternative by embedding orbital dynamics directly into the estimation process. In this work, Physics-Informed Extreme Learning Machines (PIELMs) are investigated as fast OD models that do not require a high-quality initial guess, since the output layer is obtained from a physics-based training objective that enforces consistency with both measurements and dynamics.

While single-layer PIELMs can achieve high accuracy, they may exhibit reduced stability in regimes with limited measurement support. To improve representational capacity and generalization, the Deep PIELM augments the model with an autoencoder-based feature hierarchy that is pretrained efficiently via the Moore–Penrose pseudoinverse, followed by physics-informed nonlinear least-squares optimization of the final layer.
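
A minimal sketch of the extreme-learning-machine building block (random hidden layer, output weights obtained in closed form via the Moore–Penrose pseudoinverse); the physics-informed objective and the autoencoder pretraining of the actual (Deep) PIELM are omitted here:

    import numpy as np

    rng = np.random.default_rng(0)

    def elm_fit(X, Y, n_hidden=200):
        """Single-layer ELM: random input weights, closed-form output weights."""
        W = rng.normal(size=(X.shape[1], n_hidden))
        b = rng.normal(size=n_hidden)
        H = np.tanh(X @ W + b)        # random nonlinear features
        beta = np.linalg.pinv(H) @ Y  # Moore-Penrose pseudoinverse solve
        return W, b, beta

    def elm_predict(X, W, b, beta):
        return np.tanh(X @ W + b) @ beta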

Comparative results highlight the trade-offs among classical least squares, single-layer PIELM, and Deep PIELM in terms of OD accuracy, robustness under poor initialization, and computational efficiency under sparse optical and range measurements from a limited set of ground stations. For suitable hyperparameter configurations, the multilayer architecture provides improved stability and accuracy over the single-layer variant while retaining low training times, positioning Deep PIELMs as an effective complement to classical least-squares OD when robust performance without reliable initial guesses is required. The presented work is part of the Artificial Intelligence for Precise Orbit Determination project.

How to cite: Dallinger, F., Aigner, B., Andert, T., and Haser, B.: Single- vs. Multilayer Physics-Informed Extreme Learning Machines for Orbit Determination, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-19927, https://doi.org/10.5194/egusphere-egu26-19927, 2026.

EGU26-20311 | ECS | Orals | GI2.1

Performance Comparison of Some Artificial Intelligence Algorithms for Metallic Mineral Deposits: A Case from Türkiye 

Gizem Karakas, Bahunur Civci, Birgul Topal, Candan Gokceoglu, Ahmet Ozcan, Cagri Erbasli, F. Sumeyye Cebeloglu, Murat Koruyucu, and Banu Ebru Binal

Recent advances in artificial intelligence and geospatial data analytics have led to an increasing adoption of data-driven approaches in the identification and prediction of mineral deposits. Traditional mineral exploration methods often rely on single data sources or expert-driven interpretations and may therefore be inadequate in regions where geological information is limited or spatially complex. In contrast, artificial intelligence–based approaches enable the quantitative assessment of mineral potential and the identification of spatial patterns associated with mineralization by jointly integrating multi-source geological, geophysical, and remote sensing data. Therefore, the comparative evaluation of different artificial intelligence algorithms using approaches that account for spatial dependence is critical for selecting reliable and interpretable models in early-stage mineral exploration conducted under data-limited conditions.

This study focuses on a comparative evaluation of artificial intelligence algorithms for predicting potential iron (Fe) mineralization under limited geological data conditions in a region with metallic mineralization potential in Türkiye. The study area covers approximately 2,340 km². A total of seven predictor variables were incorporated into the modeling, classified into geological (lithology, geological age, formation type), structural (fault density), geophysical (magnetic anomaly and gravity-tilt features), and remote sensing–based datasets (iron oxide potential zones derived from ASTER imagery). The mineralization inventory is highly sparse, comprising only 15 iron occurrences and 24 non-iron reference points selected by geologists. To address this limitation, a spatially aware hard negative mining strategy was applied, in which negative samples were preferentially selected from areas spatially proximal to known mineralization occurrences. Model performance was evaluated using GroupKFold-based spatial cross-validation to minimize bias arising from spatial autocorrelation, within which the Random Forest (RF) and XGBoost (XGB) algorithms were compared. The results show that the RF and XGB models achieved mean Area Under Curve (AUC) values of 0.85 and 0.89, respectively. According to the generated mineral prospectivity maps, the Random Forest model delineates approximately 207.02 km² of high-potential areas (probability ≥ 0.90), while the XGBoost model identifies high-potential areas covering approximately 404.04 km² at the same probability threshold. These results indicate pronounced differences in the spatial distribution of high-potential areas depending on the algorithm used. Additionally, the feature importance analysis revealed that geological age, magnetic anomaly, formation type, and gravity-tilt features are the primary controlling factors influencing the spatial distribution of iron mineralization.
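
The spatially aware validation scheme can be sketched with scikit-learn's grouped splitter (the feature matrix, labels, and spatial group IDs below are illustrative placeholders, not the study's data):

    import numpy as np
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.metrics import roc_auc_score
    from sklearn.model_selection import GroupKFold

    # X: predictor matrix; y: 1 = iron occurrence, 0 = non-iron reference;
    # groups: spatial cluster ID per sample, so nearby points share a fold.
    def spatial_cv_auc(X, y, groups, n_splits=5):
        aucs = []
        for train, test in GroupKFold(n_splits=n_splits).split(X, y, groups):
            clf = RandomForestClassifier(n_estimators=500, random_state=0)
            clf.fit(X[train], y[train])
            p = clf.predict_proba(X[test])[:, 1]
            aucs.append(roc_auc_score(y[test], p))
        return float(np.mean(aucs))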

The outcomes of this study reveal the importance of algorithm selection and spatially aware validation strategies in artificial intelligence–based mineral exploration. The findings indicate that reliable mineral prospectivity assessments can be achieved even under limited geological data conditions. Furthermore, in early-stage exploration programs, these approaches strengthen effective target-area prioritization and decision-support processes and contribute to cost reduction through more efficient planning of exploration activities.

How to cite: Karakas, G., Civci, B., Topal, B., Gokceoglu, C., Ozcan, A., Erbasli, C., Cebeloglu, F. S., Koruyucu, M., and Binal, B. E.: Performance Comparison of Some Artificial Intelligence Algorithms for Metallic Mineral Deposits: A Case from Türkiye, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-20311, https://doi.org/10.5194/egusphere-egu26-20311, 2026.

EGU26-20838 | ECS | Orals | GI2.1

AI-Based Quantification of Crack Geometry on Retaining Walls from Mobile Earth-Observation Imagery 

Yen-Chun Chiang, Shao-Chin Chu, and Guan-Wei Lin

Cracks on retaining walls and road surfaces can reveal the early warning signs of geohazards such as landslides or slumps in rural areas. However, even today, many governments still rely on manual visual inspection to identify and evaluate cracks, which is time-consuming, subjective, and highly dependent on individual experience. Artificial intelligence (AI) applied to Earth-observation imagery not only enables the detection of potentially dangerous cracks but also makes it possible to quantify their geometric properties, providing a more objective and quantitative basis for infrastructure monitoring and geohazard risk management.

Nevertheless, several key challenges remain. First, although recent studies have developed many advanced algorithms for crack detection and segmentation, methods for measuring crack width, length, and area are still insufficient. Second, most existing models are designed for road cracks, while cracks on retaining walls present more complex textures, illumination conditions, and background noise, requiring dedicated model fine-tuning. Third, in regions with dense vegetation, branches, leaves, and shadows often produce false detections, making it difficult for AI models to distinguish real cracks from environmental interference.

In this study, we aim to quantify crack geometry from mobile panoramic Earth-observation imagery and to develop an AI model optimized for cracks on retaining walls in complex environments. A multi-stage approach is used to combine YOLO-based crack detection with 3D geospatial information for estimating the length, width, and area of individual cracks. By focusing on real cracks under vegetation-rich and noisy conditions, this approach advances AI-based quantitative analysis of surface degradation. These crack metrics provide a foundation for future retaining wall stability assessment and risk-informed infrastructure management.
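
A minimal sketch of the detection-plus-quantification idea, assuming an off-the-shelf ultralytics YOLO model and a known, constant image scale (both are simplifying assumptions; the study's fine-tuned model and 3D geospatial processing are more involved):

    from ultralytics import YOLO

    model = YOLO("yolov8n.pt")          # placeholder weights; a crack-tuned model in practice
    results = model("wall_image.jpg")   # hypothetical input image

    SCALE_M_PER_PX = 0.002              # assumed ground sample distance (2 mm/pixel)
    for box in results[0].boxes.xyxy:   # pixel-space bounding boxes
        x0, y0, x1, y1 = box.tolist()
        length_m = max(x1 - x0, y1 - y0) * SCALE_M_PER_PX
        width_m = min(x1 - x0, y1 - y0) * SCALE_M_PER_PX
        print(f"crack ~{length_m:.3f} m long, ~{width_m:.3f} m wide")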

How to cite: Chiang, Y.-C., Chu, S.-C., and Lin, G.-W.: AI-Based Quantification of Crack Geometry on Retaining Walls from Mobile Earth-Observation Imagery, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-20838, https://doi.org/10.5194/egusphere-egu26-20838, 2026.

EGU26-22777 | Posters on site | GI2.1

AETHER: AI Enhancement for Third-gen Earth observing ImageR. Reaching 3x spatial upsampling and 10x temporal upsampling from existing MTG-I products. 

Nicolas Dublé, Sylvain Tanguy, Lucas Arsene, Vincent Poulain, Danaele Puechmaille, Oriol Hinojo Comellas, and Miruna Stoicescu

The Meteosat Third Generation (MTG) mission represents a major step forward in geostationary meteorological observation by combining, onboard Meteosat-12, multiple instruments with highly complementary characteristics. Among them, the Flexible Combined Imager (FCI) provides multispectral images of the full Earth disk every ten minutes with a spatial resolution reaching 1 km at nadir, while the Lightning Imager (LI) observes the same scene at a much higher temporal sampling, but with a coarser spatial resolution of approximately 4.5 km at nadir. Although designed for distinct operational purposes, these two sensors offer a unique opportunity for joint exploitation, as they observe identical atmospheric phenomena under fundamentally different spatio-temporal trade-offs. In this context, Thales investigates the use of artificial intelligence techniques to leverage this complementarity and generate enhanced observation products from existing MTG-I data. 

The core hypothesis of this work is that the high temporal density of LI observations implicitly encodes fine-scale spatial information. In other words, temporal correlations within LI time series can partially compensate for the sensor’s lower spatial resolution. By exploiting these correlations, fine spatial features can be reconstructed from high temporal frequencies. The availability of matching high-resolution reference data makes it possible to pursue this approach without resorting to artificially degraded training data. 

To implement this hypothesis, a hybrid deep learning architecture combining convolutional neural networks (CNNs) and Transformers is proposed. CNN components are used to efficiently extract local spatial structures, such as gradients, cloud edges, and internal texture patterns, while Transformer-based attention mechanisms model short- and long-range temporal dependencies across successive LI acquisitions. This combination enables a joint representation of spatial detail and temporal coherence, while remaining compatible with large data volumes and near-operational processing constraints. 
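
A toy PyTorch sketch of such a hybrid layout, with convolutions extracting local spatial structure and attention acting along the time axis (sizes and wiring are illustrative assumptions, not the AETHER architecture):

    import torch
    import torch.nn as nn

    class CNNTransformerSR(nn.Module):
        """Toy spatio-temporal model: CNN for space, Transformer over time."""
        def __init__(self, c=32):
            super().__init__()
            self.encoder = nn.Sequential(
                nn.Conv2d(1, c, 3, padding=1), nn.ReLU(),
                nn.Conv2d(c, c, 3, padding=1), nn.ReLU(),
            )
            layer = nn.TransformerEncoderLayer(d_model=c, nhead=4, batch_first=True)
            self.temporal = nn.TransformerEncoder(layer, num_layers=2)
            self.decoder = nn.Conv2d(c, 1, 3, padding=1)

        def forward(self, x):                      # x: (B, T, 1, H, W)
            B, T, _, H, W = x.shape
            f = self.encoder(x.flatten(0, 1))      # (B*T, c, H, W)
            f = f.view(B, T, -1, H, W).permute(0, 3, 4, 1, 2)  # (B, H, W, T, c)
            f = self.temporal(f.reshape(B * H * W, T, -1))     # attention over time
            f = f.view(B, H, W, T, -1)[:, :, :, -1].permute(0, 3, 1, 2)
            return self.decoder(f)                 # enhanced latest frame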

The proposed approach is evaluated along two complementary scientific tasks. The first focuses on spatial super-resolution of LI images using LI temporal sequences alone. The second addresses the fusion of FCI and LI data to generate a product combining high spatial resolution with high temporal frequency. In both cases, the results are conclusive. The use of FCI images as a cross-reference makes it possible to assess the physical consistency of reconstructed features and to prevent the introduction of spurious, non-physical details. The super-resolved products remain radiometrically consistent with the input observations, with low radiance discrepancies (RMSE below 1), while recovering finer spatial structures than those achievable through conventional interpolation methods. Compared with standard single-image super-resolution (SISR), CNN + temporal Conv1D, and CNN + sparse Conv3D approaches, the hybrid CNN–Transformer model achieves the best overall performance. 

As a perspective, the proposed method shows strong potential for operational deployment. Its computational efficiency allows approximately one hour of MTG data—corresponding to about sixty full-disk Earth images—to be processed in less than five minutes on standard computing infrastructure with a single NVIDIA H100 GPU, paving the way for the routine generation of high-resolution, high-frequency products from existing geostationary missions. 

How to cite: Dublé, N., Tanguy, S., Arsene, L., Poulain, V., Puechmaille, D., Hinojo Comellas, O., and Stoicescu, M.: AETHER: AI Enhancement for Third-gen Earth observing ImageR. Reaching 3x spatial upsampling and 10x temporal upsampling from existing MTG-I products., EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-22777, https://doi.org/10.5194/egusphere-egu26-22777, 2026.

EGU26-2564 | Orals | ESSI1.18

Physics-Aware Hybrid Deep Visual-Inertial Odometry Based on Graph Attention Networks for GNSS-denied Environment 

Yubing Jiao, Shijie Liu, Changjiang Xiao, Wei Ouyang, and Xiaohua Tong

In GNSS-denied deep space exploration missions, high-precision state estimation and navigation positioning are critical to ensuring the successful completion of complex mission objectives. However, the environmental characteristics of extraterrestrial surfaces, such as drastic illumination changes, monotonous textures, and sparse features, often lead to the failure of traditional visual navigation systems. Meanwhile, inertial measurement units (IMUs), despite their high-frequency output and resistance to interference, suffer from integration error accumulation caused by biases and noise. Although Visual-Inertial Odometry (VIO) achieves complementary advantages through multi-source fusion, existing end-to-end deep learning methods often lack explicit physical modeling. This deficiency leads to a sharp degradation in generalization performance and susceptibility to drift in extreme environments, and thus fails to meet the stringent standards required for aerospace-grade missions.

To address the extreme environments of extraterrestrial bodies and the limitations of existing methods regarding physical consistency and generalization, we propose a Physics-Aware Hybrid Deep Visual-Inertial Odometry (PDVIO) navigation method suitable for extraterrestrial bodies. The framework deeply couples physics-driven kinematic priors with data-driven deep representation capabilities to construct a navigation system that is both robust and highly precise. This study comprises three core contributions. First, to address the integration drift caused by IMU noise, we design an analytical physical pre-integration module based on Lie group theory. Unlike traditional networks that directly regress pose parameters, this module explicitly constructs the IMU motion differential equations on the SE(3) manifold, embedding hard rigid-body dynamic constraints directly into the network structure and thereby substantially reducing the risk of model divergence in extreme environments. Second, to cope with visual perception degradation caused by highly dynamic illumination changes and sparse textures, we introduce a FlowNet-enhanced multi-scale feature encoder. By extracting hierarchical spatio-temporal optical flow features via a pyramid structure, the system can effectively capture ego-motion states based on optical-flow-field consistency even in regions with extreme textures, significantly enhancing the stability of front-end tracking. Finally, to overcome the reliance of traditional methods on fixed noise covariances, we propose a differentiable factor-graph back-end framework based on Graph Attention Networks (GAT). An attention mechanism dynamically learns the confidence weights of the visual and inertial modalities according to the real-time dynamic environment, achieving adaptive end-to-end joint optimization from feature extraction to state estimation and greatly improving the system's adaptability and navigation accuracy in complex deep-space environments.
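
As a minimal illustration of the pre-integration idea, the sketch below accumulates gyroscope samples on SO(3) via the exponential map (SciPy; the PDVIO module additionally handles translation, sensor biases, and differentiability on the full SE(3) manifold):

    import numpy as np
    from scipy.spatial.transform import Rotation

    def preintegrate_gyro(omegas, dt):
        """Integrate body-frame angular rates on SO(3) via the exponential map.

        omegas: (N, 3) angular velocity samples [rad/s]; dt: sample interval [s].
        Returns the accumulated rotation, i.e. the attitude part of an SE(3) prior.
        """
        R = Rotation.identity()
        for w in omegas:
            R = R * Rotation.from_rotvec(w * dt)  # compose incremental rotation
        return R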

Experiments conducted on simulation datasets and real-world ground data demonstrate that, while maintaining the efficiency of deep learning feature extraction, this method significantly enhances the robustness and generalization capability of the navigation system. In particular, the trajectory estimation error is markedly reduced compared to traditional end-to-end models, effectively mitigating long-term integration drift. This study therefore not only validates the effectiveness of embedding physical priors into deep learning frameworks, addressing the insufficient robustness and limited autonomy of purely data-driven methods in aerospace scenarios, but also provides a highly reliable and high-precision navigation solution for future planetary exploration missions requiring precise pinpointing and navigation.

How to cite: Jiao, Y., Liu, S., Xiao, C., Ouyang, W., and Tong, X.: Physics-Aware Hybrid Deep Visual-Inertial Odometry Based on Graph Attention Networks for GNSS-denied Environment, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-2564, https://doi.org/10.5194/egusphere-egu26-2564, 2026.

EGU26-2757 | ECS | Posters on site | ESSI1.18

AI-driven analysis of dangerous space weather: finding dominant modes in space-based measurements 

Maria Hasler, John Coxon, and Andy Smith

A specific aspect of space weather that remains poorly understood is the exchange of information from space to the ground through the ionosphere. A central component of this process involves understanding how current systems such as field-aligned currents transfer energy and momentum between the magnetosphere and the ionosphere. However, the non-linear behaviour of these current systems poses significant challenges for identifying the drivers of ionospheric currents and understanding the inner dynamics of the ionosphere itself.

To tackle these complexities and their effects on the ground, we adopt a data-driven approach using space-based observations from the Active Magnetosphere and Planetary Electrodynamics Response Experiment (AMPERE). Specifically, we focus on gaining insights into what drives these current systems by finding underlying statistical patterns (dominant modes) in the data using unsupervised machine learning methods. We employ techniques such as β-Variational Autoencoders (β-VAEs), which have proven useful in identifying patterns in unlabelled observational data.

We extract dominant modes and connect them to physical drivers of the system with a two-step approach. First, we quantify model performance using a physically motivated goodness-of-fit metric to ensure that the learned model reconstructions capture the essential dynamics of the current system. Second, we analyse the model’s latent space, a compressed representation of the high-dimensional input data, and connect the influence of individual latent dimensions to physical drivers of the system using the OMNI dataset. This approach enables a systematic interpretation of the model’s internal representations in terms of underlying physical processes.
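
For reference, a minimal β-VAE training objective can be written as follows (PyTorch sketch; the β-weighted KL term is what pressures the encoder toward disentangled, interpretable latent modes):

    import torch
    import torch.nn.functional as F

    def beta_vae_loss(x, x_hat, mu, logvar, beta=4.0):
        """beta-VAE objective: reconstruction + beta-weighted KL to N(0, I)."""
        recon = F.mse_loss(x_hat, x, reduction="sum")
        kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
        return recon + beta * kl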

How to cite: Hasler, M., Coxon, J., and Smith, A.: AI-driven analysis of dangerous space weather: finding dominant modes in space-based measurements, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-2757, https://doi.org/10.5194/egusphere-egu26-2757, 2026.

EGU26-3944 | ESSI1.18

Connecting hybrid plasma simulations of collisionless shockwaves to in situ observations with machine learning 

I. L. Gingell

Both observations and simulations have revealed that magnetic reconnection occurs at thin current sheets within the transition region of collisionless shock waves. These ion- and electron-scale structures arise from stream instabilities and turbulence in the shock layer, contribute significantly to the repartition of energy across the shock, and propagate far into the downstream region. In a recent study [Gingell et al. 2023, Physics of Plasmas, 30, 0123902], a series of 2D hybrid particle-in-cell simulations were used to explore the shock-driven generation and decay of reconnecting structures over a broad range of parameters. Magnetic field line integration was used to quantify reconnection in each simulation, classifying each cell in the domain as having “closed” or “open” magnetic field topology. Here, we use these classifications to train a convolutional neural network (CNN) to identify regions of the simulation that are undergoing (or have undergone) magnetic reconnection. This is performed by splitting each simulation domain into a series of 1D virtual trajectories, with a view to creating a dataset equivalent to a series of in situ observations. We find that the trained CNN is able to effectively identify structures of interest in simulations with plasma and shock parameters different from those of the training data set, as well as in those with different dimensionality (i.e. 3D simulations). Further, we present a pipeline for applying this simulation-trained CNN to in situ observations of shocks by the Magnetospheric Multiscale and Solar Orbiter spacecraft, and demonstrate successful detection of reconnection sites embedded in the shock layer. We discuss these techniques more generally as a case study for using machine learning to identify structures of interest in spacecraft data, which may contribute to on-board event selection for burst modes on spacecraft with relatively limited downlink capacity.
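
A minimal sketch of such a per-sample trajectory classifier (PyTorch; channel choice and depth are illustrative assumptions):

    import torch
    import torch.nn as nn

    class TopologyCNN(nn.Module):
        """Toy 1D CNN labelling samples along a virtual trajectory as open/closed."""
        def __init__(self, n_channels=4):        # e.g. B-field components + density
            super().__init__()
            self.net = nn.Sequential(
                nn.Conv1d(n_channels, 16, kernel_size=7, padding=3), nn.ReLU(),
                nn.Conv1d(16, 32, kernel_size=7, padding=3), nn.ReLU(),
                nn.Conv1d(32, 1, kernel_size=1),  # per-sample logit
            )

        def forward(self, x):                     # x: (batch, channels, samples)
            return self.net(x).squeeze(1)         # (batch, samples) logits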

How to cite: Gingell, I. L.: Connecting hybrid plasma simulations of collisionless shockwaves to in situ observations with machine learning, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-3944, https://doi.org/10.5194/egusphere-egu26-3944, 2026.

EGU26-4379 | ECS | Posters on site | ESSI1.18

Texture based classification of geological materials using Deep Learning - Proof of concept for Planetary Surface Analysis 

Siddhant Shrivastava, Aswathy Rema, Sanjeev Kumar, and Mohinder Pal Singh Bhatia

Recent studies in machine learning (ML) and geology have demonstrated strong potential for the automated classification of rocks and minerals. However, the performance of ML models such as Convolutional Neural Networks (CNNs) for pattern recognition of geological textures remains limited under controlled microscopic imaging conditions. This study explores the automated classification of multiple rocks and minerals, including visually similar samples, using microscopic texture information.

Initially, microscopic images of terrestrial basalt and magnetite, which are visually similar under RGB microscopy, were captured using a digital USB microscope under varying illumination and magnification settings. These materials were selected to evaluate the performance of CNN models on differences in grain size, crystallinity, and surface reflectance. A dataset comprising 2,500 images per class was created and expanded using several augmentation techniques to increase the robustness of the model. With transfer learning, multiple models were trained, among which the InceptionV3 model achieved the highest validation accuracy for the initial binary classification problem.
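
A minimal transfer-learning sketch of this setup (TensorFlow/Keras; the classification head is an illustrative assumption, not the study's exact configuration):

    import tensorflow as tf

    base = tf.keras.applications.InceptionV3(weights="imagenet", include_top=False,
                                             input_shape=(299, 299, 3))
    base.trainable = False                        # freeze pretrained features

    model = tf.keras.Sequential([
        base,
        tf.keras.layers.GlobalAveragePooling2D(),
        tf.keras.layers.Dense(64, activation="relu"),
        tf.keras.layers.Dense(1, activation="sigmoid"),  # basalt vs. magnetite
    ])
    model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])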

The trained model achieved a validation accuracy of 98.30% and a test accuracy of 95%, demonstrating strong generalization capabilities. To assess the model’s effectiveness, performance metrics such as Precision, F1-Score, Confusion Matrix and ROC curve were examined. These findings provide insight into the strengths of CNN based pattern recognition in geological applications and demonstrate how deep learning techniques can be used for automated texture based classification.

While this study does not directly utilize planetary datasets, it establishes a foundation for future applications of texture-based ML methods in autonomous rover operations for geological analysis. We aim to extend this study to multiple basaltic variants and lithological classes under conditions relevant to Martian exploration, building robust ML algorithms for geological image analysis.

How to cite: Shrivastava, S., Rema, A., Kumar, S., and Bhatia, M. P. S.: Texture based classification of geological materials using Deep Learning - Proof of concept for Planetary Surface Analysis, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-4379, https://doi.org/10.5194/egusphere-egu26-4379, 2026.

EGU26-5189 | Orals | ESSI1.18

AI-Based Coronal Hole Detection and Solar Wind Model Validation 

Kalpa Harindra Perera Henadhira Arachchige, Barbara Perri, Allan-Sacha Brun, Antoine Strugarek, Eric Buchlin, Victor Reville, and Marie Ausseresse

The properties and the spatial distribution of the large-scale structures of the Solar Corona (SC) determine the observed solar wind structure at 1 AU. Coronal Holes (CHs) are the primary source of the fast solar wind, which is the most geoeffective component of the solar wind, and they appear as large dark patches in the Extreme Ultraviolet (EUV) images from the Atmospheric Imaging Assembly (AIA) on the Solar Dynamics Observatory (SDO) and the Extreme Ultraviolet Imaging Telescope (EIT) on the Solar and Heliospheric Observatory (SoHO). These observatories provide images of the SC at different wavelengths, which enables the identification of CH morphology and other large-scale structures along a given line of sight. It is crucial to understand the CH regions and their properties for effective space weather forecasting. This work is part of the WindTRUST project, with the primary goal of improving the reliability of solar wind models for space weather forecasting. Here, we aim to develop an automatic threshold-based CH detection tool for predictions across solar cycles 23, 24, and 25. We also plan to integrate this CH detection tool into a solar wind model validation pipeline, creating a fully automated validation system that provides a quantitative assessment of predictions. We categorized the large-scale features of the SC, such as active regions, solar flares, coronal mass ejections (CMEs), and filaments, based on their spatial distribution, phase of the solar cycle, and additional properties, including the GOES solar flare class. A Sequential Neural Network (NN) model was then trained by optimizing the architecture of the hidden layers to achieve higher predictive accuracy. The resulting model estimates the threshold required for integration into the CH detection scheme, thereby enabling automated, consistent identification of CH boundaries in EUV images across solar cycles 23, 24, and 25. To interpret the performance of our NN model, we divided the predicted CH results into solar minimum and maximum cases across solar cycles 23, 24, and 25. We also provide a comparison of our CH detection results with those obtained from other detection tools. Once we identify CH contours from our model, we validate them using a diagnostic test against CH contours from the Potential Field Source Surface (PFSS) model (non-MHD) and the WindPredict (WP) model (Polytropic and Alfven Wave) (MHD). Finally, we couple the CH detection tool with the validation pipeline to develop an automation tool for solar wind predictions.
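
A minimal sketch of a sequential threshold-regression network of this kind (Keras; the descriptor vector and layer sizes are illustrative assumptions, not the trained architecture):

    import tensorflow as tf

    # Hypothetical inputs: scene descriptors (solar-cycle phase, feature counts,
    # GOES flare class, etc.) -> a single EUV intensity threshold.
    model = tf.keras.Sequential([
        tf.keras.layers.Input(shape=(6,)),
        tf.keras.layers.Dense(32, activation="relu"),
        tf.keras.layers.Dense(16, activation="relu"),
        tf.keras.layers.Dense(1),                 # predicted detection threshold
    ])
    model.compile(optimizer="adam", loss="mse")
    # model.fit(X_descriptors, y_threshold, epochs=100, validation_split=0.2)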

How to cite: Henadhira Arachchige, K. H. P., Perri, B., Brun, A.-S., Strugarek, A., Buchlin, E., Reville, V., and Ausseresse, M.: AI-Based Coronal Hole Detection and Solar Wind Model Validation, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-5189, https://doi.org/10.5194/egusphere-egu26-5189, 2026.

EGU26-6050 | ECS | Orals | ESSI1.18

Probabilistic Solar Wind Estimation for Operational Space Weather Prediction at Mars 

Abigail Azari, Kelly Hayes, and Matthew Rutala

Unlike Earth, Mars does not possess an upstream solar wind monitor. This lack of continuous solar wind observations has fundamentally limited scientific studies that investigate solar wind impacts on the Mars space environment and, with increasing relevance, operational tasks for predicting space weather at the planet. Previous estimates of the solar wind have been pursued through physics-based modeling (e.g. magnetohydrodynamic models) or empirical proxies (e.g. assuming statistical relationships with downstream observations). Proxies are often based on downstream observations from multiple orbiting spacecraft. These spacecraft pass in and out of the bow shock, providing a semiregular sampling of the pristine solar wind. The most complete, and ongoing, set of the solar wind’s magnetic field and plasma parameters is from the NASA MAVEN spacecraft. MAVEN has orbited Mars since 2014, but additional assets add resolution to this dataset, such as ESA’s MEX mission, in orbit since 2003, the CNSA’s Tianwen-1 orbiter, in orbit since 2021, and NASA’s ESCAPADE mission, scheduled for orbital insertion in 2027.

In this presentation we will summarize a prior effort to create a continuous solar wind estimation upstream from Mars. This virtual solar wind monitor, or vSWIM (see Azari, Abrahams, Sapienza, Halekas, Biersteker, Mitchell, Pérez et al., 2024, doi: 10.1029/2024JH000155) was trained and assessed on MAVEN data with Gaussian process regression. Gaussian process regression, a type of machine learning, was used to provide predictions, and uncertainties on these predictions, at various temporal resolutions. vSWIM currently enables informed solar wind estimation at Mars for most of the time since 2014. We will then discuss current progress on improving vSWIM’s capacity for multi-spacecraft integration for enhanced operational space weather prediction efforts at Mars.
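
The underlying idea can be sketched with scikit-learn's Gaussian process regressor, which returns both an estimate and its uncertainty (the data and kernel below are illustrative, not the vSWIM configuration):

    import numpy as np
    from sklearn.gaussian_process import GaussianProcessRegressor
    from sklearn.gaussian_process.kernels import RBF, WhiteKernel

    t_obs = np.array([[0.0], [1.0], [2.5], [4.0]])   # hypothetical times [days]
    v_obs = np.array([420.0, 450.0, 510.0, 480.0])   # e.g. solar wind speed [km/s]

    kernel = RBF(length_scale=1.0) + WhiteKernel(noise_level=25.0)
    gp = GaussianProcessRegressor(kernel=kernel, normalize_y=True).fit(t_obs, v_obs)

    t_query = np.linspace(0.0, 4.0, 50)[:, None]
    v_mean, v_std = gp.predict(t_query, return_std=True)  # estimate + 1-sigma band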

How to cite: Azari, A., Hayes, K., and Rutala, M.: Probabilistic Solar Wind Estimation for Operational Space Weather Prediction at Mars, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-6050, https://doi.org/10.5194/egusphere-egu26-6050, 2026.

EGU26-6301 | ECS | Posters on site | ESSI1.18

Comparing analytical and machine learning heat flux closures 

Emanuel Jeß, Simon Lautenbach, Sophia Köhne, Rainer Grauer, and Maria Elena Innocenti

In many plasmas, physical processes of relevance occur over ranges of scales covering many orders of magnitude. Thus, modelling plasmas comes with a trade-off between physical accuracy and computational cost. Fully kinetic models self-consistently describe collisionless plasmas by advancing the velocity distribution functions (VDFs) in time, either directly (Vlasov methods) or by sampling them with computational particles (PIC codes). A computationally cheaper but physically less accurate alternative is offered by multi-fluid models. Instead of the VDFs, these models evolve fluid quantities and can approximate kinetic processes of interest by choosing a suitable closure for the hierarchy of fluid moment equations, i.e., an equation for the divergence of the heat flux in the case of ten-moment fluid models. In most heliospheric plasmas, including, for example, the solar wind, the observed VDFs are non-Maxwellian, which gives rise to many different instabilities that exchange energy between particles and fields. We investigate the use of machine learning models for the discovery of heat flux closures, as an alternative to the typically employed Hammett–Perkins-like analytical closures. As a test case, we use the two-stream instability, which occurs when there is a large velocity drift between two electron populations with respect to their thermal speed, and which causes the formation of electron holes and electric field saturation in its nonlinear stage. While the linear stage of the two-stream instability is well reproduced by 10-moment models with analytical closures, reproducing the electric field evolution at saturation is a challenge for reduced models. In this work, we compare fully kinetic Vlasov simulations against two-fluid 10-moment simulations employing both analytical and ML-driven closures.
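
For reference, a Hammett–Perkins-type closure expresses the Fourier-space heat flux perturbation in terms of the temperature perturbation; in its simplest collisionless 1D form (coefficients vary across formulations, so treat this as a sketch):

    \tilde{q}_k = -\, n_0\, \chi_1\, \sqrt{2}\, v_t\, \frac{i k}{|k|}\, \tilde{T}_k,
    \qquad \chi_1 = \frac{2}{\sqrt{\pi}}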

How to cite: Jeß, E., Lautenbach, S., Köhne, S., Grauer, R., and Innocenti, M. E.: Comparing analytical and machine learning heat flux closures, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-6301, https://doi.org/10.5194/egusphere-egu26-6301, 2026.

EGU26-6702 | ESSI1.18

PARIS: Pruning Algorithm via the Representer theorem for Imbalanced Scenarios 

Enrico Camporeale

Accurate prediction of rare but high-impact events is a recurring challenge in planetary science and heliophysics, where strongly imbalanced data distributions are common (e.g. extreme space-weather conditions). Standard empirical risk minimization tends to bias machine-learning models toward frequently observed regimes, often leading to poor performance on scientifically and operationally critical tail events. Existing mitigation strategies, such as loss re-weighting or synthetic over-sampling, have shown mixed and problem-dependent success.

We present PARIS (Pruning Algorithm via the Representer theorem for Imbalanced Scenarios), a data-centric framework that addresses imbalance by optimizing the training dataset itself rather than modifying the loss function or model architecture. PARIS exploits the representer theorem for neural networks to compute a closed-form representer deletion residual, which quantifies the change in validation loss induced by removing an individual training sample—without requiring retraining. Using an efficient Cholesky rank-one downdating scheme, this enables fast, iterative pruning of uninformative or performance-degrading samples.
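
The closed-form ingredient can be illustrated in the special case of a ridge-regularized linear readout, where the effect of deleting sample i follows from its leverage without retraining (illustrative sketch; PARIS generalizes this via the representer theorem for neural networks):

    import numpy as np

    def deletion_residuals(Phi, y, lam=1e-2):
        """Leave-one-out deletion residuals for ridge regression, no retraining.

        Phi: (n, d) feature matrix (e.g. last-layer activations); y: (n,) targets.
        Deleting sample i changes its residual to e_i / (1 - h_ii), with h_ii the
        leverage from the hat matrix H = Phi (Phi^T Phi + lam I)^{-1} Phi^T.
        """
        A = Phi.T @ Phi + lam * np.eye(Phi.shape[1])
        H = Phi @ np.linalg.solve(A, Phi.T)
        e = y - H @ y                      # in-sample residuals
        return e / (1.0 - np.diag(H))      # per-sample deletion effect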

We demonstrate PARIS on a real-world space-weather regression problem (Dst prediction), where it reduces the training set by up to 75% while preserving or improving overall RMSE and outperforming loss re-weighting, synthetic over-sampling, and boosting baselines. These results highlight representer-guided dataset pruning as a computationally efficient, interpretable, and physically relevant approach for improving rare-event regression in heliophysics and related planetary science applications.

Preprint: https://www.arxiv.org/abs/2512.06950

How to cite: Camporeale, E.: PARIS: Pruning Algorithm via the Representer theorem for Imbalanced Scenarios, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-6702, https://doi.org/10.5194/egusphere-egu26-6702, 2026.

EGU26-6977 | ECS | Orals | ESSI1.18

LuNeRF: How Neural Radiance Fields Can Advance Very High Resolution Lunar Terrain Reconstruction 

Chloé Thenoz, Dawa Derksen, Jean-Christophe Malapert, and Frédéric Schmidt

Modeling the lunar terrain is a key challenge for lunar missions, with impact on mission planning, resource planning, and the establishment of sustainable human bases on the Moon. Thanks to the Lunar Reconnaissance Orbiter (LRO) and its Narrow Angle Cameras (NACs), which acquire images at a spatial resolution of 50 cm, a large collection of images is now available. Despite this, automatically generating digital elevation models (DEMs) of the Moon remains a challenge. Classic methods like multi-view stereovision or photoclinometry struggle with lunar specificities such as large shadows and permanently shadowed regions (PSRs), the absence of an atmosphere, complex lighting conditions, and the homogeneity of the lunar surface texture.

In 2020, a new self-supervised neural-network-based method called Neural Radiance Fields (NeRF) was introduced and demonstrated outstanding 3D reconstruction capabilities from multi-view images. Recent advancements adapted the methodology to the challenging field of satellite imagery of the Earth and exhibited results that compete with, or even surpass, classic methodologies. Some recent works have attempted the transfer to the Moon but either constrained their studies to simulated data or reused existing models.

In this work, we explore the potential of NeRF to learn the 3D shape of the lunar surface at very high resolution from LRO NAC data, supported by a coarse estimation of the ground given by processed data from LRO’s altimetric sensor, the Lunar Orbiter Laser Altimeter (LOLA). Our main contributions are the generation of a NeRF-ready LRO dataset of a Moon South Pole region, which we intend to share openly, and the development of a specific model coined LuNeRF. We demonstrate that, with an adapted radiance modeling, LuNeRF can recover the geometry of small craters, as well as perform novel view synthesis and relighting tasks.
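
For context, every NeRF variant ultimately renders camera rays with the standard emission-absorption compositing rule, sketched below in NumPy (LuNeRF's adapted radiance modeling is not reproduced here):

    import numpy as np

    def composite_ray(sigmas, colors, deltas):
        """Standard NeRF compositing along one ray.

        sigmas: (N,) densities; colors: (N, 3) sample colors; deltas: (N,) steps.
        alpha_i = 1 - exp(-sigma_i * delta_i); T_i = prod_{j<i} (1 - alpha_j).
        """
        alphas = 1.0 - np.exp(-sigmas * deltas)
        trans = np.cumprod(np.concatenate([[1.0], 1.0 - alphas[:-1]]))
        weights = trans * alphas                  # contribution of each sample
        return (weights[:, None] * colors).sum(axis=0)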

How to cite: Thenoz, C., Derksen, D., Malapert, J.-C., and Schmidt, F.: LuNeRF: How Neural Radiance Fields Can Advance Very High Resolution Lunar Terrain Reconstruction, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-6977, https://doi.org/10.5194/egusphere-egu26-6977, 2026.

EGU26-9144 | ECS | Posters on site | ESSI1.18

Automatic Segmentation, Inpainting, and Tracking of CMEs By A Pixel-Annotation-Free System 

Yi Yang, Zhiyang Wang, and Fang Shen

Coronal mass ejections (CMEs), one of the most significant and intense forms of solar eruptive activity, exert profound impacts on Earth and the interplanetary space environment. Consequently, prompt detection and tracking of CMEs are important for mitigating their impacts. Considering the complexity of manually annotating CME regions on coronagraph images and the presence of anomalous data, we have developed a new automatic CME tracking system that does not rely on pixel-level annotations and can handle obvious data errors. The proposed system consists of three processes: error area segmentation and inpainting, CME segmentation, and CME tracking. All deep learning algorithms in our system are trained on a dataset without pixel-level labels, which can be easily constructed from publicly available CME catalogs. Moreover, by comparison with existing catalogs and methods, we demonstrate that the proposed system is reliable in providing CME initial kinematics, facilitating future studies on the origin and propagation of CMEs.
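
As a simple illustration of common coronagraph preprocessing, a running-difference step highlights moving CME fronts (a standard technique, not the proposed system itself):

    import numpy as np

    def running_difference(frames):
        """Difference consecutive coronagraph frames to emphasize moving fronts."""
        frames = np.asarray(frames, dtype=float)  # (T, H, W) image sequence
        return frames[1:] - frames[:-1]           # (T-1, H, W) difference images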

How to cite: Yang, Y., Wang, Z., and Shen, F.: Automatic Segmentation, Inpainting, and Tracking of CMEs By A Pixel-Annotation-Free System, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-9144, https://doi.org/10.5194/egusphere-egu26-9144, 2026.

EGU26-9809 | ECS | Posters on site | ESSI1.18

Integrating Physics-Informed Neural Networks with Convolutional Neural Networks for Solar Flare Prediction 

Aribim Bristol-Alagbariya, Jonathan Eastwood, and Ben Moseley

Accurate forecasting of extreme solar flares is essential for mitigating space weather impacts on critical infrastructure, yet current deep learning approaches face fundamental limitations in operational reliability. Models often lack physical interpretability and may fail to generalize to configurations under-represented in training data, which are critical weaknesses when forecasting rare extreme events. We take steps toward addressing these gaps by developing physics-informed architectures that embed magnetohydrodynamic (MHD) constraints directly into neural network training.

Using SDO/HMI SHARP vector magnetograms (2010–2021, 13,298 observations), we compare three approaches for 24-hour multi-class flare forecasting: (1) a ResNet34 baseline, (2) a reconstruction-physics hybrid enforcing MHD constraints through magnetic field reconstruction, and (3) a probability-physics hybrid coupling physics-derived features to classification probabilities. The probability-physics model achieves macro-averaged True Skill Statistic (TSS) of 0.389 [95% CI: 0.355–0.425] versus baseline 0.338 [0.301–0.375], a statistically significant 15% improvement (p < 0.001). Critically, physics-constrained models reduce divergence violations by two orders of magnitude, ensuring predictions satisfy fundamental conservation laws and remain physically interpretable across a broader range of magnetic configurations, including those under-represented in training data.
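
One such constraint can be sketched as a finite-difference solenoidality penalty added to the classification loss (PyTorch; a 2D illustrative version, not the exact training objective of the study):

    import torch

    def divergence_penalty(B, dx=1.0):
        """Penalize div(B) != 0 for a predicted field B of shape (batch, 3, H, W).

        Uses finite differences; the z-gradient term is omitted in this 2D
        illustrative version.
        """
        dBx_dx = torch.gradient(B[:, 0], dim=2, spacing=dx)[0]
        dBy_dy = torch.gradient(B[:, 1], dim=1, spacing=dx)[0]
        return ((dBx_dx + dBy_dy) ** 2).mean()

    # total loss: cross_entropy(logits, labels) + lam * divergence_penalty(B_pred)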

Feature space analysis reveals that intermediate C-class flares occupy transitional magnetic states with extensive overlap between non-flaring and extreme configurations, highlighting an intrinsic forecasting challenge that persists across architectures. M+ (M- and X-class) events maintain strong discrimination (AUC > 0.87) despite severe class imbalance, indicating that physically meaningful features can aid identification of extreme events even when training samples are scarce.

Our results suggest that embedding first-principles MHD constraints—divergence-free conditions, force-free equilibrium, and energy conservation—enhances both forecast skill and physical plausibility without increasing computational cost. The integration of physics-informed learning with CNN-based flare prediction offers a pathway toward improving operationally deployed systems with enhanced reliability for extreme event forecasting. For operational forecasters, improved physical interpretability may provide greater confidence in model predictions during critical decision-making, while reduced false alarm rates minimize unnecessary protective actions for satellite operators and power grid managers.


Keywords: extreme space weather, solar flare forecasting, physics-informed neural networks, operational reliability, magnetohydrodynamics, infrastructure risk mitigation

How to cite: Bristol-Alagbariya, A., Eastwood, J., and Moseley, B.: Integrating Physics-Informed Neural Networks with Convolutional Neural Networks for Solar Flare Prediction, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-9809, https://doi.org/10.5194/egusphere-egu26-9809, 2026.

EGU26-11886 | ECS | Orals | ESSI1.18

Machine Learning for Solar Coronal Structure Segmentation on SDO AIA Data and Applications 

Panagiotis Gonidakis, Stefaan Poedts, and Jasmina Magdalenic

Automated identification of coronal structures using machine-learning techniques can support forecasting of extreme solar events, enable autonomous solar-observing missions, and accelerate understanding of physical processes in the solar atmosphere. Existing approaches typically focus on large-scale regions or adopt conservative segmentation strategies that limit structural detail. We train a lightweight variant of the You-Only-Look-Once (YOLO) object-detection framework [1] and, in parallel, design a scheme based on classical computer-vision operations and morphological filtering. Both are compared against the deep-learning-based SCSS-Net [2]. All three frameworks detect active regions and coronal holes in images from the Atmospheric Imaging Assembly onboard the Solar Dynamics Observatory. To reduce bias, training and testing use masks from multiple sources, including SPoCA [3], CHIMERA [4], Region Growth [5], and custom annotations. Methods are evaluated for scientific performance and computational cost using standard metrics such as the Dice score and Intersection over Union (IoU). We further assess on-board feasibility by outlining potential use cases and current technical limitations, and by evaluating performance on raw, uncalibrated data to ensure operational compatibility and robustness. Finally, we examine coronal hole mapping across multiple AIA wavelength channels and analyse correlations with signed and unsigned magnetic flux.
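
For reference, both evaluation metrics reduce to simple overlap ratios on binary masks (NumPy sketch):

    import numpy as np

    def dice_and_iou(pred, truth):
        """Dice score and Intersection over Union for boolean segmentation masks."""
        pred, truth = pred.astype(bool), truth.astype(bool)
        inter = np.logical_and(pred, truth).sum()
        dice = 2.0 * inter / (pred.sum() + truth.sum())
        iou = inter / np.logical_or(pred, truth).sum()
        return dice, iou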



References

[1] Redmon et al. "You only look once: Unified, real-time object detection." Proceedings of the IEEE conference on computer vision and pattern recognition. 2016.

[2] Mackovjak et al. "SCSS-Net: solar corona structures segmentation by deep learning." Monthly Notices of the Royal Astronomical Society 508.3 (2021): 3111-3124.

[3] Verbeeck et al. "The SPoCA-suite: Software for extraction, characterization, and tracking of active regions and coronal holes on EUV images." Astronomy & Astrophysics 561 (2014): A29.

[4] Garton et al. "Automated coronal hole identification via multi-thermal intensity segmentation." Journal of Space Weather and Space Climate 8 (2018): A02.

[5] Tlatov, A., K. Tavastsherna, and V. Vasil’eva. "Coronal holes in solar cycles 21 to 23." Solar Physics 289.4 (2014): 1349-1358.

How to cite: Gonidakis, P., Poedts, S., and Magdalenic, J.: Machine Learning for Solar Coronal Structure Segmentation on SDO AIA Data and Applications, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-11886, https://doi.org/10.5194/egusphere-egu26-11886, 2026.

EGU26-16681 | ECS | Posters on site | ESSI1.18

Automated Identification of Foreshock Transients 

Shi Tao, Lucile Turc, Souhail Dahani, Veera Lipsanen, Milla Kalliokoski, Mirja Ojuva, Nicolas Aunai, Hui Zhang, Shan Wang, and Savvas Raptis

Foreshock transients (FTs) are short-lived mesoscale structures near Earth's bow shock, typically generated by interactions between solar wind discontinuities and either the bow shock or foreshock backstreaming ions. They are characterized by a hot, low-density core with reduced magnetic field strength and plasma velocity, bounded by compressed edges.
 
In this study, we develop a machine learning pipeline to identify FTs using Cluster 1 spacecraft data from 2003–2009. We start with a catalog of 83 FT events and 300 solar wind/foreshock intervals, each spanning 6 minutes and including magnetic field data, plasma parameters, and 31 channels of the backstreaming ion energy spectrogram as features. Seven 1D Convolutional Neural Networks (1D CNNs) are trained using a leave-one-year-out cross-validation approach. Each model is then validated on solar wind/foreshock (SWF) regions corresponding to the held-out year. The model detects about 280 new FTs between 2003 and 2009 with a precision of around 0.3. These detections, along with false positives, are then added to the training set to improve performance. When applied to 2010 SWF data, the updated model identifies 24 true positives with a precision of 0.5, compared to a precision of 0.2 when the additional training data is not included.
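
The leave-one-year-out split can be sketched with scikit-learn's grouped splitter (arrays are illustrative placeholders; one 1D CNN would be trained per held-out year):

    import numpy as np
    from sklearn.model_selection import LeaveOneGroupOut

    # X: (n_intervals, n_channels, n_samples) time series; y: 1 = FT, 0 = SWF;
    # years: observation year of each interval, used as the grouping variable.
    def leave_one_year_out(X, y, years):
        for train_idx, test_idx in LeaveOneGroupOut().split(X, y, groups=years):
            held_out_year = int(np.unique(years[test_idx])[0])
            yield held_out_year, (X[train_idx], y[train_idx]), (X[test_idx], y[test_idx])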
 
This study demonstrates the feasibility of an automated approach for FT detection. The updated model can be applied to data from other years or from different Cluster spacecraft. The resulting comprehensive FT catalog will support future studies on the properties of FTs, while the downstream false positives can serve as a calibration of the SWF catalog.

How to cite: Tao, S., Turc, L., Dahani, S., Lipsanen, V., Kalliokoski, M., Ojuva, M., Aunai, N., Zhang, H., Wang, S., and Raptis, S.: Automated Identification of Foreshock Transients, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-16681, https://doi.org/10.5194/egusphere-egu26-16681, 2026.

EGU26-16741 | ECS | Posters on site | ESSI1.18

The Deep Learning-Based Dual-Branch Multimodal Fusion Model for Solar Flare Prediction 

Zhao Limin, Chen Xingyao, Zhu Xiaoshuai, Zhao Dong, and Yan Yihua

Solar flares are intense eruptive events caused by the rapid release of magnetic energy, often impacting Earth's space environment through electromagnetic radiation and high-energy particles. Accurate flare prediction is critical for space weather forecasting. However, many existing deep learning approaches often rely on single-modal inputs or shallow feature fusion, limiting their ability to capture complementary information. In this study, we propose a dual-branch multimodal fusion deep learning model for 24-hour solar flare prediction. The model integrates magnetograms and magnetic parameters through cross-attention mechanisms, followed by cross-scale interactions at the feature level to enhance multi-scale representation. It is designed to perform both binary prediction of ≥ C-class flares and multi-class classification of C, M, and X-class flares. To ensure rigorous evaluation, we employ a stratified group five-fold cross-validation scheme to preserve class representativeness and adopt a splitting-before-sampling strategy based on active region number to prevent data leakage. Experimental results show that the model achieves a TSS of 0.661 and an HSS of 0.630 for binary ≥ C-class prediction, while notably attaining a TSS of 0.780 and an HSS of 0.785 for X-class flares in the multi-class task. Compared with existing approaches, the model demonstrates superior performance in predicting intense X-class flares, effectively suppresses the false alarm rate, and exhibits strong generalization capability.
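
A toy sketch of the cross-attention fusion step (PyTorch; dimensions, tokenization, and pooling are illustrative assumptions, not the proposed architecture):

    import torch
    import torch.nn as nn

    class CrossAttentionFusion(nn.Module):
        """Toy dual-branch fusion: parameter tokens attend to magnetogram tokens."""
        def __init__(self, d=64, nhead=4):
            super().__init__()
            self.attn = nn.MultiheadAttention(embed_dim=d, num_heads=nhead,
                                              batch_first=True)
            self.head = nn.Linear(d, 4)            # none / C / M / X classes

        def forward(self, img_tokens, param_tokens):
            # query: magnetic-parameter branch; key/value: magnetogram patches
            fused, _ = self.attn(param_tokens, img_tokens, img_tokens)
            return self.head(fused.mean(dim=1))    # pooled class logits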

How to cite: Limin, Z., Xingyao, C., Xiaoshuai, Z., Dong, Z., and Yihua, Y.: The Deep Learning-Based Dual-Branch Multimodal Fusion Model for Solar Flare Prediction, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-16741, https://doi.org/10.5194/egusphere-egu26-16741, 2026.

EGU26-17564 | ECS | Orals | ESSI1.18

Modelling of space plasma from Vlasov to fluid: machine learning applied to the closure problem 

Pietro Dazzi, Felipe Nathan de Oliveira Lopes, Hyun-Jin Jeong, Eric Calvet, and Rony Keppens

In our solar system, the main source of plasma is the Sun, which produces the so-called solar wind by continuously pushing its outermost layer, the corona, into space. The turbulent solar wind impinges on our planet and interacts with its magnetic field, creating a region of space called Earth’s magnetosphere. From its birth to its impact on our planet, the solar wind still harbors numerous unanswered questions. Answering these questions requires the numerical modelling of the plasma itself.

The most physically accurate numerical methods are based on kinetic modeling, which tracks the particles' velocity distribution function. However, these methods are numerically demanding, since they involve modeling the complex six-dimensional particle distribution function as it evolves in time. To simplify the problem, this distribution is integrated over the velocity coordinates, leading to the (more efficient) three-dimensional fluid plasma framework. Still, the passage to the fluid equations comes with an important caveat: the fluid system of equations needs to be closed by choosing a proper “closure”. The objective of this work is to tackle the closure problem by employing a combination of kinetic simulations and machine learning techniques.

We perform multiple decaying turbulence plasma simulations using a Hybrid-PIC [1] (i.e. kinetic ions, fluid electrons) model. By varying different physical parameters, notably the ion beta, we explore the variability of the solar wind. These kinetic simulations serve as the ground truth to train a machine learning model. The machine's task is to "learn" the best approximation for the closure equation. We focus in particular on the reconstruction of the pressure tensor. We explore various machine learning techniques [2, 3] (CNN, GAN, FNO) that have shown promise in atmospheric science but are new to this specific problem. We show how this reconstructed closure performs better than other analytical approximations [4] (polytropic, CGL, CGL+FLR effects). The final goal is to learn a closure equation that can effectively incorporate complex kinetic physics into a simplified, yet more accurate, fluid simulation. This will significantly increase the fidelity of solar wind models without making them prohibitively expensive to compute.
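
For reference, the CGL (double-adiabatic) closure mentioned above [4] can be written as two conservation laws for the perpendicular and parallel pressures (standard form; FLR corrections add higher-order terms):

    \frac{\mathrm{d}}{\mathrm{d}t}\left(\frac{p_\perp}{n B}\right) = 0,
    \qquad
    \frac{\mathrm{d}}{\mathrm{d}t}\left(\frac{p_\parallel B^2}{n^3}\right) = 0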

[1] Behar, Etienne, Shahab Fatemi, Pierre Henri, and Mats Holmström. “Menura: A Code for Simulating the Interaction between a Turbulent Solar Wind and Solar System Bodies”. Annales Geophysicae 40, no. 3 (2022): 281–97. https://doi.org/10.5194/angeo-40-281-2022.

[2] Kovachki, Nikola, Zongyi Li, Burigede Liu, et al. “Neural Operator: Learning Maps Between Function Spaces”. Preprint, 2 May 2024. https://doi.org/10.5555/3648699.3648788.

[3] Jeong, Hyun-Jin, Mingyu Jeon, Daeil Kim, et al. “Prediction of the Next Solar Rotation Synoptic Maps Using an Artificial Intelligence–Based Surface Flux Transport Model”. The Astrophysical Journal Supplement Series 278, no. 1 (2025): 5. https://doi.org/10.3847/1538-4365/adc447.

[4] Hunana, P., A. Tenerani, G. P. Zank, et al. “An Introductory Guide to Fluid Models with Anisotropic Temperatures. Part 1. CGL Description and Collisionless Fluid Hierarchy”. Journal of Plasma Physics 85, no. 6 (2019): 205850602. https://doi.org/10.1017/S0022377819000801.

How to cite: Dazzi, P., de Oliveira Lopes, F. N., Jeong, H.-J., Calvet, E., and Keppens, R.: Modelling of space plasma from Vlasov to fluid: machine learning applied to the closure problem, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-17564, https://doi.org/10.5194/egusphere-egu26-17564, 2026.

EGU26-18188 | ECS | Orals | ESSI1.18

Probabilistic Solar Flare Forecasting via Weakly Supervised Contrastive Refinement of VAE Latent Spaces 

Ekatarina Dineva, Jasmina Magdalenic, George Miloshevich, Panagiotis Gonidakis, Francesco Carella, and Stefaan Poedts

Reliable solar flare forecasting is limited by two forms of class imbalance in active region time series: (i) the overwhelming dominance of the non-flaring, quiet state over the eruptive state, and (ii) the insufficient separability between common, physically similar event classes (e.g. C-class versus M-class flares). Although empirical parameters derived from the photospheric vector magnetic field (VMF), such as those provided by SDO/HMI SHARP products, capture aspects of active region complexity and free energy buildup, they often evolve smoothly and overlap across flare classes. Consequently, while many models can distinguish between flares and no-flares reasonably well, they struggle to distinguish flare magnitude and association with eruptive phenomena (e.g. CMEs) using photospheric information alone. This suggests that improved flare-class separation requires (a) the explicit definition of what constitutes 'similarity' between pre-flare states, and (b) parametrization that emphasizes flare-relevant structure over common active region features.

We investigate a representation learning strategy that combines the parametrization of SDO/HMI SHARP VMF cutouts using a Variational Autoencoder (VAE) with a contrastive stage to reshape the resulting embedding geometry. First, a VAE is trained to encode SHARP cutouts into compact latent vectors that capture active region morphology. These vectors are then refined using a Siamese-like objective constructed from weak supervision, which uses event labels and empirical SHARP parameters as proxies for elevated flare likelihood. The contrastive stage then uses this weak supervision to encourage a latent geometry that better reflects flare-relevant evolution. This study emphasizes latent-space structure, i.e. neighborhood consistency and class-conditional clustering, and evaluates whether these properties facilitate improved probabilistic prediction across multiple forecast horizons, by training lightweight downstream models on (i) empirical parameters, (ii) VAE latents and (iii) their combined representations.
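
A minimal sketch of the Siamese-style refinement objective on latent pairs (PyTorch; the weak-supervision pairing logic is collapsed into a single agreement flag here):

    import torch
    import torch.nn.functional as F

    def contrastive_refinement_loss(z1, z2, same_class, margin=1.0):
        """Pair loss on VAE latents: pull weakly 'similar' pairs together,
        push dissimilar pairs apart beyond the margin."""
        d = F.pairwise_distance(z1, z2)
        pull = same_class * d.pow(2)
        push = (1 - same_class) * F.relu(margin - d).pow(2)
        return (pull + push).mean()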

How to cite: Dineva, E., Magdalenic, J., Miloshevich, G., Gonidakis, P., Carella, F., and Poedts, S.: Probabilistic Solar Flare Forecasting via Weakly Supervised Contrastive Refinement of VAE Latent Spaces, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-18188, https://doi.org/10.5194/egusphere-egu26-18188, 2026.

EGU26-18327 | ECS | Posters on site | ESSI1.18

A Data-Driven Phase-Space View of Sub-Alfvénic Magnetic-Cloud Coupling 

Sayanee Haldar

Sub-Alfvénic solar wind intervals predominantly occur within the cores of magnetic clouds (MCs) during interplanetary coronal mass ejection (ICME) events, facilitating an intense mode of solar wind–magnetosphere interaction wherein energy and information can propagate via magnetic field lines. These phenomena are associated with intense magnetic fields, low plasma beta, heightened Alfvénic activity, and exceptionally effective energy transfer to the magnetospheric domain. This study employs a physics-informed machine learning framework to identify and characterize the sub-Alfvénic magnetic cloud regime using data spanning multiple solar cycles. A feature space motivated by physical principles is established based on the plasma characteristics of the upstream solar wind observed from the L1 point, along with metrics of wave activity obtained from time-frequency analysis. Employing unsupervised machine learning, the high-dimensional solar wind feature space is mapped onto a low-dimensional latent space that elucidates the intrinsic organization of solar wind plasma regimes. By projecting recognized MC occurrences and disparate individual case studies of sub-Alfvénic flow onto the established phase-space map, we find that severe coupling conditions reflect a cohesive global regime of solar wind behavior rather than isolated anomalies. This framework also illustrates transition paths among the background solar wind, sheaths, and magnetic cloud cores, tracing the evolution of coupling conditions during ICME passages.

 

How to cite: Haldar, S.: A Data-Driven Phase-Space View of Sub-Alfvénic Magnetic-Cloud Coupling, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-18327, https://doi.org/10.5194/egusphere-egu26-18327, 2026.

EGU26-18421 | ECS | Posters on site | ESSI1.18

Automatic Spatio-Temporal Differential Emission Reconstruction Method 

Junyan Liu, Stefaan Poedts, Chenglong Shen, and Jiajia Liu

Current analyses of solar differential emission measure predominantly rely on two-dimensional (2D) imaging and interpretation, which inherently limit our ability to fully capture the true three-dimensional (3D) characteristics of coronal structures and dynamic processes. This 2D perspective consequently hinders a comprehensive understanding of the complex physical processes governing the solar atmosphere.

To address these limitations, we present a novel methodology for the spatio-temporal reconstruction of the low solar corona based on several machine learning techniques. This approach enables us to reconstruct several physical parameters, including EUV radiation, temperature, and electron density, across varying altitudes and observation times. Based on these 3D reconstruction results, our method can further generate synthetic observational images from various viewpoints and times, providing a comprehensive visualisation of the corona's dynamic 3D structure. Furthermore, it can estimate missing wavelength observations for missions such as Solar Orbiter, significantly supporting multi-spacecraft collaborative observations and data fusion efforts. The reconstructed results can also serve as an enhanced initial state for coronal and interplanetary simulations.

How to cite: Liu, J., Poedts, S., Shen, C., and Liu, J.: Automatic Spatio-Temporal Differential Emission Reconstruction Method, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-18421, https://doi.org/10.5194/egusphere-egu26-18421, 2026.

EGU26-18949 | Posters on site | ESSI1.18

Advancing Inter-Satellite Radio Occultation with MaCro on the M-MATISSE Mission 

Tom Andert, Martin Pätzold, Tobias Vorderobermeier, Matthias Hahn, Silvia Tellmann, Janusz Oschlisniok, Kerstin Peter, and Benjamin Haser

Radio occultation (RO) techniques provide valuable remote-sensing insights into planetary ionospheres and atmospheres by measuring the bending of radio signals as they traverse atmospheric layers. Mutual radio occultations between the Trace Gas Orbiter (TGO) and Mars Express (MEX) demonstrated the feasibility of this approach but were limited by hardware not designed for radio science occultation measurements—most notably, the absence of ultra-stable oscillators, single-frequency operation, and restricted timing precision.

The Mars Magnetosphere ATmosphere Ionosphere and Space-weather SciencE (M-MATISSE) mission—currently in its Phase A study by the European Space Agency (ESA)—is a Medium-class (M7) candidate that will overcome these constraints through the dedicated MaCro (Mars Crosslink Radio Occultation) instrument: a dual-frequency, precision-timed, ultra-stable radio system purpose-built for inter-satellite occultations. MaCro’s design enables high-accuracy profiling of the Martian ionosphere and atmosphere across diverse geometries and solar conditions.

This study systematically investigates how the known limitations of the TGO–MEX experiments influenced the retrieved electron density profiles, and explores how modern machine-learning techniques—for example, regression-based drift correction—can enhance the data-processing pipeline. The outcomes of this work will support the development of MaCro's data-processing chain and help improve its performance.
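
As one hedged illustration of what a regression-based drift correction could look like, the sketch below removes a slow polynomial trend from synthetic frequency residuals; the trend model and variable names are assumptions, not the MaCro pipeline.

```python
import numpy as np

def remove_oscillator_drift(t, freq_residual, degree=2):
    """Fit and subtract a slow polynomial drift from frequency residuals (Hz)."""
    coeffs = np.polyfit(t, freq_residual, deg=degree)
    drift = np.polyval(coeffs, t)
    return freq_residual - drift, drift

t = np.linspace(0.0, 600.0, 1200)                          # seconds into the occultation
resid = 0.02 * t / 600 + 0.001 * np.random.randn(t.size)   # synthetic drifting residual
corrected, drift = remove_oscillator_drift(t, resid)
```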

How to cite: Andert, T., Pätzold, M., Vorderobermeier, T., Hahn, M., Tellmann, S., Oschlisniok, J., Peter, K., and Haser, B.: Advancing Inter-Satellite Radio Occultation with MaCro on the M-MATISSE Mission, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-18949, https://doi.org/10.5194/egusphere-egu26-18949, 2026.

EGU26-22082 | Posters on site | ESSI1.18

Physics-informed time-dependent deep neural network for solar wind prediction 

Veronique Delouille, Kaijie Li, Farzad Kamalabadi, and Joseph Davila

In this work, we aim to advance the prediction of solar wind speed several days in advance. The approach is based on analyzing solar coronal images in conjunction with solar wind speed. We create labelled data pairs from over a decade of EUV images from SDO/AIA and solar wind data at 1 AU recorded by ACE, WIND, and DSCOVR. We use the archived SDO machine-learning-ready dataset (SDO-ML) and the solar wind speed at 1 AU from the NASA OMNIWeb dataset. We construct a deep neural network model and capture the temporal component of solar wind propagation with a time-dependent neural network, e.g., a recurrent neural network. Physical constraints are incorporated to train the model and optimize the prediction. The generalization capability of our model is investigated via cross-validation, whereby careful separation into training, validation, and test datasets is performed as a function of solar activity. We report on the impact of the deep neural network architecture, as a universal function approximator, on its ability to capture the temporal relationship between solar EUV characteristics and solar wind speed at 1 AU.
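
A minimal sketch of the model family described: a per-frame CNN encoder feeding a recurrent (GRU) head that maps an EUV image sequence to a scalar wind speed. Layer sizes are illustrative, and the physics-informed constraints are omitted.

```python
import torch
import torch.nn as nn

class EUVToWind(nn.Module):
    def __init__(self, hidden=64):
        super().__init__()
        self.cnn = nn.Sequential(
            nn.Conv2d(1, 16, 5, stride=2), nn.ReLU(),
            nn.Conv2d(16, 32, 5, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),   # -> (B, 32) per frame
        )
        self.rnn = nn.GRU(32, hidden, batch_first=True)
        self.out = nn.Linear(hidden, 1)              # wind speed in km/s

    def forward(self, frames):                       # frames: (B, T, 1, H, W)
        b, t = frames.shape[:2]
        feats = self.cnn(frames.flatten(0, 1)).view(b, t, -1)
        _, h = self.rnn(feats)                       # final hidden state summarizes the sequence
        return self.out(h[-1]).squeeze(-1)

model = EUVToWind()
speed = model(torch.randn(4, 8, 1, 128, 128))        # 8-frame sequence -> (4,) predictions
```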

How to cite: Delouille, V., Li, K., Kamalabadi, F., and Davila, J.: Physics-informed time-dependent deep neural network for solar wind prediction, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-22082, https://doi.org/10.5194/egusphere-egu26-22082, 2026.

EGU26-719 | ECS | Posters on site | AS5.1

Does AI Learn Physics? Assessing the Physical Fidelity of Data-Driven Tropical Cyclone Forecasts 

Pankaj Sahu, Sukumaran Sandeep, and Hariprasad Kodamana

Machine Learning Weather Prediction (MLWP) models—specifically GraphCast, PanguWeather, Aurora, and FourCastNet—show great promise for competing with physics-based Numerical Weather Prediction (NWP) models by providing global forecasts at low computational cost. However, a thorough physical evaluation is needed before they can be used in place of NWP models. Our comprehensive study comparing these four leading MLWP models with NWP and observations in Tropical Cyclone (TC) forecasting across all tropical basins uncovers a significant duality: MLWP models predict TC tracks well (average error below 200 km at a 96-hour lead time) because they accurately capture the underlying dynamics, yet they consistently underestimate the maximum sustained wind speeds (intensity). This systematic low-intensity bias is inherited from their ERA5 training data and exacerbated by training penalties. Even with this limitation, the models accurately depict important physical structures, such as low-level convergence and the vertical warm core, while keeping different physical fields mutually consistent. This suggests that the models learn physically sensible relationships between dynamical and thermodynamical processes. Ultimately, although MLWP models, especially Aurora, exhibit an implicit comprehension of TC dynamics, their persistent intensity bias requires additional refinement before they can fully substitute for NWP models.

How to cite: Sahu, P., Sandeep, S., and Kodamana, H.: Does AI Learn Physics? Assessing the Physical Fidelity of Data-Driven Tropical Cyclone Forecasts, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-719, https://doi.org/10.5194/egusphere-egu26-719, 2026.

EGU26-2452 | ECS | Posters on site | AS5.1

Climate Grey-Box Flow Matching for Robust Climate and Weather Prediction 

Gurjeet Singh, Frantzeska Lavda, and Alexandros Kalousis

Deep generative models such as flow matching and diffusion models have great potential for learning complex dynamical systems, but they typically act as black boxes, neglecting underlying physical structure. In contrast, physics-based models governed by ODEs and PDEs provide interpretability and physical consistency, yet are often incomplete due to unresolved processes, missing source terms, or uncertain parameterisations. Bridging these two paradigms is a central challenge in data-driven weather and climate modelling.

We propose a Climate Grey-Box Dynamics Matching framework designed for weather and climate systems that explicitly combines existing physical models with data-driven learning: known physical operators are embedded directly into the learned dynamics, which in turn capture the unresolved processes. Our framework learns from observational trajectories alone and operates in a simulation-free manner inspired by gradient matching and flow matching methods. By avoiding numerical solvers, it eliminates the memory overhead, computational cost, and numerical instability associated with Neural ODE–based approaches.
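
A minimal sketch of the simulation-free grey-box idea under stated assumptions: the learned vector field is the sum of a known (here toy) physical operator and a neural residual, trained by matching finite-difference time derivatives of observed trajectories, so no solver appears in the training loop.

```python
import torch
import torch.nn as nn

f_phys = lambda x: -0.1 * x                 # stand-in for the known ODE/PDE operator
residual = nn.Sequential(nn.Linear(32, 64), nn.Tanh(), nn.Linear(64, 32))

def greybox_matching_loss(x_t, x_next, dt):
    """x_t, x_next: (B, 32) consecutive observed states sampled dt apart."""
    target = (x_next - x_t) / dt            # empirical time derivative (gradient matching)
    pred = f_phys(x_t) + residual(x_t)      # grey-box vector field, no solver in the loop
    return ((pred - target) ** 2).mean()

x_t, x_next = torch.randn(16, 32), torch.randn(16, 32)
loss = greybox_matching_loss(x_t, x_next, dt=0.1)
loss.backward()                             # O(1) cost per step, independent of solver length
```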

To capture temporal dependencies in our simulation-free method, we introduce a lightweight attention-based temporal encoder that aggregates short-term history in a physically consistent manner. This design enables the model to represent unresolved dynamics without increasing computational complexity, making it well-suited for high-dimensional spatiotemporal climate systems. We apply this framework to weather and climate forecasting and demonstrate its effectiveness against ClimODE, a state-of-the-art solver-based grey-box model. Reformulating ClimODE as a simulation-free grey-box model reduces training complexity from O(L) to O(1), where L denotes the number of solver steps. Beyond computational gains, the simulation-free formulation yields substantial memory efficiency: training is possible on a single RTX 3060 (12 GB), whereas ClimODE requires at least 25 GB of GPU memory with a small batch size. This enables efficient training on commodity hardware and improves accessibility for large-scale climate modelling.

Experiments on weather and climate benchmarks show that the proposed method achieves improved forecast accuracy and faster convergence compared to simulation-based and fully data-driven baselines. The method demonstrates particular robustness to long horizons, as performance gains become more pronounced with extended forecast times—indicating enhanced temporal stability and resistance to error accumulation, an essential property for reliable long-range climate prediction.

How to cite: Singh, G., Lavda, F., and Kalousis, A.: Climate Grey-Box Flow Matching for Robust Climate and Weather Prediction, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-2452, https://doi.org/10.5194/egusphere-egu26-2452, 2026.

Artificial intelligence (AI)-based data-driven weather prediction (AIWP) models have made rapid progress in recent years. They achieve impressive results and demonstrate substantial improvements over state-of-the-art physics-based numerical weather prediction (NWP) models across a range of variables and evaluation metrics. However, most efforts in data-driven weather forecasting have been limited to deterministic, point-valued predictions, which preclude the quantification of forecast uncertainties, crucial both in research and for optimal decision making in applications.

I will present recent work on uncertainty quantification (UQ) methods in the context of data-driven weather prediction. The post-hoc use of UQ methods enables the generation of skillful probabilistic weather forecasts from a state-of-the-art deterministic AIWP model [1]. Further, by subjecting the deterministic backbones of physics-based and data-driven models post hoc to the same UQ technique, and computing the in-sample mean continuous ranked probability score (CRPS) of the resulting forecasts, we propose the potential continuous ranked probability score, a new measure that enables fair and meaningful comparisons of single-valued output from AIWP and NWP models [2].
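
For concreteness, the ensemble form of the CRPS that underlies both measures can be computed as CRPS(F, y) = E|X - y| - 0.5 E|X - X'|; the toy 10-member forecast below is synthetic.

```python
import numpy as np

def crps_ensemble(members, obs):
    """Standard ensemble CRPS estimator for one forecast-observation pair."""
    members = np.asarray(members, dtype=float)
    term1 = np.mean(np.abs(members - obs))                              # E|X - y|
    term2 = 0.5 * np.mean(np.abs(members[:, None] - members[None, :]))  # 0.5 E|X - X'|
    return term1 - term2

print(crps_ensemble([281.2, 280.9, 281.8, 282.0, 281.1,
                     280.5, 281.6, 281.3, 280.8, 281.9], obs=281.4))
```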

References

[1] Bülte, C., Horat, N., Quinting, J. and Lerch, S. (2025). Uncertainty quantification for data-driven weather models. Artificial Intelligence for the Earth System, in press. DOI:10.1175/AIES-D-24-0049.1

[2] Gneiting, T., Biegert, T., Kraus, K., Walz, E.-M., Jordan, A. I., and Lerch, S. (2025). Probabilistic measures afford fair comparisons of AIWP and NWP model output. Preprint, arXiv:2506.03744. DOI:10.48550/arXiv.2506.03744

How to cite: Lerch, S.: Uncertainty quantification for data-driven weather prediction: From probabilistic forecasts to fair model comparisons, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-2971, https://doi.org/10.5194/egusphere-egu26-2971, 2026.

EGU26-3091 | ECS | Posters on site | AS5.1

Learning to sample unprecedented atmospheric rivers 

Tim Whittaker and Alejandro Di Luca

Atmospheric rivers (ARs) are the dominant drivers of hydrological extremes along the western coast of North America, yet the physical upper limits of their intensity remain poorly understood and weakly constrained by the short observational record. While thermodynamic amplification of ARs under climate change is well documented, the potential for dynamical amplification driven by the wind field remains uncertain and computationally expensive to sample using conventional techniques such as large ensembles of simulations. Here, we address this sampling barrier by leveraging techniques from machine learning, specifically combining a differentiable global climate model with high-resolution regional downscaling to generate storylines of unprecedented AR events in western Canada. By formulating event generation as an optimal control problem, we compute gradients of the model’s output to learn minimal, physically plausible perturbations to historical initial states that maximize the ARs’ integrated vapour transport at landfall. These optimized storylines are then dynamically downscaled using a high-resolution regional climate model, producing extreme precipitation events that significantly exceed historical benchmarks.
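
A hedged sketch of the optimal-control formulation: gradient-ascend a small perturbation to a historical initial state so that a differentiable rollout maximizes an IVT-like objective under a norm penalty. The model, objective, and penalty weight are stand-ins, not the study's configuration.

```python
import torch

def rollout(x0, steps=4):                  # stand-in for a differentiable GCM
    x = x0
    for _ in range(steps):
        x = x + 0.1 * torch.tanh(x)
    return x

def ivt_at_landfall(state):                # stand-in objective: mean over a coastal box
    return state[..., :8].mean()

x_hist = torch.randn(64)                   # historical initial condition
delta = torch.zeros_like(x_hist, requires_grad=True)
opt = torch.optim.Adam([delta], lr=0.05)
for _ in range(200):
    opt.zero_grad()
    objective = ivt_at_landfall(rollout(x_hist + delta)) - 0.1 * delta.norm() ** 2
    (-objective).backward()                # maximize objective via gradient ascent
    opt.step()
```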

How to cite: Whittaker, T. and Di Luca, A.: Learning to sample unprecedented atmospheric rivers, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-3091, https://doi.org/10.5194/egusphere-egu26-3091, 2026.

EGU26-3927 | Posters on site | AS5.1

Few-shot learning for mid-latitude climate forecasts 

Yoo-Geun Ham, Seol-Hee Oh, and Gyuhui Kwon

Reliable prediction of climate variables and high-impact extremes in the midlatitudes is crucial for climate risk assessment, agricultural planning, water resource management, and disaster preparedness. However, conventional deep learning–based approaches for midlatitude climate prediction trained on dynamical climate models (e.g., CMIP models) can introduce systematic errors in capturing the observed climate-relevant signals, ultimately limiting prediction skill. These limitations highlight the need to improve midlatitude prediction by detecting climate signals solely from the limited amount of reliable observational climate data. To address the challenge of limited training samples, we employ the model-agnostic meta-learning (MAML) algorithm along with domain-knowledge-based data augmentation to predict midlatitude winter temperatures. The proposed data augmentation is based purely on observed data, defining labels using the large-scale climate variabilities associated with the target variable. The MAML-applied convolutional neural network (CNN) demonstrates superior correlation skill for winter temperature anomalies compared to a reference model (i.e., the CNN without MAML) and state-of-the-art dynamical forecast models across all target lead months during the boreal winter seasons. Moreover, occlusion sensitivity results reveal that the MAML model better captures the physical precursors that influence midlatitude winter temperatures, resulting in more accurate predictions.
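
A simplified, first-order MAML sketch (a common approximation of the full algorithm referenced): adapt a copy of the shared model on each task's support set, then accumulate query-set gradients into the shared initialization. Task construction and sizes are illustrative.

```python
import copy
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(10, 32), nn.ReLU(), nn.Linear(32, 1))
meta_opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

def maml_outer_step(tasks, inner_lr=0.01, inner_steps=3):
    meta_opt.zero_grad()
    for xs, ys, xq, yq in tasks:                      # support / query split per task
        fast = copy.deepcopy(model)
        inner_opt = torch.optim.SGD(fast.parameters(), lr=inner_lr)
        for _ in range(inner_steps):                  # inner-loop adaptation
            inner_opt.zero_grad()
            loss_fn(fast(xs), ys).backward()
            inner_opt.step()
        loss_fn(fast(xq), yq).backward()              # first-order: grads on adapted weights
        for p, fp in zip(model.parameters(), fast.parameters()):
            p.grad = fp.grad.clone() if p.grad is None else p.grad + fp.grad
    meta_opt.step()

tasks = [(torch.randn(20, 10), torch.randn(20, 1),
          torch.randn(10, 10), torch.randn(10, 1)) for _ in range(4)]
maml_outer_step(tasks)
```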

How to cite: Ham, Y.-G., Oh, S.-H., and Kwon, G.: Few-shot learning for mid-latitude climate forecasts, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-3927, https://doi.org/10.5194/egusphere-egu26-3927, 2026.

EGU26-4301 * | Orals | AS5.1 | Highlight

Numerical models outperform AI weather forecasts of record-breaking extremes 

Zhongwei Zhang, Erich Fischer, Jakob Zscheischler, and Sebastian Engelke

Artificial intelligence (AI)-based models are revolutionizing weather forecasting and have surpassed leading numerical weather prediction systems on various benchmark tasks. However, their ability to extrapolate and reliably forecast unprecedented extreme events remains unclear. Here, we show that for record-breaking weather extremes, the numerical model High RESolution forecast (HRES) from the European Centre for Medium-Range Weather Forecasts still consistently outperforms state-of-the-art AI models GraphCast, GraphCast operational, Pangu-Weather, Pangu-Weather operational, and Fuxi. We demonstrate that forecast errors in AI models are consistently larger for record-breaking heat, cold, and wind than in HRES across nearly all lead times. We further find that the examined AI models tend to underestimate both the frequency and intensity of record-breaking events, and they underpredict hot records and overestimate cold records with growing errors for larger record exceedance. Our findings underscore the current limitations of AI weather models in extrapolating beyond their training domain and in forecasting the potentially most impactful record-breaking weather events that are particularly frequent in a rapidly warming climate. Further rigorous verification and model development is needed before these models can be solely relied upon for high-stakes applications such as early warning systems and disaster management.

How to cite: Zhang, Z., Fischer, E., Zscheischler, J., and Engelke, S.: Numerical models outperform AI weather forecasts of record-breaking extremes, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-4301, https://doi.org/10.5194/egusphere-egu26-4301, 2026.

EGU26-5719 | ECS | Posters on site | AS5.1

An AI-based framework for high-resolution climate dataset over Italy: from historical reconstruction to an operational chain 

Ilenia Manco, Otavio Medeiros Feitosa, Mario Raffa, and Paola Mercogliano

High-resolution climate datasets are fundamental for monitoring extreme events, assessing climate variability, and supporting climate adaptation strategies. However, producing high-resolution climate reanalyses usually requires computationally expensive dynamical downscaling. As a result, near-real-time high-resolution climate services remain limited, since most downscaling products are generated retrospectively with delays of months to years (Hersbach et al., 2020; Harris et al., 2022). Recent advances in generative machine learning enable realistic fine-scale atmospheric fields that preserve spatial coherence and key statistics, including extremes (Rampal et al., 2025; Camps-Valls et al., 2025). Hybrid statistical–dynamical approaches therefore provide an efficient and physically consistent pathway for operational high-resolution dataset production (Glawion et al., 2025; Schmidt et al., 2025).

This work presents the progress achieved in the development of a high-resolution climate dataset over the Italian Peninsula at 2.2 km resolution, exploiting the conditional Generative Adversarial Network (cGAN) model developed in Manco et al. (2025). The framework follows a hybrid statistical–dynamical downscaling strategy, in which ERA5 reanalysis data at 0.25° resolution are downscaled using cGANs trained against the very-high-resolution dynamical product VHR-REA_IT (Raffa et al., 2021). The system has been extended to multiple near-surface atmospheric variables, including mean, minimum, and maximum 2 m temperature, relative surface humidity, cumulative precipitation, and 10 m wind (speed and direction), the latter two representing particularly challenging targets (Fig. 1). Each variable is downscaled using a dedicated cGAN trained independently to learn the non-linear spatial relationships between coarse-resolution ERA5 predictors and high-resolution VHR-REA_IT targets, while employing a common network architecture and loss function to ensure methodological consistency. This enabled the production of a high-resolution historical dataset covering the period 1990–2024 at daily frequency, with 1990–2000 used for training. Since January 2025, the framework (Fig. 2) has been integrated into an operational chain and used to generate high-resolution fields in near real time, automatically updating the dataset as new ERA5 data become available, with an average latency of approximately six days. All data are distributed in NetCDF format through the CMCC Data Delivery System (https://dds.cmcc.it/) within the FAIR (Fast AI Reanalysis) product, with daily maps accessible via the Dataclime dashboard (https://www.dataclime.com/).

Both deterministic and probabilistic configurations of the cGAN framework are presented. Results, evaluated against the dynamically downscaled fields available at the same resolution over the common historical period, show that the proposed approach robustly reproduces spatial patterns (Fig. 3), mean values, and variability across all variables. The probabilistic configuration improves uncertainty representation and shows skill in capturing both mean conditions and extremes. Overall, the framework represents a versatile and robust solution for the generation of high-resolution climate datasets in both historical and operational contexts. Remaining limitations primarily concern the representation of extreme precipitation percentiles in regions characterized by complex orography, which will be the focus of future developments.
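
A minimal conditional-GAN training step in the spirit of the framework described: the generator maps a coarse predictor (plus noise, for the probabilistic configuration) to a fine-scale field, and the discriminator judges (coarse, fine) pairs. Architectures and shapes are placeholder assumptions, not the paper's networks.

```python
import torch
import torch.nn as nn

G = nn.Sequential(nn.Conv2d(2, 32, 3, padding=1), nn.ReLU(),
                  nn.Conv2d(32, 1, 3, padding=1))                 # coarse + noise -> fine
D = nn.Sequential(nn.Conv2d(2, 32, 3, padding=1), nn.ReLU(),
                  nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(32, 1))
bce = nn.BCEWithLogitsLoss()
g_opt, d_opt = (torch.optim.Adam(m.parameters(), lr=2e-4) for m in (G, D))

coarse = torch.randn(8, 1, 64, 64)        # toy ERA5 predictor, regridded to the fine grid
fine = torch.randn(8, 1, 64, 64)          # toy VHR-REA_IT target
noise = torch.randn_like(coarse)          # stochastic input for probabilistic sampling

fake = G(torch.cat([coarse, noise], 1))
d_loss = bce(D(torch.cat([coarse, fine], 1)), torch.ones(8, 1)) + \
         bce(D(torch.cat([coarse, fake.detach()], 1)), torch.zeros(8, 1))
d_opt.zero_grad(); d_loss.backward(); d_opt.step()

g_loss = bce(D(torch.cat([coarse, fake], 1)), torch.ones(8, 1))
g_opt.zero_grad(); g_loss.backward(); g_opt.step()
```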

Fig. 1 – Wind speed at 10 m for a random day.

Fig. 2 - c-GAN Training Framework

Fig. 3 – Seasonal Analysis. 2-m minimum temperature.

 

How to cite: Manco, I., Feitosa, O. M., Raffa, M., and Mercogliano, P.: An AI-based framework for high-resolution climate dataset over Italy: from historical reconstruction to an operational chain, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-5719, https://doi.org/10.5194/egusphere-egu26-5719, 2026.

EGU26-6394 | ECS | Orals | AS5.1

A review of spatially explicit climate emulators for enhancing modelling agility 

Sarah Schöngart, Lukas Gudmunsson, Chris Womack, Carl-Friedrich Schleussner, and Sonia Seneviratne

Machine-learning-based weather and climate emulators are rapidly transforming how climate information is generated and applied by enabling fast scenario exploration, large ensemble analysis, and the generation of decision-relevant climate data at scales beyond the reach of traditional climate models. Emulators are increasingly integrated into policy-relevant assessments and are expected to play a growing role in upcoming IPCC reports. Yet the field remains fragmented, as task definitions and evaluation standards differ across communities, and frameworks for connecting short-term weather emulation to long-term climate projections are missing.

Here, we synthesise 77 studies on spatially explicit climate, hybrid weather-climate, and weather emulators within a unified conceptual framework, mapping inputs and outputs, methodological choices, validation practices, and computational requirements. Three structural patterns emerge. First, most climate emulators prioritise computational speed and scenario agility but offer limited output flexibility, typically generating gridded fields for a narrow set of variables. Second, the emulator landscape is fragmented: weather and hybrid weather-climate emulators form a coherent, machine-learning-driven cluster, whereas climate emulators are more heterogeneous, less connected to machine-learning advances, and validated inconsistently. Third, state-of-the-art weather emulators often rely on specialised hardware and institutional resources concentrated in a few organisations, raising questions of computational equity and “agility for whom”.

Our findings suggest that realizing genuine agility will require future research to focus on user-tailored outputs, rigorous evaluation across forcing scenarios, cross-domain methodological integration, and equitable access to computational resources. These priorities will help the field transition from methodological innovation toward policy-relevant application.

How to cite: Schöngart, S., Gudmunsson, L., Womack, C., Schleussner, C.-F., and Seneviratne, S.: A review of spatially explicit climate emulators for enhancing modelling agility, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-6394, https://doi.org/10.5194/egusphere-egu26-6394, 2026.

EGU26-7801 | Posters on site | AS5.1

Bridging Physics and Machine Learning to Enhance Weather Forecasting at ECCC 

Emilia Diaconescu, Jean-François Caron, Valentin Dallerit, Stéphane Gaudreault, Syed Husain, Shoyon Panday, Carlos Pereira Frontado, Leo Separovic, Christopher Subich, Siqi Wei, and Sasa Zhang

Environment and Climate Change Canada (ECCC) is actively advancing the integration of artificial intelligence (AI) into numerical weather prediction (NWP) through a coordinated research-to-operations strategy that combines state-of-the-art machine learning approaches with established physical modeling frameworks. This presentation summarizes the progress achieved to date.

We first describe the development of GEML (Global Environmental eMuLator), a global AI forecast model based on Google DeepMind’s GraphCast, trained and fine-tuned in-house using ERA5 reanalysis and ECMWF operational analyses. Building on GEML, ECCC has implemented an experimental hybrid AI–NWP global forecasting system, GDPS-SN, which applies large-scale spectral nudging to improve the operational Global Deterministic Prediction System (GDPS) by leveraging the large-scale accuracy of GEML.
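
A schematic of large-scale spectral nudging, assuming a doubly periodic grid and a sharp wavenumber cutoff: low-wavenumber components of the NWP state are relaxed toward the AI model's large scales while small scales are left untouched. This is an illustration, not the GDPS-SN implementation.

```python
import numpy as np

def spectral_nudge(nwp_field, ai_field, k_cut=8, alpha=1.0):
    """2-D fields on a periodic grid; nudge wavenumbers below k_cut toward the AI state."""
    F_nwp, F_ai = np.fft.fft2(nwp_field), np.fft.fft2(ai_field)
    kx = np.fft.fftfreq(nwp_field.shape[0]) * nwp_field.shape[0]
    ky = np.fft.fftfreq(nwp_field.shape[1]) * nwp_field.shape[1]
    large = (np.abs(kx)[:, None] < k_cut) & (np.abs(ky)[None, :] < k_cut)
    F_nwp[large] = (1 - alpha) * F_nwp[large] + alpha * F_ai[large]
    return np.real(np.fft.ifft2(F_nwp))

nudged = spectral_nudge(np.random.randn(128, 128), np.random.randn(128, 128))
```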

The presentation also introduces PARADIS, a fully Canadian, physically inspired, AI-based weather forecast model developed by ECCC and its partners. These activities illustrate ECCC’s strategic vision for AI-enabled weather prediction, combining scientific rigor, collaboration, and operational relevance to deliver more accurate forecasting systems.

 

How to cite: Diaconescu, E., Caron, J.-F., Dallerit, V., Gaudreault, S., Husain, S., Panday, S., Pereira Frontado, C., Separovic, L., Subich, C., Wei, S., and Zhang, S.: Bridging Physics and Machine Learning to Enhance Weather Forecasting at ECCC, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-7801, https://doi.org/10.5194/egusphere-egu26-7801, 2026.

EGU26-8656 | Posters on site | AS5.1

Harnessing Data-Driven Weather Prediction (DWP) Model for Climate Modeling 

Chia-Ying Tu, Yu-Chi Wang, Chung-Cheh Chou, and Zheng-Yu Yan

Recent advancements in AI/ML-based Data-Driven Weather Prediction (DWP) have revolutionized meteorological forecasting. By leveraging deep learning architectures trained on the ECMWF ERA5 reanalysis, DWP models can iteratively predict atmospheric states with accuracy comparable to traditional Numerical Weather Prediction (NWP) while requiring orders of magnitude less computational power. However, DWP’s reliance on historical training data poses challenges for climate-scale simulations, particularly in representing evolving phenomena influenced by non-stationary climate change. This study investigates the applicability of the GraphCast DWP model for climate research, specifically focusing on its potential for global climate downscaling and bias correction.

To evaluate performance across varying initial conditions, we conducted three distinct 72-hour GraphCast integration experiments. The first experiment utilized high-resolution (0.25°) ERA5 data from 2000–2010 to assess model reproducibility (H-ERA5), while the second experiment employed low-resolution (1.0°) ERA5 data to quantify sensitivity to initial horizontal grid spacing (L-ERA5). In the third experiment, we utilized 36 years (1979–2014) of HiRAM climate simulations as initial conditions to evaluate a novel DWP-based climate modeling framework (GC-HiRAM).

Results from the H-ERA5 and L-ERA5 experiments demonstrate that GraphCast effectively reproduces the climate mean state and variance of the ERA5 dataset. However, both experiments exhibited an underestimation of tropical cyclone (TC) frequency and intensity, consistent with known TC climatology biases in ERA5. Notably, the GC-HiRAM experiment closely aligned with the mean states and long-term trends of the original HiRAM simulations while yielding precipitation and surface temperature variances comparable to ERA5. Interestingly, the inherent TC underestimation in GraphCast served as a functional bias correction for HiRAM, which traditionally overestimates TC frequency, thereby improving overall simulation skill. Our findings suggest that this innovative DWP-driven approach provides a computationally efficient and robust framework for global climate modeling, effectively capturing essential climate phenomena while introducing a viable pathway for high-resolution climate downscaling and ensemble simulations.

How to cite: Tu, C.-Y., Wang, Y.-C., Chou, C.-C., and Yan, Z.-Y.: Harnessing Data-Driven Weather Prediction (DWP) Model for Climate Modeling, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-8656, https://doi.org/10.5194/egusphere-egu26-8656, 2026.

Recently developed AI weather models have been widely recognized for revolutionizing weather prediction, producing forecasts more skillful than traditional models at a fraction of the computational cost. Here I will argue that the next phase of the revolution involves the adjoints of these models, applied to a wide range of problems, including novel exploration of dynamical processes in weather and climate variability, extreme events, and new data assimilation systems. Adjoints are derived from gradient operations on the forward model and are useful for measuring the sensitivity of model outputs to inputs and parameters. Historically, adjoints have been derived for a limited set of traditional models and mainly applied to problems in data assimilation. The ubiquitous availability of adjoints for AI models makes these tools easily accessible and available for a much wider range of applications. Specific examples I will discuss include shadowing trajectories for predictability, "gray swans" and a factory for out-of-sample extreme events, and mechanistic interpretability of specific phenomena.
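
As a minimal illustration of the idea, automatic differentiation yields the adjoint sensitivity of a scalar forecast diagnostic with respect to the initial state; the toy forward model below stands in for a differentiable AI weather model.

```python
import torch

def model_step(x):                      # stand-in for one step of a differentiable model
    return torch.tanh(x @ torch.eye(x.shape[-1]))

x0 = torch.randn(256, requires_grad=True)
x = x0
for _ in range(10):                     # 10-step autoregressive rollout
    x = model_step(x)
diagnostic = x[:32].mean()              # e.g. an area-mean temperature at the final lead
(sensitivity,) = torch.autograd.grad(diagnostic, x0)   # adjoint sensitivity w.r.t. x0
```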

How to cite: Hakim, G.: Using Adjoints of AI-based Weather Models to Study Predictability and Extreme Events, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-8870, https://doi.org/10.5194/egusphere-egu26-8870, 2026.

EGU26-9387 | Orals | AS5.1

Evaluating emergent climate behaviour in a hybrid machine learned atmosphere -- dynamical ocean model 

Hannah Christensen, Bobby Antonio, and Kristian Strommen

Understanding how fast atmospheric variability shapes slow climate variability and sensitivity is a central challenge in Earth-system science. Recent advances in machine-learned (ML) atmospheric models have demonstrated remarkable skill on weather timescales, but their emergent behaviour in a fully coupled climate system is largely unexplored. We present results from a new hybrid modelling framework that couples a machine-learned atmosphere to a dynamical ocean model. We report on a set of 70-year coupled simulations (1950–2020 historical forcing and a fixed-1950s control) in which the ACE2 ML climate emulator is interactively coupled to the NEMO ocean model. These experiments represent, to our knowledge, the first multi-decadal integrations of a machine-learned atmosphere interacting with a full-depth dynamical ocean. We assess the behaviour of the coupled system, with particular focus on low-frequency tropical variability and the climate response to greenhouse-gas forcing. Preliminary results indicate realistic emergent El Niño-like variability and a physically plausible climate sensitivity, suggesting that key atmosphere–ocean feedbacks can be captured within a hybrid ML–dynamical framework. These results help evaluate the possible role of entirely machine-learned components in next-generation Earth-system models.

How to cite: Christensen, H., Antonio, B., and Strommen, K.: Evaluating emergent climate behaviour in a hybrid machine learned atmosphere -- dynamical ocean model, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-9387, https://doi.org/10.5194/egusphere-egu26-9387, 2026.

EGU26-9811 | ECS | Posters on site | AS5.1

Explaining neural networks for detection of atmospheric features in gridded data 

Tim Radke, Susanne Fuchs, Iuliia Polkova, Christian Wilms, Johanna Baehr, and Marc Rautenhaus

Detection of atmospheric features in gridded datasets is typically done by means of rule-based algorithms. Recently, the feasibility of learning feature detection tasks using supervised learning with convolutional neural networks (CNNs) has been demonstrated. This approach corresponds to semantic segmentation tasks widely investigated in computer vision. However, while in recent studies the performance of CNNs was shown to be comparable to human experts, CNNs are largely treated as a “black box”, and it remains unclear whether they learn the features for physically plausible reasons. Here, we build on recently published studies that discuss datasets containing features of tropical cyclones (TCs), atmospheric rivers (ARs), and atmospheric surface fronts (SFs) as detected by human experts. We adapt the explainable artificial intelligence technique “Layer-wise Relevance Propagation” to the semantic segmentation task and investigate which input information CNNs with the Context-Guided Network (CGNet) and U-Net architectures use for feature detection. We find that for the detection of TCs and ARs, both CNNs indeed consider plausible patterns in the input fields of atmospheric variables. For instance, relevant patterns include point-shaped extrema in vertically integrated precipitable water (TMQ) and circular wind motion for TCs. For ARs, relevant patterns include elongated bands of high TMQ and eastward winds. Such results help to build trust in the CNN approach. In contrast, for the detection of SFs, we find only partially physically plausible patterns. While U-Net uses regions of changing temperature and humidity as well as strong wind shear to detect SFs, we also find noisy patterns relating to spurious correlations with the background data. To assess whether these implausible patterns reduce U-Net's generalizability, we evaluate it on a different SF dataset. Here, depending on the domain, SFs are often erroneously detected, especially in the Tropics and Arctic, highlighting the importance of analyzing whether patterns learned by a CNN are physically plausible. We also demonstrate the application of the approach to finding the most relevant input variables and to evaluating detection robustness when changing the input domain.

How to cite: Radke, T., Fuchs, S., Polkova, I., Wilms, C., Baehr, J., and Rautenhaus, M.: Explaining neural networks for detection of atmospheric features in gridded data, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-9811, https://doi.org/10.5194/egusphere-egu26-9811, 2026.

EGU26-12464 | Posters on site | AS5.1

Attribution of convective rainfall events using AI-downscaling – how extreme can we go? 

Georgie Logan, Daniel Cotterill, Mark McCarthy, Andrew Ciavarella, Henry Addison, Peter Watson, and Tomas Wetherell

Probabilistic attribution of extreme events requires large-ensemble climate model simulations, for both present and counterfactual climates, to adequately capture the tails of the distribution. Accurately modelling rainfall extremes, particularly those involving convection, or rainfall over regions with complex topography, requires high-resolution climate models. High-resolution climate data is particularly important for impact attribution to simulate realistic flood inundation as input to flood models.

Large ensembles of climate model runs for pre-industrial climates do not currently exist at convection-permitting resolution, as conventional convection-permitting models are computationally expensive to run. Therefore, attribution studies on extreme localised convective rainfall events are limited, despite the large impacts these events have on society.

To address this, we create a convection-permitting-resolution, large-ensemble dataset for England and Wales using a generative AI approach to downscale a pre-existing large ensemble of attribution runs from the HadGEM3 climate model. We use the diffusion model CPMGEM from Addison et al. (2025), which is trained and tested on the convection-permitting-resolution UK Climate Projections (UKCP) Local data and enables stochastic generation of multiple high-resolution precipitation samples per coarse model input. This process is relatively computationally cheap and enables creation of a high-resolution dataset that is larger than the input dataset.

We first investigate the ability of CPMGEM to be applied to a different configuration of the model it was trained on, and on an alternative set of counterfactuals. We also explore its ability to conserve climate trends and reproduce realistic values for the extremes.

We then assess the validity of using the downscaled dataset for attribution studies. If suitable, we will revisit a number of relevant attribution studies of extreme rainfall events and compare the original results from the coarse climate model HadGEM3-A to our new results using the high-resolution downscaled CPMGEM output. Overall, this could significantly extend the capability to attribute localised extreme rainfall events.

How to cite: Logan, G., Cotterill, D., McCarthy, M., Ciavarella, A., Addison, H., Watson, P., and Wetherell, T.: Attribution of convective rainfall events using AI-downscaling – how extreme can we go?, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-12464, https://doi.org/10.5194/egusphere-egu26-12464, 2026.

EGU26-13814 | ECS | Posters on site | AS5.1

Statistical Calibration of ArchesWeatherGen for Enhanced Sub-Seasonal and Longer Predictions 

Robert Brunstein and Christian Lessig

The capabilities and skill of emerging data-driven weather forecasting and climate models are steadily increasing, and significant progress has been made in their quality in recent years. Data-driven weather forecasting models predict the state of the atmosphere for a single step, e.g. 6 h. Longer lead times are obtained by time-stepping, where predictions are fed back into the model for the next step. Although many models exhibit stable behaviour for long rollouts, the training only considers short trajectories. The trained models are therefore statistically not well calibrated at longer lead times, and for phenomena like blocking patterns or teleconnections, which occur on time scales longer than a few days, the predictions are poorly constrained by the training. To address this issue, the training of data-driven models needs to consider information about the atmospheric conditions from several days up to several weeks.

We approach this problem using ArchesWeather and ArchesWeatherGen. ArchesWeather provides a deterministic prediction of the next state of the atmosphere. ArchesWeatherGen, a probabilistic flow-matching model, corrects the deterministic prediction to obtain a probabilistic prediction that matches the ground truth state. We tackle the long-lead-time calibration problem by applying ArchesWeatherGen after a large number of deterministic forecasting steps, in contrast to the single step used in medium-range weather forecasting. We therefore condition ArchesWeatherGen on an entire long forecast trajectory produced by the deterministic model. Through this, ArchesWeatherGen obtains more temporal information about the atmosphere as well as the error development, and can explicitly learn longer-time correlation patterns in the atmospheric dynamics. This leads to a better-calibrated model at longer lead times. It also reduces the number of diffusion steps, and hence the computational cost, as we only correct the mean prediction after a larger number of deterministic autoregressive forecasting steps. For our study, we examine the influence of the length of the input trajectory and evaluate the improvement of our approach compared to results obtained with a single-step model correction.
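
A schematic of the proposed calibration loop, with hypothetical stand-ins for the deterministic model and the generative corrector (these are not the ArchesWeather/ArchesWeatherGen interfaces): roll out cheaply for N steps, then apply one probabilistic correction conditioned on the whole trajectory.

```python
import torch

def rollout_then_correct(x0, deterministic_step, generative_correct, n_steps=28):
    traj = [x0]
    for _ in range(n_steps):                 # cheap deterministic rollout (e.g. 6 h steps)
        traj.append(deterministic_step(traj[-1]))
    trajectory = torch.stack(traj)           # condition on the full history, not one state
    return generative_correct(trajectory)    # single probabilistic correction at long lead

x0 = torch.randn(1, 8, 32, 64)
step = lambda x: 0.99 * x + 0.01 * torch.randn_like(x)       # toy deterministic model
correct = lambda tr: tr[-1] + 0.05 * torch.randn_like(tr[-1])  # toy generative corrector
sample = rollout_then_correct(x0, step, correct)
```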

How to cite: Brunstein, R. and Lessig, C.: Statistical Calibration of ArchesWeatherGen for Enhanced Sub-Seasonal and Longer Predictions, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-13814, https://doi.org/10.5194/egusphere-egu26-13814, 2026.

EGU26-15037 | ECS | Orals | AS5.1

Evaluating ArchesWeather and ArchesWeatherGen under Multi-Decadal AMIP-style climate simulations 

Renu Singh, Robert Brunstein, Antonia Anna Jost, Yana Hasson, Guillaume Couairon, Christian Lessig, and Claire Monteleoni

The last five years have seen an AI revolution in weather forecasting, with data-driven models trained on ERA5 (such as Pangu-Weather and GraphCast) surpassing the skill of numerical models at a fraction of the computational cost. Furthermore, stochastic modeling approaches are now state-of-the-art, as they can model the uncertainty in the dynamics of the earth system (GenCast, FGN). Similarly, there have been recent advances in long-term climate emulation using data-driven methods, although they either use deterministic models (ACE2, Lucie) or are trained on simulated climate data from physical models (ArchesClimate). Here, we evaluate a stochastic modeling approach, ArchesWeatherGen, on historical climate timescales (the last 40 years) and its response to ocean forcings in an AMIP run setup (atmospheric model forced with sea surface temperature and sea ice). These simulations contribute to AIMIP (AI Model Intercomparison Project), an initiative to organize and compare the current state-of-the-art AI climate models.

ArchesWeather and ArchesWeatherGen are efficient data-driven models built for medium-range weather forecasting. ArchesWeather is a deterministic transformer-based model and ArchesWeatherGen is a probabilistic generative model based on flow matching, with the same transformer backbone, that corrects the deterministic model prediction and accounts for variability in the time evolution.

In adherence to the AIMIP Stage 1 protocol, we adapt the models to serve as an atmospheric climate model for AMIP climate simulations on the historical period of 1979-2024. ArchesWeather and ArchesWeatherGen are extended to take into account monthly mean forcings for sea surface temperature (SST) and sea ice cover computed from ERA5. These models are trained on daily averaged 1-degree ERA5 data and they predict the state of the atmosphere at a forecast lead time of 24 hours given initial conditions.

We examine the ability of both models to stably emulate the current climate by quantitatively and qualitatively comparing them to the ERA5 climatology. Our results show that the models are able to emulate the current climate faithfully and reproduce many teleconnections as well as modes of annular variability correctly. We ablate different model configurations against each other and investigate the influence of the residual predictions of ArchesWeatherGen on the quality of the climate simulations compared to the deterministic predictions of ArchesWeather. We also analyse the models' capability to reproduce extreme weather statistics. Lastly, we examine the models’ response to forcings by evaluating the stability, trend, and physical correlations when running the model in different forcing scenarios, such as no forcings, annually repeating forcings, and increased SST.

How to cite: Singh, R., Brunstein, R., Jost, A. A., Hasson, Y., Couairon, G., Lessig, C., and Monteleoni, C.: Evaluating ArchesWeather and ArchesWeatherGen under Multi-Decadal AMIP-style climate simulations, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-15037, https://doi.org/10.5194/egusphere-egu26-15037, 2026.

EGU26-15189 | ECS | Posters on site | AS5.1

How can AI tools be used to explore unprecedented future climate and weather extremes? 

Tom Wood and Tom Matthews

This study addresses recent calls for greater focus on understanding unprecedented extreme events (e.g. Kelder et al., 2025; Matthews et al., in review) by exploring the potential to use downscaled ‘synthetic data’ from climate model projections to train cutting-edge, computationally efficient deep learning models and generate very large ensembles of high-resolution extreme weather events under future perturbed climates. The study seeks to advance understanding of plausible upper limits in extreme high-impact, low-likelihood (HILL), record-shattering extremes and unprecedented tail risks, focusing initially on the threat of uncompensable heat with the potential to result in catastrophic mass mortality impacts. We address a number of open questions in this nascent field by testing a set of recently developed tools in new and innovative ways to understand the benefits and limitations of this approach. 

Can we generate new insights beyond what can be achieved using traditional methods, such as large ensembles of physics-based models and advances such as ensemble boosting? What are the benefits of producing very large stochastic ensembles of plausible extreme weather systems and how does this complement (or otherwise) other approaches with similar motivations (e.g. emulators)? Can we identify and validate plausible physical climate storylines leading to unprecedented extreme events e.g., by identifying and clustering meteorological setups leading to very large, compound, or concurrent non-contiguous regional extremes? Can we robustly constrain this method to ensure physical plausibility in unprecedented climates? Can we advance understanding of rare event probability under a non-stationary climate from various emissions pathways? What are the limitations due to aleatoric and epistemic uncertainty? How do we mitigate biases and limit their propagation? Can we investigate downward counterfactuals and identify meteorological conditions aligning with imagined worst-case scenarios?

By addressing these questions, this study seeks to advance knowledge of the threats posed by the most extreme plausible weather events posing potentially catastrophic risks to society.

How to cite: Wood, T. and Matthews, T.: How can AI tools be used to explore unprecedented future climate and weather extremes?, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-15189, https://doi.org/10.5194/egusphere-egu26-15189, 2026.

The Western North Pacific Subtropical High (WNPSH) is one of the dominant subtropical anticyclonic circulations over the western North Pacific during boreal summer, strongly influencing East Asian extremes such as tropical cyclone tracks, heatwaves, and the Baiu/Meiyu front. WNPSH variability reflects both midlatitude teleconnections and tropical intraseasonal oscillations (BSISO). Therefore, to clarify predictability, it is essential to identify and quantify how individual events contribute to forecast skill and uncertainty.

We develop a probabilistic deep learning framework to predict a WNPSH index with explicit uncertainty, represented as Gaussian regression outputs (μ, σ), and assess its predictability up to a 1-month lead. We adopt a model that combines a three-dimensional convolutional neural network with self-attention. To capture diverse representations, we pretrain the model using a millennial-scale ensemble dataset from d4PDF and then fine-tune it with the ERA5 reanalysis. As a result, the prediction skill reaches ACC = 0.6 at 10-day lead time. With deep learning models, the prediction problem can be formulated as an explainable AI (XAI) task, in which precursor signals relevant to the forecast can be estimated directly from spatial patterns and input variables (Maeda and Satoh, 2025). Here, we analyze the predictability using a combination of XAI and the concept of windows of opportunity. During opportunity events, forecast skill improves to about a 15-day lead time. Clear precursor patterns emerge in the initial conditions, including signatures of intraseasonal oscillations and midlatitude wave trains. These signals are consistent with heatmap-based interpretations from XAI, providing quantitative statistics on the sources of predictability for prominent events.
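
The Gaussian (μ, σ) output described corresponds to training with a Gaussian negative log-likelihood; a minimal sketch follows, with a stub network in place of the 3D-CNN-with-self-attention backbone.

```python
import torch
import torch.nn as nn

net = nn.Sequential(nn.Linear(16, 64), nn.ReLU(), nn.Linear(64, 2))  # -> (mu, raw_sigma)

def gaussian_nll(pred, y):
    mu, raw_sigma = pred[:, 0], pred[:, 1]
    sigma = nn.functional.softplus(raw_sigma) + 1e-6   # keep sigma strictly positive
    return (torch.log(sigma) + 0.5 * ((y - mu) / sigma) ** 2).mean()

x, y = torch.randn(32, 16), torch.randn(32)            # toy predictors and WNPSH index
loss = gaussian_nll(net(x), y)
loss.backward()
```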

How to cite: Maeda, Y. and Satoh, M.: Probabilistic Deep Learning Identifies Windows of Opportunity and Precursors for Western North Pacific Subtropical High Prediction, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-16518, https://doi.org/10.5194/egusphere-egu26-16518, 2026.

EGU26-16579 | ECS | Posters on site | AS5.1

Data-driven global ocean model resolving atmospherically forced ocean dynamics 

Jeong-Hwan Kim, Daehyun Kang, Young-Min Yang, Jae-Heung Park, and Yoo-Geun Ham

Artificial intelligence has advanced global weather forecasting, outperforming traditional numerical models in both accuracy and computational efficiency. Nevertheless, extending predictions beyond subseasonal timescales requires the development of deep learning (DL)–based ocean–atmosphere coupled models that can realistically simulate complex oceanic responses to atmospheric forcing. This study presents KIST-Ocean, a DL-based global three-dimensional ocean general circulation model. Comprehensive evaluations confirmed the model’s robust ocean predictive skill and efficiency. Moreover, it accurately reproduces realistic ocean responses, such as Kelvin and Rossby wave propagation, and vertical motions induced by rotational wind stress, demonstrating its ability to represent key ocean–atmosphere interactions underlying climate phenomena, including the El Niño–Southern Oscillation. These findings reinforce confidence in DL-based global weather and climate models by demonstrating their capacity to capture essential ocean-atmosphere relationships. Building upon this foundation, the present study paves the way for extending DL-based modeling frameworks toward integrated Earth system simulations, thereby offering substantial potential for advancing long-range climate prediction capabilities.

How to cite: Kim, J.-H., Kang, D., Yang, Y.-M., Park, J.-H., and Ham, Y.-G.: Data-driven global ocean model resolving atmospherically forced ocean dynamics, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-16579, https://doi.org/10.5194/egusphere-egu26-16579, 2026.

EGU26-16636 | Posters on site | AS5.1

How can climate model emulators be aligned more closely with the needs of applied researchers? 

Nina Effenberger and Luca Schmidt

Earth System Models (ESMs) represent our most comprehensive tools for understanding and projecting climate change impacts; yet, they are highly computationally demanding and technically complex. Climate model emulators offer an alternative approach by approximating components or full ESM outputs at a reduced computational cost. Such emulators can range from reduced-order climate models to fully data-driven machine learning surrogates. As the demand for climate information increases, interest in climate model emulation has grown across both climate science and machine learning research, leading to rapid methodological development. Despite this shared interest, the two research fields remain largely disconnected, and the application of machine learning climate emulators in climate science remains challenging [1]. Many emulators therefore remain unused in decision-making contexts, not because they lack value, but because methodological developers and users lack a shared framework for communication, evaluation, and practical guidance.

This work examines this disconnect and takes a step towards facilitating the use of machine learning–based climate emulators in applied research and decision-making. We analyze and contrast methodological and applied perspectives on emulators, identify points of misalignment, and highlight opportunities for improved interaction. Building on these insights, we propose a tutorial-style framework that connects the two perspectives and provides practical guidance for developing, evaluating, and using climate emulators in research and decision-making contexts.

[1] Fowler, H. J., Mearns, L. O., and Wilby, R. L. (2025). Downscaling future climate projections: Compounding uncertainty but adding value?, in ‘Uncertainty in Climate Change Research: An Integrated Approach’, Springer, pp. 185–197.

How to cite: Effenberger, N. and Schmidt, L.: How can climate model emulators be aligned more closely with the needs of applied researchers?, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-16636, https://doi.org/10.5194/egusphere-egu26-16636, 2026.

EGU26-17080 | Posters on site | AS5.1

Deep learning-Based Global Ocean prediction model on the HEALPix Mesh 

Seonyu Kang, Yoo-Geun Ham, and Dongjin Cho

While deep learning-based atmospheric models have been actively developed, the development of ocean prediction models that allow multi-decade simulations through autoregressive operation has been largely limited. This study developed a deep learning-based global ocean prediction model on the HEALPix grid system that is capable of multi-decade integration at a daily time step and successfully reproduces the observed global ocean statistics. Model training uses Fourier amplitude and phase losses to preserve low-frequency spatial structure and phase consistency, a batch anomaly loss to learn anomalous variability, and sequentially ingests past-to-present atmospheric forcing to enable physically consistent coupled atmosphere–ocean dynamics in long-term integration. Long-term ocean model integration experiments with observed atmospheric forcing demonstrate a drift-free, stable climatology over 20-year simulations, with realistic Niño3.4 variations and ENSO-related global oceanic anomaly patterns consistent with observations. Furthermore, oceanic subsurface temperature responses to westerly wind bursts (WWBs) over the equatorial western Pacific successfully capture the eastward propagation associated with oceanic Kelvin waves.
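
A hedged sketch of the Fourier amplitude and phase losses mentioned; the relative weighting and the cosine form of the phase term are assumptions, not the study's exact formulation.

```python
import torch

def fourier_amp_phase_loss(pred, target, w_amp=1.0, w_phase=0.1):
    """pred, target: (B, C, H, W) ocean fields; losses computed in spectral space."""
    P, T = torch.fft.rfft2(pred), torch.fft.rfft2(target)
    amp_loss = ((P.abs() - T.abs()) ** 2).mean()                  # amplitude mismatch
    phase_loss = (1.0 - torch.cos(P.angle() - T.angle())).mean()  # phase mismatch
    return w_amp * amp_loss + w_phase * phase_loss

loss = fourier_amp_phase_loss(torch.randn(2, 3, 64, 128), torch.randn(2, 3, 64, 128))
```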

How to cite: Kang, S., Ham, Y.-G., and Cho, D.: Deep learning-Based Global Ocean prediction model on the HEALPix Mesh, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-17080, https://doi.org/10.5194/egusphere-egu26-17080, 2026.

EGU26-17113 | ECS | Posters on site | AS5.1

Evaluating machine learning approaches to improve observational daily precipitation datasets 

Skye Williams-Kelly, Lisa Alexander, Steefan Contractor, and Sahani Pathiraja

Accurate precipitation predictions are vital for water resource management and risk mitigation. Interpolated precipitation estimates derived from in situ observations are frequently used to evaluate climate models and analyse trends. However, these products inadequately represent the spatio-temporal characteristics of precipitation and significantly smooth out extremes, inhibiting effective evaluation of dynamical models and analysis of trends. Machine learning methods may be suited to addressing these limitations due to their ability to identify patterns in large datasets and their use of GPU acceleration. We therefore compare three ML-based approaches for improving observational daily precipitation datasets: Gaussian Processes, Bayesian Neural Fields, and Neural Processes. Their performance is evaluated using traditional and distributional metrics, including out-of-sample prediction, enabling an objective assessment of generalisation skill and representation of extremes. Results are further compared against existing precipitation products to identify the relative strengths and limitations of each method.
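
As a baseline sketch of the first of the three approaches: Gaussian-process interpolation of daily gauge totals onto a grid with scikit-learn. The kernel and toy data are illustrative, not the study's configuration.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

rng = np.random.default_rng(1)
stations = rng.uniform(0, 1, size=(60, 2))   # toy lon/lat of rain gauges
obs = rng.gamma(2.0, 3.0, size=60)           # toy daily precipitation totals (mm)

gp = GaussianProcessRegressor(kernel=RBF(0.2) + WhiteKernel(1.0), normalize_y=True)
gp.fit(stations, obs)

# Predict onto a regular grid; the returned std gives a per-cell uncertainty estimate.
gx, gy = np.meshgrid(np.linspace(0, 1, 50), np.linspace(0, 1, 50))
mean, std = gp.predict(np.column_stack([gx.ravel(), gy.ravel()]), return_std=True)
```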

How to cite: Williams-Kelly, S., Alexander, L., Contractor, S., and Pathiraja, S.: Evaluating machine learning approaches to improve observational daily precipitation datasets, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-17113, https://doi.org/10.5194/egusphere-egu26-17113, 2026.

EGU26-17600 | Orals | AS5.1

Rare event simulations, emulators, machine learning, and Bayesian GEV estimation, for predicting extreme heat waves and extremes of renewable electricity production 

Freddy Bouchet, Dorian Abbot, Laurent Dubus, Pedram Hassanzadeh, Amaury Lancelin, Jonathan Weare, Peter Werner, and Alexander Wikner

In the climate system, extreme events and tipping points (transitions between climate attractors) are of primary importance for understanding the impacts of climate change and for designing effective adaptation and mitigation strategies. Recent extreme heat waves with severe societal consequences, as well as prolonged periods of very low renewable energy production in electricity systems, are striking examples. A key challenge in studying such phenomena is the lack of available data: these events are inherently rare, and realistic climate models are computationally expensive and highly complex. This data scarcity severely limits the applicability of traditional approaches, whether based on modelling, physics, or statistical analysis.

In this talk, I will present new algorithms and theoretical approaches based on rare-event simulations, climate-model emulators, machine-learning methods for stochastic processes, and an up-to-date blend of data and model output to estimate generalized extreme value (GEV) distributions. These methods are specifically designed to predict the probability that an extremely rare event will occur, to produce large catalogues of dynamical trajectories leading to the event, and to make the best use of available historical and model data. The rare-event simulation/emulator approach combines, on the one hand, state-of-the-art AI-based emulators that reproduce the full atmospheric dynamics of climate models and, on the other hand, rare-event simulation techniques that reduce by several orders of magnitude the computational cost of sampling extremely rare events. In parallel, the Bayesian GEV approach mixes information from historical observations and CMIP model output to produce the best possible estimate of extreme event probabilities.
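
As a minimal illustration of the GEV building block, the sketch below fits a GEV to synthetic block maxima by maximum likelihood (scipy); the Bayesian blending of observations and CMIP output described above is not reproduced here.

import numpy as np
from scipy import stats

# Synthetic block maxima standing in for, e.g., annual maximum temperatures.
annual_max = stats.genextreme.rvs(c=-0.1, loc=30.0, scale=1.5, size=60,
                                  random_state=np.random.default_rng(1))

# Maximum-likelihood GEV fit; note scipy's shape c is minus the usual xi.
c, loc, scale = stats.genextreme.fit(annual_max)

# Exceedance probability of a threshold and the 100-year return level.
p_exceed = stats.genextreme.sf(34.0, c, loc=loc, scale=scale)
level_100 = stats.genextreme.isf(1.0 / 100.0, c, loc=loc, scale=scale)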

To illustrate the performance of these tools, I will present results on midlatitude extreme heat waves and on extremes of renewable energy production, with a particular focus on their implications for the resilience of electricity systems.

How to cite: Bouchet, F., Abbot, D., Dubus, L., Hassanzadeh, P., Lancelin, A., Weare, J., Werner, P., and Wikner, A.: Rare event simulations, emulators, machine learning, and Bayesian GEV estimation, for predicting extreme heat waves and extremes of renewable electricity production, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-17600, https://doi.org/10.5194/egusphere-egu26-17600, 2026.

EGU26-18038 | ECS | Posters on site | AS5.1

Architectural Sensitivity of AI Weather Prediction Models to 3D Structural and Seasonal Climate Forcing 

Mozhgan Amiramjadi, Christopher Roth, and Peer Nowack

Data-driven weather prediction models have demonstrated remarkable skill, yet their ability to maintain a physically consistent three-dimensional atmospheric structure under out-of-distribution (OOD) conditions remains poorly understood. If OOD performance criteria could be met approximately, AI models would open up entirely new possibilities to generate large AI weather ensembles under future climate scenarios—for example, if initialized from climate model simulations (Rackow et al., 2024). This study conducts a multi-scale diagnostic evaluation of four state-of-the-art models—NeuralGCM (a deterministic hybrid model), GraphCast (a deterministic graph neural-network model), AIFS (a deterministic transformer-based model), and GenCast (an ensemble generative and diffusion-based model)—initialized across three distinct climate states: 1955 (cold) and 2023 (neutral), both sourced from ERA5 reanalysis, and 2049 (warm), simulated by the nextGEMS climate model (Segura et al., 2025).

Over 1–10-day leads, we find no detectable resolution-dependence for NeuralGCM's global skill, though the 1.4° configuration minimizes mean drift. A dominant spatial signature emerges across all models: a robust land–ocean contrast where oceans maintain smaller biases and slower Anomaly Correlation Coefficient (ACC) decay. Cross-hemispheric skill comparisons reveal that this contrast drives a significant asymmetry in error characteristics. In the 2049 warming scenario, the land-heavy Northern Hemisphere (NH, 39% land coverage) is the primary site of GraphCast's systematic "cool-drift" toward its training distribution, which peaks during boreal summer (JJA). In contrast, the generative GenCast model develops a pronounced warm bias localized in the oceanic Southern Hemisphere (SH, with about 20% land coverage).

For all three climate states, we further evaluate model performance across the entire troposphere and, as far as available, the stratosphere. While all four models maintain a high explained variance in the present-day mid-troposphere, performance degrades non-linearly under OOD forcing elsewhere, particularly within the stratosphere (< 200 hPa) and the boundary layer (> 900 hPa). Latitudinal R²-score cross-sections reveal that this degradation is most severe at polar latitudes; notably, in the 2049 scenario, GenCast exhibits a near-total collapse of skill by day 10, whereas NeuralGCM and GraphCast maintain localized predictive skill within the tropical troposphere.

The architecture-dependence of these simulated ensembles is confirmed by projecting day-10 drifts onto inter-climate "fingerprints" (T2049 - T2023 and T1955 - T2023). While AIFS and NeuralGCM show superior stability, GraphCast exhibits a systematic "cool-drift" toward its training climatology, and GenCast develops a distinct warm ocean drift. Beyond evaluating skill in surface variables, our results underline the need to assess data-driven models comprehensively across vertical, hemispheric, and seasonal diagnostics when applied to climate science scenarios, with implications for future AI model development.
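
A minimal sketch of the fingerprint projection underlying this diagnostic, assuming an area-weighted least-squares projection of a drift map onto a climate-difference pattern (our normalization, not necessarily the authors'):

import numpy as np

def project_on_fingerprint(drift, fingerprint, lat):
    # drift, fingerprint: (lat, lon) maps; lat: latitudes in degrees.
    # Area weights proportional to cos(latitude).
    w = np.cos(np.deg2rad(lat))[:, None] * np.ones_like(drift)
    # Least-squares amplitude of the fingerprint contained in the drift.
    return np.sum(w * drift * fingerprint) / np.sum(w * fingerprint ** 2)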

References:

Rackow, T., et al. (2024). Robustness of AI-based weather forecasts in a changing climate. arXiv preprint arXiv:2409.18529. https://doi.org/10.48550/arXiv.2409.18529

Segura, H., et al. (2025). nextGEMS: entering the era of kilometer-scale Earth system modeling. Geosci. Model Dev., 18, 7735–7761. https://doi.org/10.5194/gmd-18-7735-2025

How to cite: Amiramjadi, M., Roth, C., and Nowack, P.: Architectural Sensitivity of AI Weather Prediction Models to 3D Structural and Seasonal Climate Forcing, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-18038, https://doi.org/10.5194/egusphere-egu26-18038, 2026.

EGU26-18557 | ECS | Posters on site | AS5.1

Bias-Correcting Arctic ERA5 Surface Air Temperatures using Deep Learning  

Sabine Scholle and Felix Pithan

Fine-tuning AtmoRep, a climate dynamics foundational model for improved Arctic 2m temperature predictions 

Due to the Arctic's harsh environment, comprehensive observational networks remain incomplete, leading to a reliance on biased reanalysis datasets such as ERA5 [1]. This study investigates the potential of fine-tuning AtmoRep, a pre-trained transformer model for global atmospheric dynamics [2], to improve bias correction of Arctic 2-meter temperature (t2m) predictions.

Our methodology involves fine-tuning AtmoRep using ERA5 fields as input and bias-corrected synthetic Arctic t2m data from a parallel project as the target [3]. The project goal is to leverage AtmoRep's global climate representations to further improve the bias-corrected synthetic Arctic t2m data, given ERA5 as input (evaluated against observational data).

Preliminary results demonstrate stable validation performance of AtmoRep over the Arctic, achieving a t2m RMSE of 0.27 K during fine-tuning. Model robustness was further evaluated under severely masked target fields (up to 90% masking) and by comparing BERT-style reconstruction with a forecasting-based training strategy.
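
A minimal sketch of the masked-target training idea, assuming a generic image-to-image model and a uniform random mask (the actual AtmoRep fine-tuning setup is considerably more involved):

import torch

def masked_target_loss(model, era5_in, target_t2m, mask_frac=0.9):
    # era5_in: (B, C, H, W) input fields; target_t2m: (B, H, W);
    # model is assumed to return a (B, H, W) t2m prediction.
    pred = model(era5_in)
    # Randomly mask up to mask_frac of the target pixels and evaluate the
    # loss only there, so the model must reconstruct t2m from context.
    mask = torch.rand_like(target_t2m) < mask_frac
    return torch.mean((pred[mask] - target_t2m[mask]) ** 2)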

This study represents a novel application of pretrained climate foundation models for bias correction in sparsely observed Arctic regions, highlighting the potential of machine learning approaches to advance atmospheric science.

  • [1] Tian, T., Yang, S., Høyer, J. L., Nielsen-Englyst, P., & Singha, S. (2024). Cooler Arctic surface temperatures simulated by climate models are closer to satellite-based data than the ERA5 reanalysis. Communications Earth & Environment, 5(1). https://doi.org/10.1038/s43247-024-01276-z
  • [2] Lessig, C., Luise, I., Gong, B., Langguth, M., Stadtler, S., & Schultz, M. (2023). AtmoRep: A stochastic model of atmosphere dynamics using large scale representation learning. arXiv:2308.13280. https://arxiv.org/abs/2308.13280
  • [3] Hossain, A., Keil, P., Grover, H., et al. (2025). Machine Learning Eliminates Reanalysis Warm Bias and Reveals Weaker Winter Surface Cooling over Arctic Sea Ice. ESS Open Archive. https://doi.org/10.22541/essoar.176659533.30384251/v1

How to cite: Scholle, S. and Pithan, F.: Bias-Correcting Arctic ERA5 Surface Air Temperatures using Deep Learning , EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-18557, https://doi.org/10.5194/egusphere-egu26-18557, 2026.

EGU26-19650 | ECS | Posters on site | AS5.1

Global Evaluation of Probabilistic AI Weather Forecasts Across Extremes and Regimes 

Marc Girona-Mata, Andrew Orr, and Richard Turner

Recent probabilistic machine learning weather forecasting models have demonstrated competitive skill relative to state-of-the-art (SOTA) numerical weather prediction ensemble systems. However, a rigorous global assessment of their skill, particularly in the distribution tails relevant for extremes as well as across different geographical regions, remains limited. Here, we present a systematic evaluation of various SOTA probabilistic AI weather forecasting systems against ECMWF’s Integrated Forecasting System Ensemble (IFS ENS), focusing on forecast skill across the full range of event intensities.

We analyse global forecasts at 24- and 72-hour lead times for near-surface temperature, 10 m wind speed, and total precipitation at 0.25° resolution over the 2024–2025 period. Forecasts are evaluated using the fair Continuous Ranked Probability Score (fCRPS) to account for differing ensemble sizes, as well as other complementary metrics. We also employ the threshold-weighted CRPS (twCRPS) computed for different quantiles ranging from the median up to the one-in-a-million extreme event. Scores are area-weighted and analysed i) globally, ii) over land only, and iii) for different regions.
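
The fair CRPS for a finite ensemble has a simple closed form (Ferro, 2014); a minimal numpy sketch for a single grid point:

import numpy as np

def fair_crps(ens, obs):
    # ens: (m,) array of ensemble members at one grid point; obs: scalar.
    # fCRPS = mean|x_i - y| - sum_ij |x_i - x_j| / (2 m (m - 1)), the
    # finite-ensemble unbiased estimator of the CRPS.
    m = len(ens)
    term1 = np.mean(np.abs(ens - obs))
    term2 = np.sum(np.abs(ens[:, None] - ens[None, :])) / (2.0 * m * (m - 1))
    return term1 - term2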

AI-based forecasts demonstrate comparable or improved probabilistic skill relative to the IFS ensemble in the bulk of the distribution, with particularly strong performance over tropical and mid-latitude oceans. However, skill systematically degrades at high quantiles for most variables, with more pronounced losses over land and at short lead times. Both diffusion- and CRPS-based probabilistic forecasts are competitive, but their relative skill varies across variables. Spatial diagnostics reveal coherent regime-dependent behaviour, with AI models underperforming in complex terrain and coastal regions where the IFS ENS retains a clear advantage. 

These results highlight both the promise and current limitations of probabilistic AI weather forecasting models, emphasising that headline global skill can mask substantial degradation in extreme-event and regional reliability.

How to cite: Girona-Mata, M., Orr, A., and Turner, R.: Global Evaluation of Probabilistic AI Weather Forecasts Across Extremes and Regimes, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-19650, https://doi.org/10.5194/egusphere-egu26-19650, 2026.

EGU26-19652 | ECS | Orals | AS5.1

Using process-based model simulations to develop and validate a data-driven approach for identifying climate drivers of maize yield failure 

Lily-belle Sweet, Christoph Müller, Jonas Jägermeyr, Weston Anderson, and Jakob Zscheischler

Climate impacts such as crop yield failure arise from complex combinations of weather conditions acting across multiple time scales, making it challenging to identify the most relevant climate drivers from high-resolution weather data. However, given data limitations and the complex, interacting relationships between growing-season climate conditions and plant growth, complex machine learning models that achieve high performance in predicting crop yield are often ‘right for the wrong reasons’. Process-based crop model simulations, which embody known functional relationships, could provide a useful testbed for developing and evaluating more trustworthy and robust methods. We present a novel two-stage, data-driven framework designed to extract a parsimonious set of climate drivers from multivariate daily meteorological inputs by systematically generating, evaluating and discarding candidate features using machine learning, and then producing a set of drivers that are robust across locations, years and predictive feature combinations. We first validate the method using simulated U.S. maize yield failure data from two global gridded crop models, using rigorous out-of-sample testing: training on only early 20th-century data and holding out over 70 subsequent years for evaluation. The drivers identified using our approach align with known crop model mechanisms and rely solely on model input variables. Parsimonious logistic regression models built from these drivers achieve strong predictive skill under non-stationary climate conditions.
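
A toy sketch of the two-stage idea, assuming synthetic daily weather in a pandas DataFrame, illustrative aggregation windows and thresholds, and AUC-based screening (the study's candidate generation and robustness criteria are richer):

import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

def candidate_features(daily: pd.DataFrame, windows) -> pd.DataFrame:
    # daily: date-indexed columns such as 'precip' and 'vpd';
    # windows: (start_doy, end_doy) candidate aggregation windows.
    feats = {}
    for start, end in windows:
        sel = daily[(daily.index.dayofyear >= start)
                    & (daily.index.dayofyear < end)]
        years = sel.index.year
        feats[f"hi_vpd_days_{start}_{end}"] = (
            (sel["vpd"] > 20.0).groupby(years).sum())  # threshold illustrative
        feats[f"precip_sum_{start}_{end}"] = sel["precip"].groupby(years).sum()
    return pd.DataFrame(feats)

def screen(features: pd.DataFrame, failure: pd.Series, split_year=1950):
    # Train on early years only; score out-of-sample AUC on later years.
    train = features.index < split_year
    scores = {}
    for col in features.columns:
        clf = LogisticRegression().fit(features.loc[train, [col]], failure[train])
        proba = clf.predict_proba(features.loc[~train, [col]])[:, 1]
        scores[col] = roc_auc_score(failure[~train], proba)
    return pd.Series(scores).sort_values(ascending=False)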

After validating the methodology on simulated data, we apply the same approach to observed county-level yields and daily multivariate weather data in rainfed and irrigated US maize systems. We identify compact sets of five climate drivers that effectively reproduce interannual variability and major historic failure events, including the 1993 Midwest floods and the 2012 drought. In rainfed systems, yield failure risk is strongly associated with extended periods of high soil moisture conditions after establishment, seasonal precipitation levels and vapor pressure deficit (VPD), with more than 40 high-VPD days between flowering and maturity markedly increasing odds of yield failure. In irrigated systems, critical drivers include soil moisture conditions surrounding planting, hot or dry days after establishment, and dewpoint temperatures near harvest. Our results demonstrate the transferability of the method from simulations to observations, and suggest its applicability to other crops, locations and further climate-related impacts. By avoiding reliance on post-hoc interpretability of black-box models, this framework enables the use of inherently interpretable, statistical models while still leveraging the predictive power of high-dimensional meteorological datasets.

How to cite: Sweet, L., Müller, C., Jägermeyr, J., Anderson, W., and Zscheischler, J.: Using process-based model simulations to develop and validate a data-driven approach for identifying climate drivers of maize yield failure, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-19652, https://doi.org/10.5194/egusphere-egu26-19652, 2026.

EGU26-20173 | ECS | Posters on site | AS5.1

Exploring Adversarial Attacks in AI Weather Models for Generation of High-resolution Tropical Cyclones 

Marco Froelich and Sebastian Engelke

There has been recent interest in exploiting the differentiability of AI-weather models to enable direct computation of model sensitivities to initial conditions. In the field of machine learning, adversarial attacks leverage these sensitivities to influence the output of a prediction system by finding optimal initial-condition perturbations. In weather forecasting, this methodology can be viewed through two lenses: differentiable models are susceptible to malicious attacks aimed at distorting operational forecasts [1], while access to sensitivities is an opportunity to further our understanding of real events through the generation of synthetic forecasts. Adversarial examples - perturbed initial conditions obtained from adversarial attacks - have been used in [2] to create even more extreme forecasts of a heatwave, providing a storyline approach to understanding black swan heatwave events.

We further this effort by exploring adversarial attacks on tropical cyclone predictions at 0.25° resolution using Operational GraphCast. Although AI-weather models are known to improve tropical cyclone track predictions relative to numerical systems, it remains challenging to forecast high intensities, particularly at high resolution. Indeed, AI-weather models trained with MSE-type losses on reanalysis are known to suffer from 'blurred' forecasts due to the implicit down-weighting of small-scale features. We find that while standard adversarial attacks on tropical cyclone forecasts are effective in controlling tropical cyclone tracks, they fail to reproduce realistic gradients of temperature, geopotential and wind fields, effectively worsening blurring effects. This is true also for attacks on the AMSE-finetuned Operational GraphCast model [3], which otherwise shows significant improvements in representing small-scale features. We then borrow insights from the machine learning literature on the low-frequency bias of neural networks and its relationship to adversarial examples to address this limitation and explore the capabilities of AI-weather models in global high-resolution tropical cyclone forecasting.
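
A generic sketch of a gradient-based attack on initial conditions, assuming an arbitrary differentiable forecast model and scalar objective (not Operational GraphCast's actual interface):

import torch

def adversarial_ic(model, x0, objective, eps=0.5, steps=10, lr=0.1):
    # model: maps initial state x0 to a forecast; objective: scalar
    # function of the forecast to maximize (e.g., minus the cyclone's
    # minimum central pressure). eps bounds the perturbation amplitude.
    delta = torch.zeros_like(x0, requires_grad=True)
    opt = torch.optim.Adam([delta], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        loss = -objective(model(x0 + delta))   # gradient ascent on objective
        loss.backward()
        opt.step()
        with torch.no_grad():                  # keep the perturbation small
            delta.clamp_(-eps, eps)
    return (x0 + delta).detach()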

 

References: 
[1] Imgrund, E., Eisenhofer, T., Rieck, K., 2025. Adversarial Observations in Weather Forecasting.
[2] Whittaker, T., Luca, A.D., 2025. Constructing Extreme Heatwave Storylines with Differentiable Climate Models.
[3] Subich, C., Husain, S.Z., Separovic, L., Yang, J., 2025. Fixing the Double Penalty in Data-Driven Weather Forecasting Through a Modified Spherical Harmonic Loss Function.

How to cite: Froelich, M. and Engelke, S.: Exploring Adversarial Attacks in AI Weather Models for Generation of High-resolution Tropical Cyclones, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-20173, https://doi.org/10.5194/egusphere-egu26-20173, 2026.

EGU26-20724 | Orals | AS5.1

Machine learning identification of dry Intrusion outflows in present and future climates 

Jennifer Catto, Owain Harris, Stefan Siegert, and Shira Raveh-Rubin

Dry intrusions (DIs) are the key descending airstreams within extratropical cyclones. They can exacerbate the impacts of mid-latitude weather systems through their interactions with the boundary layer, enhancing atmosphere-surface interactions, and affecting frontal precipitation. DIs have been identified in the past using Lagrangian trajectory analysis, which has enabled studies into the climatology, variability, and characteristics of these airstreams. However, the potential futures of DIs, and the impact of climate change on them, has been unexplored due to the computational and data demands of this approach.

In this work, a convolutional neural network – DI-Net – is trained to identify DI outflow objects from a Lagrangian-identified dataset across the Northern Hemisphere, using information on relative and specific humidity, and topography from ERA5. The model performs well at capturing the main features of the DI climatology. DI-Net is then applied to historical and future climate model data from MRI-ESM2.0 to evaluate the climate model and investigate future changes. We present some of the challenges associated with developing a machine learning model for use with climate data.
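
A minimal fully convolutional sketch in the spirit of DI-Net, with assumed input channels and layer sizes (the published architecture will differ):

import torch
import torch.nn as nn

class TinyDINet(nn.Module):
    # Inputs: stacked relative humidity, specific humidity and topography
    # maps; output: a per-pixel logit for "DI outflow present".
    def __init__(self, in_channels=3):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(in_channels, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 1, 1),
        )

    def forward(self, x):          # x: (B, C, lat, lon)
        return self.net(x)         # train with nn.BCEWithLogitsLoss

model = TinyDINet()
logits = model(torch.randn(2, 3, 90, 180))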

The climate model represents the frequency of DIs well. In the most extreme warming scenario (SSP5-8.5), the frequency of DI outflows decreases in general, with increases across western Europe, consistent with projections of the extratropical storm tracks in CMIP6 models. This study demonstrates the utility of the machine learning model to allow us to investigate the future of DIs, and eventually to understand more about how their impacts may change.

How to cite: Catto, J., Harris, O., Siegert, S., and Raveh-Rubin, S.: Machine learning identification of dry Intrusion outflows in present and future climates, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-20724, https://doi.org/10.5194/egusphere-egu26-20724, 2026.

EGU26-21303 | Posters on site | AS5.1

Multiscale Graph Neural Networks for Climate Data Analysis 

Étienne Plésiat, Maximilian Witte, Johannes Meuer, and Christopher Kadow

We present a flexible deep learning framework for climate data analysis that leverages message-passing graph neural networks.

The framework is fully configurable and allows users to construct diverse architectures. In particular, it supports encoder-processor-decoder configurations in which geophysical fields are mapped onto a hierarchy of multi-icosahedral meshes, enabling information to propagate across scales before being mapped back to the original spatial grid. The model architecture is defined through a set of graph operators, including transformer-based graph convolutions. The framework operates on both regular and irregular grids, and enables flexible multivariate processing with spatial consistency. It further incorporates adaptive graph connectivity, enabling robust handling of missing data through dynamic edge construction. Additionally, several explainable AI (XAI) techniques are integrated to facilitate interpretation and physical attribution.
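
A generic message-passing step of the kind such graph operators build on, sketched without any graph library; the residual update and sum aggregation are illustrative choices, not the framework's actual API:

import torch

def message_passing(x, edge_src, edge_dst, mlp):
    # x: (N, F) node features; edge_src/edge_dst: (E,) sender/receiver
    # node indices; mlp: callable mapping (E, 2F) -> (E, F) edge features.
    messages = mlp(torch.cat([x[edge_src], x[edge_dst]], dim=-1))
    agg = torch.zeros_like(x)
    agg.index_add_(0, edge_dst, messages)   # sum messages at receivers
    return x + agg                          # residual node update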

These features make the framework suitable for a broad range of climate and Earth-system applications, including data infilling, downscaling and process attribution. Its capabilities are illustrated through two case studies: (i) the reconstruction of global precipitation fields from incomplete observations, with comparison to established statistical and deep learning methods, and (ii) the attribution of large-scale drivers contributing to an extreme heatwave event.

The framework is currently being deployed as a web processing service that supports operational inference for selected climate applications.

How to cite: Plésiat, É., Witte, M., Meuer, J., and Kadow, C.: Multiscale Graph Neural Networks for Climate Data Analysis, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-21303, https://doi.org/10.5194/egusphere-egu26-21303, 2026.

EGU26-21336 | ECS | Posters on site | AS5.1

Performance of Spatiotemporal Causal Effect Estimation in Coupled Climate Models 

Rebecca Herman and Jakob Runge

Climate scientists are increasingly exploring the possible applications of artificial intelligence to climate modeling, whether for use inside the model to replace parameterized model components, or for use separately as an emulator of observed or simulated climate. However, a major limitation of standard artificial intelligence techniques is that they cannot distinguish between statistical association and causality. While this is not a drawback for the purpose of statistical prediction in an unchanging system, it can pose a problem for generalization of parameterizations and emulators under climate change, and furthermore, it means that it is not sound to use such techniques to predict the response of the climate system to unobserved interventions, including proposed climate engineering initiatives. The framework of causal inference attempts to address this limitation, providing techniques for discovering qualitative (“discovery”) and quantitative (“effect estimation”) information about the system’s response to interventions from purely observational data (or imperfect experiments) using causal reasoning. However, it was not originally developed for application to spatiotemporal dynamical systems such as the climate system.

In previous work, we developed a unified framework for causal effect estimation in spatiotemporal dynamical systems. In contrast to the hard interventions on univariate representations of coupled climate phenomena that have until now been more commonly used, our framework allows the user to investigate the effect of a spatiotemporal perturbation of a climate variable in one finite region on another variable in a different finite region at another time, after specifying the qualitative causal relationships between the regions as a whole. This framework advances causal effect estimation for climate science because spatiotemporal perturbations are better defined, more actionable, and more interpretable than hard interventions on conceptual climate phenomena.

Here, we evaluate its performance using CMIP6-class models, focusing initially on the effect of the El Niño Southern Oscillation (ENSO) on the North Atlantic Oscillation as an example query. We assess the robustness of the method to data sample size, resolution, and other methodology choices by comparing the causal effect for a given model calculated from different subsets of its pre-Industrial control simulation using various amounts of spatial data and various values of other parameters of the algorithm. We use these results to assess the expected uncertainty on any inferences made using this technique from the short observational record or CMIP6 historical simulations, and make recommendations for best practices in different circumstances. Finally, we evaluate the accuracy of the predictions by using a causal model trained on historical simulations to predict the output of Tropical Basin Interaction Model Intercomparison Project experiments from the same climate model that nudge Pacific Sea Surface Temperature in the ENSO region in a manner comparable to our perturbation intervention.

How to cite: Herman, R. and Runge, J.: Performance of Spatiotemporal Causal Effect Estimation in Coupled Climate Models, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-21336, https://doi.org/10.5194/egusphere-egu26-21336, 2026.

EGU26-511 | ECS | Orals | CL5.8

Satellite-based detection of agricultural flash droughts and their ecosystem impacts in southeastern South America 

Lumila Masaro, Miguel A. Lovino, M. Josefina Pierrestegui, Gabriela V. Müller, and Wouter Dorigo

Flash droughts are rapid-onset events that develop within weeks, imposing severe and often unexpected impacts on agriculture. Their monitoring remains challenging due to several factors, including the scarcity of root-zone soil moisture (RZSM) observations and the lack of methodological consensus. This study has two main objectives: (1) to evaluate the applicability of the European Space Agency Climate Change Initiative Combined Root-Zone Soil Moisture product (ESA CCI COM RZSM) for detecting agricultural flash droughts (AFDs) across southeastern South America (SESA), and (2) to assess how satellite-based indicators obtained from the Moderate Resolution Imaging Spectroradiometer (MODIS) capture their physical evolution and agricultural impacts.

We apply two complementary AFD detection frameworks to ESA CCI COM and ERA5 RZSM data for 1979–2022: a statistical percentile-based approach and a physically based formulation derived from the Soil Water Deficit Index (SWDI). The percentile method detects AFDs as rapid transitions from above-normal to below-normal soil moisture. The SWDI identifies events through shifts from near-optimal water availability to physiological stress based on soil hydraulic properties. To evaluate agricultural impacts, we analyze satellite-derived evapotranspiration (EVT) and vegetation indicators from MODIS for two representative events in central-eastern and northern SESA. Vegetation indicators include the Land Surface Water Index (LSWI), fraction of absorbed Photosynthetically Active Radiation (fPAR), and Gross Primary Productivity (GPP).
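
A minimal sketch of the percentile-based detection, assuming weekly RZSM percentiles and the common 40th-to-20th-percentile onset convention (exact criteria vary between studies):

import numpy as np

def detect_afd(sm_pct, onset_weeks=4, wet=0.4, dry=0.2):
    # sm_pct: (T,) weekly soil-moisture percentiles in [0, 1]. Flags time
    # steps where soil moisture drops from above-normal (>= wet) to
    # below-normal (<= dry) within onset_weeks, i.e., a rapid transition.
    onsets = []
    for t in range(onset_weeks, len(sm_pct)):
        if sm_pct[t] <= dry and sm_pct[t - onset_weeks] >= wet:
            onsets.append(t)
    return np.array(onsets, dtype=int)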

Our results suggest that AFD detection is strongly conditioned by both methodological framework and dataset characteristics. The percentile-based approach tends to overestimate AFD occurrence in persistently wet or dry regimes, where small fluctuations are amplified after percentile transformation. In contrast, the SWDI-based approach preserves regional hydroclimatic gradients and provides a physically consistent representation of plant water stress. Regarding the dataset, ESA CCI COM RZSM captures the main spatial patterns and seasonal cycles of soil moisture depicted by ERA5 across SESA. However, it exhibits smoother short-term variability, delayed drying, and lower absolute soil moisture than ERA5, which could be attributed to the empirical filtering used to propagate surface signals into deeper layers.

Satellite-derived indicators effectively capture the evolution of AFDs across SESA. Soil moisture depletion is followed by reductions in EVT as ecosystems transition from energy- to water-limited conditions. Vegetation indicators respond shortly thereafter: LSWI reveals declining canopy water content, fPAR shows reduced photosynthetic activity, and GPP reflects suppressed ecosystem productivity. The magnitude and spatial extent of these impacts depend on antecedent soil moisture and land-cover type, highlighting the importance of background conditions in modulating drought severity.

Overall, the results demonstrate that ESA CCI COM RZSM provides valuable information for regional AFD monitoring when its physical limitations are considered. The coherence among soil moisture, surface fluxes, and biological responses highlights the potential of satellite observations to track the onset, intensification, and agricultural consequences of AFDs. These results strengthen the use of multi-sensor satellite systems for operational early-warning applications and impact assessment across climate-sensitive agricultural regions such as SESA.

How to cite: Masaro, L., Lovino, M. A., Pierrestegui, M. J., Müller, G. V., and Dorigo, W.: Satellite-based detection of agricultural flash droughts and their ecosystem impacts in southeastern South America, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-511, https://doi.org/10.5194/egusphere-egu26-511, 2026.

EGU26-1232 | ECS | Orals | CL5.8

Evaluating Divergent Evapotranspiration Feedbacks to Warming Across Water- and Energy-Limited Regimes 

Marco Possega, Emanuele Di Carlo, Annalisa Cherchi, and Andrea Alessandri

Land–atmosphere coupling is a central driver of climate variability and extremes, yet Earth System Models (ESMs) struggle to capture the complex interplay between hydrology, vegetation, and surface energy fluxes. In particular, the evapotranspiration–temperature (ET–T) feedback—a key mechanism linking soil moisture, vegetation water use, and near-surface climate—is poorly constrained, limiting confidence in projections of heat extremes and ecosystem stress. Here, we first assess the ET–T feedback across a suite of post-CMIP6 ESMs for the historical period (1980–2014) against available GLEAM observations; thereafter, the ET–T feedback is investigated in a set of future idealized warming scenarios spanning multiple global temperature targets. To identify the physical and ecohydrological regimes controlling feedback strength, we apply the Ecosystem Limitation Index (ELI), which distinguishes energy-limited from water-limited conditions. Our results reveal a strong negative ET–T feedback in energy-limited regions, where evapotranspiration efficiently cools the surface and stabilizes temperature. In contrast, the feedback reverses in water-limited and transitional regions: here, worsening soil-moisture deficits suppress evaporation and reduce evaporative cooling, thereby amplifying surface warming. Comparison with GLEAM observations highlights regions where models succeed and fail in capturing these feedbacks, particularly in semi-arid ecosystems where land–atmosphere coupling is strongest. Future warming scenarios indicate an expansion of water-limited regimes, weakening negative ET–T feedbacks and reducing the ability of the land surface to buffer temperature variability. This shift implies an increased risk of persistent heat extremes, stronger land-surface amplification of warming, and eco-hydrological transitions in sensitive regions. The findings of this study suggest priorities for next-generation ESMs: better representation of soil moisture dynamics, vegetation water-use strategies, and hydrological constraints.
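
A widely used formulation in the literature defines ELI as a correlation difference between evapotranspiration anomalies and soil-moisture versus temperature anomalies; a minimal sketch under that assumption:

import numpy as np

def eli(sm_anom, t_anom, et_anom):
    # ELI = corr(SM', ET') - corr(T', ET') for 1-D anomaly time series at
    # one grid cell. ELI > 0 indicates water-limited conditions (ET tracks
    # soil moisture); ELI < 0 indicates energy-limited conditions.
    r_sm = np.corrcoef(sm_anom, et_anom)[0, 1]
    r_t = np.corrcoef(t_anom, et_anom)[0, 1]
    return r_sm - r_t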

How to cite: Possega, M., Di Carlo, E., Cherchi, A., and Alessandri, A.: Evaluating Divergent Evapotranspiration Feedbacks to Warming Across Water- and Energy-Limited Regimes, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-1232, https://doi.org/10.5194/egusphere-egu26-1232, 2026.

Air pollutants can penetrate deep into the lungs, enter the bloodstream, and trigger a cascade of cardiovascular diseases. Elevated pollutant levels in cities are often associated with heavy traffic and industrial emissions, highlighting the need for effective mitigation strategies. Street trees can reduce air pollution through dry deposition, whereby particles are captured by tree canopies in the absence of precipitation. However, city-level models typically assume uniform deposition rates and neglect location-specific variation in tree benefits. Here, we designed a social-ecological systems (SES) approach and revealed substantial spatial disparities in tree-derived air quality benefits within a city. We found that communities with lower urban canopy received fewer air quality benefits. To address these differences, priority tree planting sites were determined using a stepwise framework that takes into account both neighbourhood-level population exposure and social vulnerability. Our findings demonstrate the uneven distribution of urban ecosystem services, emphasizing the importance of integrating environmental justice into urban forestry planning, and provide practical guidance on optimizing planting to reduce population exposure to air pollutants.

How to cite: Cui, S. and Adams, M.: Unequal Canopies, Unequal Benefits: Environmental Justice Implications of Street Tree Air Pollution Mitigation, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-2092, https://doi.org/10.5194/egusphere-egu26-2092, 2026.

EGU26-2682 | ECS | Posters on site | CL5.8

Constraining Flash Drought Projections Through Land-Atmosphere Coupling 

Yumiao Wang and Yuan Xing

The increasing speed of drought onset is driving a global transition toward more frequent flash droughts, presenting unprecedented challenges for drought management and adaptation. However, projected changes in future flash drought characteristics diverge considerably among climate models. Here, using models from the Coupled Model Intercomparison Project Phase 6 (CMIP6), we demonstrate that models capable of capturing the land-atmosphere coupling gradient between dry and wet soil conditions tend to project a more pronounced global transition from slow to flash droughts in the future. This emergent relationship provides a robust constraint for future projections based on observed land-atmosphere coupling characteristics. Our analysis suggests that the societal and environmental risks posed by future flash droughts could be more severe than previously projected. Given the widespread impacts of flash droughts, this study not only enhances our understanding of uncertainties in drought projections but also holds promise for supporting socio-economic planning and adaptation strategies through constrained projections.
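
A minimal sketch of the emergent-constraint recipe with synthetic numbers (not the study's data): regress the projected change on the present-day coupling metric across models, then read off the value at the observed metric.

import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
x = rng.normal(0.5, 0.15, size=25)            # coupling metric per model
y = 2.0 * x + rng.normal(0.0, 0.1, size=25)   # projected change per model
x_obs, x_obs_err = 0.55, 0.05                 # observed metric and its error

fit = stats.linregress(x, y)                  # across-model regression
y_constrained = fit.slope * x_obs + fit.intercept
# First-order uncertainty: propagate observational error through the slope.
y_err = abs(fit.slope) * x_obs_err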

How to cite: Wang, Y. and Xing, Y.: Constraining Flash Drought Projections Through Land-Atmosphere Coupling, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-2682, https://doi.org/10.5194/egusphere-egu26-2682, 2026.

In 2024, an exceptionally severe abrupt drought-to-flood transition (ADFT) event occurred over Henan Province in central China, causing substantial economic losses due to its abruptness and limited early warning. Although intraseasonal oscillations (ISOs) can provide precursors for forecasting extremes, previous studies have primarily focused on floods or droughts in isolation, leaving the synergistic impacts of multiple ISO modes on drought-to-flood transitions poorly understood. Here we show that the 2024 ADFT event was jointly modulated by two ISO modes with opposite propagation directions. During the drought stage, a Rossby wave train maintained a Ural blocking pattern and displaced the westerly jet southward. This circulation configuration suppressed precipitation while enhancing temperature and sensible heat, leading to persistent drought conditions. During the transition-to-flood stage, the Rossby wave train and the Western Pacific Subtropical High (WPSH) oscillation acted in concert. The southeastward-propagating Rossby wave train disrupted the blocking, while the WPSH oscillation migrated northwestward. Their combined effects shifted the rain belt northward, strengthened southerly moisture transport, increased latent heating, and ultimately triggered the extreme flood. The synergy between these two ISO modes amplified the transition magnitude by 50%, suggesting that the ADFT event would have been largely suppressed in the absence of their concurrent influence. These results underscore the critical role of ISO phase evolution and propagation in ADFT events, and suggest that ISOs may serve as useful precursors for forecasting abrupt transitions.

How to cite: Zhou, S. and Yuan, X.: The impact of intraseasonal oscillations on the 2024 abrupt drought-to-flood transition over central China, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-2684, https://doi.org/10.5194/egusphere-egu26-2684, 2026.

EGU26-3979 | Orals | CL5.8

Assessing Canopy and Roughness‑Sublayer Turbulence Representation in Noah‑MP over Forest and Grassland at Lindenberg (Germany) 

Kirsten Warrach-Sagi, Frank Beyrich, Cenlin He, and Ronnie Abolafia-Rosenzweig

Land–atmosphere exchange in tall canopies is strongly controlled by turbulence within and above the canopy and in the roughness sublayer (RSL), where classical Monin–Obukhov similarity theory (MOST) is known to be imperfect. Recent developments in the Noah‑MP land surface model (LSM) include a unified turbulence parameterization that aims to provide a consistent treatment of turbulence from within the canopy, through the RSL, to the surface layer (Abolafia‑Rosenzweig et al., 2021). While this scheme has been tested primarily under snow‑dominated conditions, its performance for non‑snow, multi‑canopy environments over long time periods remains largely unexplored.

Here, we evaluate the unified canopy–RSL turbulence parameterization in Noah‑MP (version 5.1.1) using multi‑year, multi‑level observations from the Lindenberg observatory of the German Meteorological Service (DWD). We focus on two contrasting sites: (i) Kehrigk, a tall evergreen needleleaf forest canopy where RSL effects are expected to be strong, and (ii) Falkenberg, a short grassland site that more closely conforms to MOST assumptions. Both sites provide continuous 30‑min data since 2005, including eddy‑covariance fluxes of sensible and latent heat, radiation components, soil heat flux at 5 cm depth, skin temperature, and multi‑level profiles of air temperature, humidity, and wind speed up to 30 m (forest) and 10 m (grassland). All forcing and flux data undergo standard DWD quality control procedures.

Noah‑MP is run offline at both sites with identical land and soil parameterizations, driven by observed meteorology. We compare a standard configuration (MOST‑based surface‑layer and canopy treatment) with the unified canopy–RSL turbulence configuration. Beyond standard flux evaluation, we will diagnose friction velocity, Monin–Obukhov length, bulk transfer coefficients for heat and moisture, and the vertical structure of wind and temperature in the surface and roughness sublayers. Model performance will be analysed as a function of season, canopy type, and atmospheric stability.
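
A sketch of these diagnostics from textbook surface-layer definitions, with our variable names (not the DWD data format) and using air temperature in place of virtual potential temperature for simplicity:

import numpy as np

KAPPA, G = 0.4, 9.81

def surface_layer_diagnostics(uw, vw, wt, t_air, u_mean, t_sfc):
    # uw, vw: momentum fluxes u'w', v'w' (m^2 s^-2); wt: kinematic heat
    # flux w'T' (K m s^-1); t_air, t_sfc: air and skin temperature (K);
    # u_mean: mean wind speed at the flux level (m s^-1), from 30-min
    # eddy-covariance statistics.
    u_star = (uw ** 2 + vw ** 2) ** 0.25          # friction velocity
    L = -u_star ** 3 * t_air / (KAPPA * G * wt)   # Obukhov length
    c_h = wt / (u_mean * (t_sfc - t_air))         # bulk heat coefficient
    return u_star, L, c_h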

By linking detailed, long‑term observations to alternative turbulence representations in a widely used LSM, this study aims to clarify under which conditions enhanced canopy–RSL formulations improve land–atmosphere coupling in next‑generation Earth System Models.

How to cite: Warrach-Sagi, K., Beyrich, F., He, C., and Abolafia-Rosenzweig, R.: Assessing Canopy and Roughness‑Sublayer Turbulence Representation in Noah‑MP over Forest and Grassland at Lindenberg (Germany), EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-3979, https://doi.org/10.5194/egusphere-egu26-3979, 2026.

Terrestrial water storage (TWS) is a key variable in the water cycle, and accurate estimation of TWS is crucial for understanding hydrological processes and improving hydrological prediction. In this study, we develop an AI-based data assimilation method for GRACE TWS observations, aiming to integrate the advantages of satellite observations and land surface models. The assimilation adopts a ResUnet model combined with a self-supervised learning strategy. Specifically, the ResUnet model is used to extract large-scale variation information from GRACE TWS observations and high-resolution information from the land surface model. This assimilation system is applied to the NoahMP land surface model for long-term simulation, and its performance is compared with the nudging method. Results show that the AI-based assimilation method is more conducive to depicting fine-scale hydrological processes. Quantitative evaluation indicates that the assimilation effect of the proposed method is superior to that of nudging. In addition, validation against in-situ observations confirms the rationality and reliability of the proposed method, as it can more accurately estimate terrestrial water storage and related hydrological variables. In the future, this AI-based assimilation method can be extended to the assimilation of more hydrological variables and multi-source observations, which is expected to further improve the estimation capability of land surface hydrological variables and provide more reliable data support for water resource management.
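
For reference, the nudging baseline reduces to a linear relaxation; a one-line sketch (the relaxation coefficient is illustrative):

import numpy as np

def nudge_tws(model_tws, grace_tws, alpha=0.1):
    # Relax modeled terrestrial water storage toward the GRACE observation;
    # the AI scheme described above replaces this linear update with a
    # ResUnet-based mapping.
    return model_tws + alpha * (grace_tws - model_tws)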

How to cite: Zhu, E. and Wang, Y.: An AI-Based GRACE Terrestrial Water Storage Data Assimilation Improves Hydrological Simulation, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-4437, https://doi.org/10.5194/egusphere-egu26-4437, 2026.

The rapid development of numerical weather prediction (NWP) models offers new opportunities for improving quantitative precipitation forecasting, while raising challenges in objectively integrating multi-model forecasts. This study presents recent advances in an operational multi-model integration precipitation forecasting method based on the generalized Three-Cornered Hat (TCH) theory. Seven NWP models routinely operated at the National Meteorological Center of the China Meteorological Administration are considered, including ECMWF, GERMAN, NCEP, GRAPES_3KM, BEIJING_MR, GUANGZHOU_MR, and SHANGHAI_MR. The method applies TCH theory to estimate the relative error characteristics of precipitation forecasts from different models. A Bayesian framework is then used to derive objective, model-dependent weighting coefficients, enabling short-range multi-model integration forecasts. The integration performance is evaluated using Threat Score (TS) metrics for 2025. Results show that the TCH-based integration consistently outperforms the single ECMWF model across all precipitation categories. The 24-hour heavy rainfall TS reaches 0.2357, a 48% improvement, while the TS for extreme rainfall events reaches 0.1354, a 141% improvement relative to ECMWF. The multi-model integration products have been operationally implemented at the National Meteorological Center, providing critical support during high-impact weather events and highlighting both recent advances and remaining challenges in operational multi-model precipitation forecasting.
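
The classical three-instrument special case of the TCH idea can be written in a few lines; the operational method generalizes this to seven models with correlated errors and Bayesian weighting, which is not reproduced here.

import numpy as np

def three_cornered_hat(x1, x2, x3):
    # Individual error variances of three forecast series with mutually
    # independent errors, recovered from pairwise difference variances:
    # Var(x_i - x_j) = s_i + s_j.
    v12 = np.var(x1 - x2)
    v13 = np.var(x1 - x3)
    v23 = np.var(x2 - x3)
    s1 = 0.5 * (v12 + v13 - v23)
    s2 = 0.5 * (v12 + v23 - v13)
    s3 = 0.5 * (v13 + v23 - v12)
    return s1, s2, s3   # error variances; their inverses can serve as weights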

How to cite: chen, S.: Multi-model Integration Precipitation Forecasting Based on TCH Theory: Recent Advances and Challenges, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-6074, https://doi.org/10.5194/egusphere-egu26-6074, 2026.

Ecosystem water use efficiency (WUE), an indicator of the trade-off between carbon uptake and water loss, is widely used to assess ecosystem responses to climate change. However, large-scale studies of WUE typically assume a single, fixed lag or accumulation period of climatic drivers across regions. This static assumption neglects spatially heterogeneous temporal responses of WUE to climate, potentially biasing attribution analyses and reducing predictive skill. Here, we developed a pixel-level model to quantify the temporal effects of climatic drivers on WUE by explicitly accounting for no-effect, lagged, cumulative, and combined effects and allowing effect timescales to vary spatially. We found that more than 80% of pixels across China exhibited lagged and/or cumulative effects for each driver, with distinct temporal effect patterns among vegetation types and drivers. In herbaceous cover croplands, precipitation exhibited the shortest lag (0.31 ± 0.56 months) and the longest accumulation time (1.71 ± 0.96 months). Accounting for these spatially heterogeneous temporal effects increased the explanatory power of climatic drivers for WUE variation by 17.7% compared with models without temporal effects. We further showed that for most vegetation types, precipitation and air temperature were more strongly associated with temporal variation in WUE, whereas solar radiation contributed more to spatial variability. These findings indicate that location-specific temporal effects can modulate the climatic controls on WUE. Our framework is readily applicable beyond China and can support a shift toward dynamic climate responses in climate–ecosystem interaction modeling, thereby improving forecasts of ecosystem dynamics and informing climate-adaptive vegetation management.
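
A minimal pandas sketch of how lagged and cumulative candidate predictors can be constructed per pixel (the timescale ranges are illustrative):

import pandas as pd

def temporal_predictors(driver: pd.Series, max_lag=6, max_acc=6):
    # driver: a monthly climate series at one pixel. Builds lagged and
    # cumulative versions so a pixel-wise model can select the
    # best-supported timescale for each driver.
    cols = {}
    for lag in range(max_lag + 1):            # lag-only effects
        cols[f"lag{lag}"] = driver.shift(lag)
    for acc in range(1, max_acc + 1):         # cumulative effects
        cols[f"acc{acc}"] = driver.rolling(acc).sum()
    return pd.DataFrame(cols)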

How to cite: Jiao, X.: Widespread Time-Lagged and Cumulative Effects Modulate Climatic Controls on Ecosystem Water Use Efficiency , EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-6580, https://doi.org/10.5194/egusphere-egu26-6580, 2026.

To address the challenge of simulating runoff in ungauged regions, a hybrid physical–data-driven framework was developed by coupling the Soil and Water Assessment Tool (SWAT) with an LSTM–Transformer. SWAT-derived process variables were fused with meteorological forcing to form a physically informed feature set for the Transformer-enhanced LSTM. The framework was first calibrated at a gauged station and then transferred to ungauged basins to evaluate its spatial generalizability. At the gauged station, the SWAT–LSTM–Transformer achieved the highest accuracy among all tested models, yielding an NSE of 0.587 and an R² of 0.728 on the validation dataset. It also maintained a better balance between calibration fit and validation robustness than SWAT–LSTM, SWAT–RF, SWAT–SVM, and stand-alone SWAT. SHAP-based interpretation revealed stable and hydrologically coherent predictor dependencies: temperature, lateral flow, and evaporation emerged as dominant drivers of the model's runoff simulations, whereas precipitation and soil moisture exerted shorter-term and event-focused influences. When transferred to ungauged stations in the same watershed, the model reproduced seasonal runoff variations and event-scale fluctuations accurately, with NSE ranging from 0.80 to 0.94 and R² from 0.83 to 0.92. Under cross-watershed transfer, the model continued to capture the main temporal patterns, with NSE and R² ranging from 0.62 to 0.83 and 0.60 to 0.84, respectively, although performance declined during extreme events. Overall, the coupled SWAT–LSTM–Transformer framework provides a robust and transferable approach for daily runoff simulation in data-scarce watersheds. A minimal sketch of the coupling appears after the keywords below.

Keywords: SWAT; LSTM–Transformer; runoff simulation; ungauged watersheds
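
A minimal sketch of the coupling described above, with assumed feature counts and network sizes (not the calibrated configuration):

import torch
import torch.nn as nn

class SwatLstmTransformer(nn.Module):
    # SWAT process variables are concatenated with meteorological forcing
    # and fed to an LSTM followed by a Transformer encoder.
    def __init__(self, n_met=5, n_swat=4, hidden=64):
        super().__init__()
        self.lstm = nn.LSTM(n_met + n_swat, hidden, batch_first=True)
        layer = nn.TransformerEncoderLayer(d_model=hidden, nhead=4,
                                           batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.head = nn.Linear(hidden, 1)

    def forward(self, met, swat):   # (B, T, n_met), (B, T, n_swat)
        h, _ = self.lstm(torch.cat([met, swat], dim=-1))
        return self.head(self.encoder(h))[..., -1, :]  # runoff at step T

model = SwatLstmTransformer()
q = model(torch.randn(8, 365, 5), torch.randn(8, 365, 4))  # (8, 1)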

How to cite: Peng, Z., Li, Y., and Liu, D.: An interpretable daily runoff simulation method in data-scarce watersheds by coupling SWAT and LSTM-Transformer, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-7092, https://doi.org/10.5194/egusphere-egu26-7092, 2026.

EGU26-7919 | ECS | Posters on site | CL5.8

A dynamic representation of wetlands for the ISBA land surface model 

Lucas Hardouin, Bertrand Decharme, Jeanne Colin, and Christine Delire

Wetlands play a critical role in terrestrial hydrology and land–atmosphere exchanges, yet they remain poorly represented in many land surface models. Most approaches rely on static wetland maps, preventing models from capturing hydrological variability and associated feedbacks. Here we introduce a new dynamic wetland scheme in the ISBA land surface model, combining explicit hydrological processes with an annually varying diagnostic of wetland extent.

Wetland extent is computed using a TOPMODEL-based approach that links grid-cell saturation deficit with sub-grid topographic indices, and includes a correction for soil organic content to better represent peat-rich areas. Hydrological properties of wetlands and sub-grid runoff redistribution allow water to accumulate and persist in saturated zones, influencing the overall grid-cell water budget.
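
A minimal sketch of the TOPMODEL-style saturation diagnostic, assuming an exponential transmissivity profile with an illustrative decay parameter m:

import numpy as np

def saturated_fraction(topo_index, mean_deficit, m=0.04):
    # topo_index: sub-grid topographic index values within one grid cell;
    # mean_deficit: grid-cell mean saturation deficit (m). A location i
    # saturates when its local deficit
    # S_i = S_mean + m * (lambda_mean - lambda_i) drops to zero or below.
    local_deficit = mean_deficit + m * (topo_index.mean() - topo_index)
    return np.mean(local_deficit <= 0.0)   # wetland-prone cell fraction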

Simulated wetland extent shows good spatial agreement with multiple satellite-derived wetland datasets across a range of climate zones. Hydrological evaluation against GRACE-based terrestrial water storage and observed river discharge indicates that dynamic wetlands exert a modest but physically consistent influence on ISBA hydrology: they adjust discharge timing and magnitude without degrading model skill, while increasing grid-cell water storage and associated evapotranspiration. However, regional patterns of simulated evapotranspiration reveal a strong sensitivity to the assumed wetland vegetation type, underscoring the need for improved vegetation representation.

In particular, the dynamic wetland extent opens new opportunities for simulating wetland biogeochemistry, including methane emissions, and for exploring the key role of soil oxygen availability in controlling greenhouse gas fluxes.

How to cite: Hardouin, L., Decharme, B., Colin, J., and Delire, C.: A dynamic representation of wetlands for the ISBA land surface model, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-7919, https://doi.org/10.5194/egusphere-egu26-7919, 2026.

EGU26-8456 | ECS | Posters on site | CL5.8

C4MIP Multi-Model Projections of Moisture Convergence and Extreme Precipitation Risks over East Asia 

Nayeon jeon, Rackhun Son, and Dasom Lee

As extreme precipitation events intensify under climate change, understanding changes in precipitation patterns over East Asia has become increasingly important. While most future projections have relied on CMIP6 models, the Coupled Climate Carbon Cycle Model Intercomparison Project (C4MIP) integrates terrestrial–oceanic carbon cycle feedbacks, including nitrogen deposition and biogeochemical processes, to enhance the reliability of climate projections. Despite these advancements, C4MIP has been underutilized in hydrological assessments for East Asia. In this study, we analyze precipitation patterns over East Asia during the historical period (1980–2014) using a C4MIP multi-model ensemble and evaluate model performance through comparison with reanalysis datasets. The C4MIP ensemble demonstrates improved skill in capturing seasonal and interannual patterns of vertically integrated moisture flux convergence (VIMFC), particularly during periods of pronounced moisture convergence and divergence. Under the SSP5-8.5-bgc scenario, projections indicate intensified moisture convergence and increased risks of extreme precipitation over southeastern China and North Korea. These findings provide a diagnostic evaluation of C4MIP's hydrological performance and offer valuable insights for future regional climate projections and adaptation strategies.
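
For reference, VIMFC can be sketched as the negative divergence of the mass-weighted vertical integral of the moisture flux; a finite-difference numpy version on a regular lat-lon grid (map factors simplified):

import numpy as np

def vimfc(q, u, v, p, lat, lon):
    # q: specific humidity and u, v: winds, all (lev, lat, lon);
    # p: pressure levels in Pa (increasing downward); lat, lon in degrees.
    # VIMFC = -div( (1/g) * integral of q*V dp ), in kg m^-2 s^-1.
    g, a = 9.81, 6.371e6
    dp = np.diff(p)[:, None, None]

    def vint(f):                      # trapezoidal vertical integral / g
        return np.sum(0.5 * (f[1:] + f[:-1]) * dp, axis=0) / g

    qu, qv = vint(q * u), vint(q * v)
    rlat, rlon = np.deg2rad(lat), np.deg2rad(lon)
    coslat = np.cos(rlat)[:, None]
    div = (np.gradient(qu, rlon, axis=1)
           + np.gradient(qv * coslat, rlat, axis=0)) / (a * coslat)
    return -div                       # positive values indicate convergence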

 

This work was funded by the Korea Meteorological Administration Research and Development Program under Grant RS-2024-00404042 and the National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIT) (RS-2024-00343921).

How to cite: jeon, N., Son, R., and Lee, D.: C4MIP Multi-Model Projections of Moisture Convergence and Extreme Precipitation Risks over East Asia, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-8456, https://doi.org/10.5194/egusphere-egu26-8456, 2026.

EGU26-9964 | ECS | Posters on site | CL5.8

How do climate factors influence plant-based carbon sequestration in land surface model, and how does this change under global warming? 

He-Ming Xiao, Daniele Peano, Simone Mereu, and Antonio Trabucco

Gross primary production (GPP) is an important indicator of carbon uptake by ecosystems, and plants play a central role in ecosystem carbon sequestration. Understanding how plant-driven GPP fluctuates from year to year and which climate factors control these fluctuations is essential for assessing carbon sequestration. In addition, how carbon sequestration by these plants responds to a warming climate is still not well understood. The lack of high-resolution, well-networked, and long-term stable observations, together with mixed signals from land–atmosphere interactions, makes it difficult to identify and isolate the climate factors influencing plant-driven GPP from an observational perspective. In contrast, land surface models provide an alternative approach to addressing these limitations.

In this study, we conducted 5-km resolution simulations using a land surface model (Community Land Model version 5, CLM5; Lawrence et al., 2019) forced with high-resolution atmospheric datasets and updated land surface data covering Italy and the western Mediterranean region. The high-resolution simulations allow for improved discrimination among different land types, such as urban areas and natural vegetation. We further implemented Corine land-cover data to better represent current land surface conditions and the distribution of Plant Functional Types (PFTs). Remarkable progress in recent years has expanded the representation of increasingly complex processes in land surface models, incorporating, among others, plant and soil hydrological and carbon cycles, physiological and phenological processes, land surface heterogeneity, and PFT parameterization. However, large limitations remain due to uncertainties in representing the spatial and temporal dynamics of model parameters and sub-grid heterogeneity, and ultimately in resolving optimal allocation and ecosystem functioning at small scales. Mediterranean regions were selected as the focus of this study because, as climate change hotspots, they experience strong variability in ecosystem processes, strong dependence on a changing climate, and increasingly severe compound drought–heatwave events, making vegetation-based mitigation practices particularly urgent.

We found that both temperature and precipitation play dominant roles in shaping interannual variations in GPP. Under cold or dry regimes, warmer temperatures and higher precipitation are beneficial for higher GPP. In contrast, under warm and wet regimes, further increases in temperature and precipitation are not beneficial for plant GPP production. We further used the model to identify suitable temperature and precipitation ranges for the growth of different plant types, and to examine how global warming is altering these ranges. Our analysis may provide implications for future afforestation practices, particularly in selecting forest types and specific climate/geographic zones that can achieve better carbon sequestration under a warming climate.

How to cite: Xiao, H.-M., Peano, D., Mereu, S., and Trabucco, A.: How do climate factors influence plant-based carbon sequestration in land surface model, and how does this change under global warming?, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-9964, https://doi.org/10.5194/egusphere-egu26-9964, 2026.

EGU26-11167 | Posters on site | CL5.8

An introduction to the EarthRes program 

Xing Yuan, Justin Sheffield, Ming Pan, Jonghun Kam, Xiaogang He, Joshua Roundy, Nathaniel Chaney, Niko Wanders, Linying Wang, Chenyuan Li, and Yi Hao

The High-Resolution Earth System Modeling, Analysis and Prediction for a Society Resilient to Hydrometeorological Hazards (EarthRes) is a program of the International Decade of Sciences for Sustainable Development (IDSSD), endorsed by UNESCO in 2025. EarthRes aims to build global societal resilience to hydrometeorological hazards through five pillars: (1) establishing cooperative observation networks; (2) advancing process-based understanding of Earth system dynamics; (3) enhancing prediction and early warning capabilities; (4) fostering indigenous and local knowledge and data sharing; and (5) strengthening capacity building among international partners.

This presentation will introduce the program's recent progress, including collaborative observations for understanding Earth system dynamics, the integration of a regional climate model with a coupled land surface-hydrology-ecology model that accounts for human activities (e.g., reservoir regulation, irrigation, urbanization), and the development of a forecasting framework. This framework connects the regional model with an AI model to predict droughts, floods, and compound events at synoptic to sub-seasonal scales.

Other activities under EarthRes will also be introduced, and future plans will be discussed. Through international collaboration and targeted capacity-building, EarthRes seeks to enhance sub-seasonal prediction and early warning capabilities, with particular benefits for vulnerable regions.

How to cite: Yuan, X., Sheffield, J., Pan, M., Kam, J., He, X., Roundy, J., Chaney, N., Wanders, N., Wang, L., Li, C., and Hao, Y.: An introduction to the EarthRes program, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-11167, https://doi.org/10.5194/egusphere-egu26-11167, 2026.

EGU26-13594 | ECS | Posters on site | CL5.8

Classification and Attribution of Compound Flood Events  

Jinjie Zhao and Carlo De Michele

Floods are the most common natural hazards, and the compound effects of flood events pose severe challenges to flood protection. The lack of flood observation data makes it difficult to identify and analyze compound flood effects. Here, we employed a data-driven approach to reconstruct discharge in ungauged regions. We classified flood events from a compound perspective, quantified the contributions of different drivers, and compared the impacts of compound and non-compound flood events. Our results showed that pronounced compound effects were common in most flood events, with many compound flood events clustered in India and southeastern China. Compound events caused substantially greater impacts than non-compound events in Asia and North America.

How to cite: Zhao, J. and De Michele, C.: Classification and Attribution of Compound Flood Events , EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-13594, https://doi.org/10.5194/egusphere-egu26-13594, 2026.

EGU26-14172 | ECS | Posters on site | CL5.8

Benchmarking machine learning-based emulators and traditional methods to calibrate land model parameters for 124 global flux tower sites 

Ignacio Aguirre, Wouter Knoben, Nicolas Vasquez, and Martyn Clark

Accurately simulating latent and sensible heat fluxes is a long-standing open challenge in the land modeling community. The recent model intercomparison project PLUMBER 2 over 154 flux towers showed that simple 1-variable linear regression models can outperform process-based models in simulating latent and sensible heat. PLUMBER 2 simulations were run using default model parameters, leaving the potential performance gains from parameter estimation unquantified.

Identifying optimal parameters in land models has several challenges, including high computational cost and the need to identify parameters that can correctly reproduce temporal dynamics (i.e., good performance across different time epochs) and spatial patterns (i.e., good performance across many sites). To evaluate the ability of different calibration methods to handle these challenges, this study compared the performance of traditional and machine-learning emulator-based calibration methods against Long Short-Term Memory (LSTM) benchmarks, with single-objective experiments (latent heat or sensible heat calibrated individually) and multi-objective experiments (latent and sensible heat calibrated simultaneously). We also tested two ways to train emulators and LSTMs: either considering one site at a time or leveraging information from multiple sites and their attributes simultaneously.
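As a minimal illustration of the emulator idea described above (not the study's actual setup: the model runner, sample sizes, and objective here are hypothetical placeholders), an emulator can be trained on a modest number of expensive model evaluations and then searched cheaply in place of the model:

```python
# Minimal sketch of emulator-based calibration (illustrative only).
# `run_land_model` is a hypothetical stand-in for a process-based land model
# returning an error metric (e.g., RMSE of latent heat) for a parameter set.
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(42)

def run_land_model(params):
    # Hypothetical placeholder: a cheap synthetic objective surface.
    return np.sum((params - 0.3) ** 2)

# 1) Sample parameter sets and evaluate the expensive model once per sample.
n_samples, n_params = 200, 5
X = rng.uniform(0.0, 1.0, size=(n_samples, n_params))
y = np.array([run_land_model(p) for p in X])

# 2) Train an emulator that maps parameters to the objective value.
emulator = RandomForestRegressor(n_estimators=300, random_state=0).fit(X, y)

# 3) Search the emulator (not the model) for promising parameters.
candidates = rng.uniform(0.0, 1.0, size=(10_000, n_params))
best = candidates[np.argmin(emulator.predict(candidates))]
print("emulated optimum:", best, "true objective:", run_land_model(best))
```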

Our results show that the calibrated simulations outperformed the default parameters and the simple benchmarks used in PLUMBER 2, demonstrating the potential to improve process-based models. Moreover, we observed that traditional calibration methods have a tendency to overfit: these traditional calibration methods can achieve high performance during calibration but are unable to achieve similar results during validation. The emulator-based methods achieve more consistent results across both calibration and validation time periods. Additionally, we found that parameter estimation methods that incorporate information from multiple sites simultaneously achieve better spatial consistency than methods that only learn from one site at a time. These results suggest that the performance gap between LSTM and process-based models can be significantly narrowed through calibration.

 

How to cite: Aguirre, I., Knoben, W., Vasquez, N., and Clark, M.: Benchmarking machine learning-based emulators and traditional methods to calibrate land model parameters for 124 global flux tower sites, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-14172, https://doi.org/10.5194/egusphere-egu26-14172, 2026.

Land hydrology is a fundamental part of the global water cycle and, as such, of Earth’s climate system, including the biosphere. Yet this basic component is still poorly represented in current models, partly because land structure features scales much smaller than those models can resolve, but also because of a limited understanding of below-ground processes that are not readily observable. Here we examine, from the perspective of what matters to the atmosphere on seasonal to centennial timescales, questions such as how groundwater and surface water shape water availability and how vegetation and ecosystems adapt to it, ultimately modulating land-surface fluxes and climate. How relevant are these processes, and what are we missing in current land-surface models?

How to cite: Miguez-Macho, G. and Fan, Y.: Land hydrology, water availability for ecosystems and land surface models: what are we missing? , EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-15491, https://doi.org/10.5194/egusphere-egu26-15491, 2026.

Human interactions with the water cycle are increasingly recognised as critical drivers of land-climate feedbacks, yet they have long been under-represented in climate modelling.  With ongoing climate change, water management strategies and irrigation practices are becoming more important across many parts of the world. Since these activities can significantly alter surface energy and water fluxes, and thus local and regional climate, it is important to study these processes in more detail.

Although some Earth system models and regional climate models have started to incorporate irrigation routines, they still lack a representation of water availability from different sources and the competing demands of other sectors. To address this gap, we are developing the flexible water modelling tool C-CWatM that can be easily coupled with existing (regional) climate models. Based on the socio-hydrological model CWatM, it simulates river discharge, groundwater, reservoirs and lakes, as well as water demand and consumption from industry, households and agriculture.

In this contribution, we present initial results from coupled simulations using C-CWatM and the regional climate model REMO to study the impact of large-scale irrigation on regional climate conditions. The coupling is implemented via the OASIS3-MCT coupler, which manages synchronised data exchange and regridding of coupling fields. REMO provides the forcing fields required by C-CWatM and receives irrigation water amounts from C-CWatM, which are then applied within REMO's irrigation scheme. 
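A conceptual sketch of such a sequential exchange loop is given below. It does not use the real OASIS3-MCT API; `remo_step`, `cwatm_step`, the fields, and the coupling interval are hypothetical stand-ins for the exchange that the coupler mediates:

```python
# Conceptual sketch of a sequential model-coupling loop (not the OASIS3-MCT
# API). All function names, fields, and numbers are hypothetical placeholders.
import numpy as np

def remo_step(irrigation):          # climate model: consumes irrigation water
    precip = np.maximum(0.0, np.random.randn(10, 10))
    evap = 0.5 * np.ones((10, 10)) + 0.1 * irrigation
    return {"precip": precip, "evap": evap}

def cwatm_step(forcing):            # hydrology model: returns irrigation supplied
    available = np.clip(forcing["precip"] - forcing["evap"], 0.0, None)
    return 0.2 * available          # irrigation water actually withdrawn

irrigation = np.zeros((10, 10))
for step in range(24):              # one exchange per coupling interval
    forcing = remo_step(irrigation)     # REMO -> C-CWatM: meteorological forcing
    irrigation = cwatm_step(forcing)    # C-CWatM -> REMO: irrigation amounts
```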

The development and coupling of C-CWatM allows climate models to realistically account for irrigation constraints, which is particularly important in water-scarce regions and under the increasing risk of droughts driven by climate change. Thus, our approach is an important step towards next-generation land surface modelling and promotes collaboration between hydrology and climate modelling communities to advance understanding of land-climate feedbacks and inform future adaptation strategies.

How to cite: Schmitt, A. and Greve, P.: Irrigation–climate feedbacks in coupled climate simulations: First results using an integrated hydrological modelling tool, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-17003, https://doi.org/10.5194/egusphere-egu26-17003, 2026.

EGU26-17882 | ECS | Posters on site | CL5.8 | Highlight

Rapid Forecasting Method for Flood Process by Using on Physically Based Numerical and AI Model 

Xinxin Pan and Jingming Hou

With the acceleration of urbanization, complex underlying surfaces, pipe networks, river channels, and hydraulic facilities (gates, sluices, pumps) have significantly increased the number of computational grids and physical processes, making the computational efficiency of physically based rainfall-runoff models insufficient for the timeliness requirements of flood emergency management. This calls for new technologies to enhance the computational efficiency of flood simulation and forecasting models, and the development of AI provides new approaches for rapid flood disaster simulation and forecasting. This study presents three developments to address these challenges: first, a GPU-accelerated model for surface water flow and associated transport; second, an AI-based rapid prediction method for flood processes; and third, an application of the model to dam-break flood simulation.

How to cite: Pan, X. and Hou, J.: Rapid Forecasting Method for Flood Process by Using on Physically Based Numerical and AI Model, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-17882, https://doi.org/10.5194/egusphere-egu26-17882, 2026.

EGU26-18864 | ECS | Posters on site | CL5.8

Global amplification of water whiplash revealed by terrestrial water storage 

Yuheng Yang and Ruiying Zhao

Hydroclimate volatility, characterized by abrupt transitions between dry and wet extremes, poses a growing threat to global water security. Yet, current understanding of these transitions largely relies on meteorological metrics, which often fail to capture the full complexity of hydrological processes, land surface memory, and human water management. Here, we present a global assessment of water whiplash through the lens of terrestrial water storage (TWS). By integrating hydrological modeling with data-driven approaches, we reconstructed a comprehensive long-term TWS dataset to identify these events and account for delayed hydrological responses. Our results reveal a widespread intensification of global water whiplash in recent decades, with a substantial further increase projected under high-warming scenarios. Attribution analysis indicates that while climate change acts as the dominant driver of this amplification, human water management plays a critical role in spatially modulating these events, capable of either significantly mitigating or exacerbating local volatilities. We identify key hotspots of intensification in the tropics and high latitudes, encompassing extensive agricultural regions and major river basins. These findings establish TWS as a vital integrative indicator for monitoring abrupt hydrological transitions and underscore the urgent need for adaptive water management strategies to navigate an increasingly volatile hydroclimate.

How to cite: Yang, Y. and Zhao, R.: Global amplification of water whiplash revealed by terrestrial water storage, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-18864, https://doi.org/10.5194/egusphere-egu26-18864, 2026.

EGU26-19214 | ECS | Orals | CL5.8

Introducing Groundwater Dynamics into the ECLand Land Surface Model: Implementation and Effects 

Vincenzo Senigalliesi, Andrea Alessandri, Stefan Kollet, and Simone Gelsinari

Land surface models still lack a realistic representation of groundwater, often relying on a free-drainage condition at the bottom of the unsaturated soil column, as in the current version of ECLand. This unrealistic assumption effectively places the groundwater at infinite depth below the surface, limiting the model’s ability to simulate realistic soil–vegetation–groundwater interactions.

To address this limitation, we implemented a Dirichlet boundary condition at the bottom of the unsaturated soil column, enabling a fully implicit numerical scheme for coupling with groundwater. First, we prescribed the water table depth (WTD) using global-scale estimates to allow the computation of realistic water fluxes between the unsaturated zone and the underlying aquifer. In a second step, a dynamic WTD (hereafter the DYN configuration) was developed by introducing a prognostic storage for the unconfined aquifer, which evolves according to drainage (groundwater recharge) and subsurface runoff (groundwater discharge).
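A minimal numerical sketch of such a prognostic aquifer storage is shown below, assuming a simple linear-reservoir closure for discharge; the constants and the storage-to-WTD conversion are illustrative, not ECLand's actual formulation:

```python
# Minimal sketch of a prognostic unconfined-aquifer storage update, assuming
# storage S [m] gains groundwater recharge R and loses discharge Q = S * dt/tau.
# The linear-reservoir closure and all values are illustrative only.
dt = 86400.0            # time step [s]
tau = 45 * 86400.0      # linear-reservoir e-folding time [s]
sy = 0.1                # specific yield [-], converts storage to a depth proxy
S = 0.5                 # aquifer storage [m]

for day in range(365):
    R = 2e-3 if day % 10 == 0 else 1e-4   # recharge pulse every 10th day [m/step]
    Q = S * dt / tau                      # groundwater discharge [m/step]
    S = max(S + R - Q, 0.0)

wtd_proxy = S / sy      # diagnosed saturated-column proxy for the WTD [m]
print(f"storage={S:.3f} m, diagnosed WTD proxy={wtd_proxy:.2f} m")
```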

The effects of these developments were first evaluated through offline land-only simulations forced by station data from the PLUMBER2 project, which includes observational networks such as FLUXNET2015, La Thuile, and OzFlux. We validated the DYN configuration against the model setup with free-drainage conditions (CTRL). Our results show a systematic improvement in both latent and sensible heat fluxes, as quantified by reductions in the error metrics across most stations, with runoff showing the best performance.

The results of the global simulations largely corroborate and expand upon those of the station-based evaluation experiments conducted using PLUMBER2. The DYN configuration provides a more accurate representation of WTD, both spatially and temporally. This is evident in global climatological maps and independent observational datasets. Additionally, latent and sensible heat fluxes are consistently better represented in DYN than in CTRL, showing closer agreement with DOLCE and GLEAM products. Improvements are also evident in runoff simulations, with DYN exhibiting greater consistency with GLOFAS observations. Model performance was further evaluated against multiple observational datasets, such as GRACE/GRACE-FO to verify temporal variability in total water storage and to assess long-term mean conditions.

This work demonstrates that incorporating groundwater dynamics significantly improves the realism of land-surface processes, particularly the representation of water and energy exchanges with other components. These results provide a foundation for enhancing the representation of land-climate interactions and hydroclimatological behaviour in the next generation of reanalyses and climate predictions.

How to cite: Senigalliesi, V., Alessandri, A., Kollet, S., and Gelsinari, S.: Introducing Groundwater Dynamics into the ECLand Land Surface Model: Implementation and Effects, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-19214, https://doi.org/10.5194/egusphere-egu26-19214, 2026.

EGU26-19820 | Posters on site | CL5.8

 Surface Soil Moisture–Vegetation Feedbacks in Water-Limited Regions across Land Surface Models 

Andrea Alessandri, Marco Possega, Annalisa Cherchi, Emanuele Di Carlo, Souhail Boussetta, Gianpaolo Balsamo, Constantin Ardilouze, Gildas Dayon, Franco Catalano, Simone Gelsinari, Christian Massari, and Fransje van Oorschot

Soil moisture plays a critical role in water-limited regions through its strong coupling and feedbacks with vegetation. However, state-of-the-art Land Surface Models (LSMs) used in reanalysis and near-term prediction systems still lack a realistic coupling with vegetation, limiting their ability to account for the fundamental role of vegetation in modulating feedbacks with soil moisture.
In this study, we incorporate Leaf Area Index (LAI) variability from observations - derived from the latest-generation satellite products provided by the Copernicus Land Monitoring Service - into three different LSMs. The models perform a coordinated set of offline, land-only simulations forced by hourly atmospheric fields from the ERA5 reanalysis. An experiment using interannually varying LAI (SENS) is compared with a control simulation based on climatological LAI (CTRL) in order to quantify vegetation feedbacks and their impact on simulated near-surface soil moisture.
Our results show that interannually varying LAI substantially affects near-surface soil moisture anomalies across all three models and over the same water-limited regions. However, the response differs markedly among models. Compared with ESA-CCI observations, near-surface soil moisture anomalies significantly improve in one model (HTESSEL–LPJ-GUESS), whereas the other two models (ECLand and ISBA–CTRIP) exhibit a significant degradation in anomaly correlation. The improved performance in HTESSEL–LPJ-GUESS is attributed to the activation of a positive soil moisture–vegetation feedback enabled by its effective vegetation cover (EVC) parameterization. In HTESSEL–LPJ-GUESS, EVC varies dynamically with LAI following an exponential relationship constrained by satellite observations. Enhanced (reduced) soil moisture limitation during dry (wet) periods leads to negative (positive) LAI and EVC anomalies, which in turn generate a dominant positive feedback on near-surface soil moisture by increasing (decreasing) bare-soil exposure to direct evaporation from the surface. In contrast, ECLand and ISBA–CTRIP prescribe EVC as a fixed parameter that does not respond to LAI variability, preventing the activation of this positive feedback. In these models, the only active feedback on near-surface soil moisture anomalies is negative and arises from reduced (enhanced) transpiration associated with negative (positive) LAI anomalies.
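For reference, the exponential EVC-LAI relationship mentioned above can be written in the common Lambert-Beer-type form (our notation; the coefficient k is the parameter constrained by satellite observations):

EVC = 1 - exp(-k * LAI)

Under this form, a negative LAI anomaly lowers EVC, increases bare-soil exposure, and enhances direct soil evaporation, which closes the positive feedback described above.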
Our findings demonstrate that simply prescribing observed vegetation properties in LSMs does not guarantee a realistic coupling between vegetation and soil moisture. Instead, it is shown that the explicit representation of the underlying vegetation processes is essential to activate the proper feedback and capture the correct soil moisture response.

How to cite: Alessandri, A., Possega, M., Cherchi, A., Di Carlo, E., Boussetta, S., Balsamo, G., Ardilouze, C., Dayon, G., Catalano, F., Gelsinari, S., Massari, C., and van Oorschot, F.:  Surface Soil Moisture–Vegetation Feedbacks in Water-Limited Regions across Land Surface Models, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-19820, https://doi.org/10.5194/egusphere-egu26-19820, 2026.

The plant litter layer, a critical interface between the atmosphere and soil, regulates energy, water, and carbon exchanges, yet its thermal insulation effects are poorly represented in Earth System Models (ESMs). This omission hampers our ability to accurately simulate the climate-hydrology-ecosystem nexus, particularly in cold regions where soil thermal regimes control freeze-thaw processes, hydrology, and biogeochemical cycles. To address this gap, we integrated a dynamic litter layer with explicit thermal properties into the Noah-MP land surface model. Validation against global flux tower sites confirms significant improvements in simulating soil temperature and moisture.
Our results reveal that litter insulation creates a strong seasonal asymmetry in soil temperatures, inducing a net annual cooling (up to –0.69 °C) because summer cooling outweighs winter warming. Furthermore, it fundamentally alters soil freeze-thaw processes (FTP), with divergent impacts: it delays the freezing end date in permafrost regions while advancing it in seasonally frozen ground, with shifts of up to 40 days. The strongest modulation of freezing duration (~100 days) occurs in regions with a mean annual temperature near 10 °C. We identify six distinct FTP response modes, controlled by the non-linear interplay between climate, litter thickness, and snow depth. The altered thermal and hydrological states feed back to ecosystem processes, offsetting greening-driven gains in gross primary productivity by 20.57 ± 3.65 g C m⁻² yr⁻¹ while enhancing forest soil organic carbon stocks by 2.08 ± 0.24 kg C m⁻².
These findings demonstrate that the litter layer is a key biogeophysical mediator, directly coupling vegetation dynamics with soil thermal-hydrological states. Explicitly representing this process in ESMs is therefore essential for advancing the simulation of the carbon-water-energy nexus, improving projections of permafrost thaw, ecosystem feedbacks, and hydrological changes under vegetation greening and climate warming.

How to cite: Huang, P., Wang, G., and Valentini, R.: Representing Plant Litter Insulation in Land Surface Models: A Critical Process for Simulating the Soil Thermal-Hydrological-Ecological Nexus in Cold Regions, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-22297, https://doi.org/10.5194/egusphere-egu26-22297, 2026.

EGU26-107 | ECS | Orals | HS3.6

Machine Learning Integration Strategies for Process-based Ecohydrological Modeling: Addressing Epistemic Uncertainties of Water Mixing Dynamics in Tree Water 

Hyekyeng Jung, Chris Soulsby, Songjun Wu, Christian Birkel, and Dörthe Tetzlaff

Compared to process-based models (PBMs), higher prediction accuracy of machine learning models (MLMs) has been repeatedly reported in ecohydrological research. This might indicate the higher efficiency of data-driven MLMs for extracting and generalising information from the data, especially when traditional PBMs are often challenged by epistemic uncertainties in process representation. To preserve ‘modelling as a learning tool’, integrating MLMs into PBMs is a promising avenue to leverage MLMs for data assimilation, and PBMs for holistic explainability of processes across the Critical Zone (i.e., the thin crust of the Earth including vegetation).
One example of an ecohydrological process with high epistemic uncertainties is the mixing of root-uptake water from soils by trees. Due to limited process understanding, together with high uncertainties of isotope measurements in trees, mixing dynamics in tree water storage are usually poorly represented in ecohydrological models.
Here, we use data from a comprehensive monitoring campaign conducted during the 2020 growing season at a plot site with two willow trees and grass in southeastern Berlin, Germany, including daily or sub-daily in-situ measurements of hydrological characteristics and stable water isotopes in precipitation, soils, vegetation, and neighboring open water bodies. Using these data, a baseline ecohydrological PBM (EcoHydroPlot) was used to simulate water flow and isotope dynamics across the Critical Zone. In addition, MLMs with different integration strategies were applied: first, as an additional module to the PBM, a post-hoc MLM was trained on the PBM's prediction errors; second, a hybrid model was built that replaces the PBM's equations for the mixing of root-uptake water with a data-driven ML algorithm. An eXplainable AI (XAI) tool was applied to help understand uncertainties in the PBM and process representation in the MLM.
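The first strategy can be illustrated with a minimal residual-learning sketch (synthetic data; `pbm_prediction` is a hypothetical stand-in for EcoHydroPlot output, and the regressor choice is ours, not the study's):

```python
# Minimal sketch of a post-hoc ML model trained on the residuals of a
# process-based model (PBM). All data here are synthetic placeholders.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(0)
X = rng.uniform(size=(500, 4))                     # forcings / state variables
truth = X[:, 0] + np.sin(3 * X[:, 1])              # "observed" isotope signal
pbm_prediction = X[:, 0]                           # PBM captures only part of it

residual = truth - pbm_prediction                  # epistemic error of the PBM
corrector = GradientBoostingRegressor().fit(X, residual)

hybrid = pbm_prediction + corrector.predict(X)     # PBM + learned correction
print("PBM RMSE:   ", np.sqrt(np.mean((truth - pbm_prediction) ** 2)))
print("Hybrid RMSE:", np.sqrt(np.mean((truth - hybrid) ** 2)))
```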
By comparing these approaches using different criteria of prediction accuracy and interpretability, we identified an optimal strategy for leveraging MLM capabilities within PBM frameworks in addressing the process of tree water mixing with high epistemic uncertainties, potentially extending the concept of ‘modeling as a learning tool’ to MLM-integrated PBMs.

How to cite: Jung, H., Soulsby, C., Wu, S., Birkel, C., and Tetzlaff, D.: Machine Learning Integration Strategies for Process-based Ecohydrological Modeling: Addressing Epistemic Uncertainties of Water Mixing Dynamics in Tree Water, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-107, https://doi.org/10.5194/egusphere-egu26-107, 2026.

EGU26-928 | ECS | Posters on site | HS3.6

Sensitivity of machine-learning crop-type mapping to feature selection and hyper-parameter tuning. 

Mayra Perez, Frédéric Satgé, Jorge Molina, Renaud Hostache, Ramiro Pillco, Elvis Uscamayta, Diego Tola, Lautaro Bustillos, and Celine Duwig

To improve crop yields and economic income, farmers continually adapt their practices to climate and market fluctuations, resulting in highly variable crop field distribution and coverage in space and time. As these dynamics illustrate, up-to-date crop-type mapping is essential to understanding farmers’ needs and supporting them in adopting sustainable practices. With global coverage and frequent observations, remote sensing data are commonly integrated into machine learning models to monitor crop-type dynamics. Unlike physically based models, whose use is relatively straightforward, the implementation of machine-learning approaches depends on many user choices. In this context, this study assesses the sensitivity of model outputs to feature selection and hyper-parameter calibration, both of which rely on user decisions. To do so, Sentinel-1 (S1) and Sentinel-2 (S2) features are integrated into five distinct models (RF, SVM, LGB, HGB, XGB), considering different feature selection methods (VIF and SFS) and hyper-parameter calibration set-ups. Results show that the pre-processing (filter) VIF feature selection discards features that the wrapped SFS feature selection retains, resulting in less reliable crop-type mapping than SFS. Additionally, hyper-parameter calibration is sensitive to the input features, and repeating it after feature selection improved the crop-type mapping. In this context, a three-step nested modelling set-up, comprising an initial hyper-parameter calibration followed by wrapped feature selection (SFS) and a second hyper-parameter calibration, leads to the most reliable model outputs. Across the considered region, LGB and XGB (SVM) are the most (least) suitable models for crop-type mapping, and model reliability improved when S1 and S2 features were integrated together rather than using S1 or S2 alone. Finally, crop-type maps are derived across different regions and periods to highlight the benefits of the proposed method for monitoring crop dynamics in space and time.
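The contrast between filter-style VIF screening and wrapped SFS can be sketched as follows (synthetic stand-ins for the S1/S2 features; the VIF threshold and model settings are illustrative, not the study's configuration):

```python
# Minimal sketch contrasting filter-style VIF screening with wrapped SFS.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SequentialFeatureSelector
from statsmodels.stats.outliers_influence import variance_inflation_factor

X, y = make_classification(n_samples=400, n_features=12, n_informative=5,
                           random_state=0)

# VIF screening: drop features whose variance inflation factor exceeds 10.
vif = [variance_inflation_factor(X, i) for i in range(X.shape[1])]
keep_vif = [i for i, v in enumerate(vif) if v < 10]

# Wrapped SFS: let the classifier itself judge which features help.
rf = RandomForestClassifier(n_estimators=200, random_state=0)
sfs = SequentialFeatureSelector(rf, n_features_to_select=5, cv=3).fit(X, y)
keep_sfs = list(np.flatnonzero(sfs.get_support()))

print("VIF keeps:", keep_vif)
print("SFS keeps:", keep_sfs)   # the two sets generally differ
```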

How to cite: Perez, M., Satgé, F., Molina, J., Hostache, R., Pillco, R., Uscamayta, E., Tola, D., Bustillos, L., and Duwig, C.: Sensitivity of machine-learning crop-type mapping to feature selection and hyper-parameter tuning., EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-928, https://doi.org/10.5194/egusphere-egu26-928, 2026.

Hydrological modelling is essential for water resource management, decision making, extreme event forecasting, and for advancing an integrated understanding of the water cycle. Two main approaches dominate: physics-based (or process-based) models, which simulate hydrological processes such as streamflow using fundamental physical equations, and data-driven models, which use statistical or machine learning techniques to map inputs to outputs. Although Artificial Intelligence (AI) techniques have shown promising predictive accuracy, particularly in data-rich basins, their inherently black-box nature raises concerns about whether their internal representations align with real hydrological processes. This is especially critical when models are applied to extreme events, non-stationary conditions, or scenarios beyond the training distribution, where high performance metrics alone may not guarantee reliable or physically meaningful predictions.

In this study, we evaluated the performance of a Long Short-Term Memory (LSTM) model for drought modelling and assessed how well it represents real-world hydrological behavior in the Rio Grande do Sul watersheds available in the Catchment Attributes and Meteorology for Large-sample Studies (CAMELS-BR) dataset. The focus on these basins is particularly relevant given the region's hydrological importance, susceptibility to extreme events (e.g., droughts and floods), and distinct characteristics compared to the temperate regions where most legacy models were developed. The model was trained using data from 55 basins across the state. This multi-basin approach allows the LSTM to learn general hydrological patterns while retaining the ability to predict low-flow conditions in individual watersheds. The model inputs combined dynamic hydrological variables (e.g., precipitation and evapotranspiration) with static catchment attributes (e.g., aridity, soil properties, and topography). Accumulated rainfall features were constructed over 3-30 day windows to capture watershed memory effects as a proxy for soil moisture dynamics.

In addition, Explainable AI (XAI) techniques, together with hydrological signatures (e.g., runoff ratio, baseflow index, and elasticity), were applied to assess the physical soundness of the LSTM model in the region. The internal structure of the LSTM, particularly the cell states, was then analyzed and compared with hydrological behavior (e.g., soil water accumulation, groundwater dynamics, rainfall inputs) in cases where XAI and hydrological signatures did, or did not, indicate physical consistency. The LSTM’s effectiveness in Brazilian watersheds highlights its potential as a complementary tool for low-flow and drought modelling, offering a valuable alternative for water resources management. XAI analyses and hydrological signatures supported the physical soundness of the multi-basin model, but also indicated that improvements are needed, as the internal structure did not consistently track physical hydrological behavior in some cases, hindering extrapolation of the LSTM model to assess drought conditions in different meteorological settings (e.g., climate change scenarios).
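The accumulated-rainfall feature construction mentioned above can be sketched as rolling precipitation sums (synthetic daily series; the window set and column names are illustrative):

```python
# Minimal sketch of accumulated-rainfall features: rolling precipitation
# sums over 3-30 day windows as a soil-moisture proxy. Data are synthetic.
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)
precip = pd.Series(rng.gamma(0.4, 8.0, size=3650),
                   index=pd.date_range("2010-01-01", periods=3650, freq="D"),
                   name="precip_mm")

features = pd.DataFrame({
    f"precip_sum_{w}d": precip.rolling(window=w, min_periods=w).sum()
    for w in (3, 7, 14, 30)
}).dropna()
print(features.head())
```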

How to cite: Canellas, E., Perdigão, R., Brentan, B., and Rodrigues, A.: Beyond Accuracy: Trustworthy LSTM-Based Hydrological Modelling Assessed with XAI and Hydrological Signatures — A Case Study in Rio Grande do Sul, Brazil, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-1116, https://doi.org/10.5194/egusphere-egu26-1116, 2026.

EGU26-1275 | ECS | Posters on site | HS3.6

Diffusion-Based Physics-Aware Modeling of Subsurface Soil Moisture 

Vidhi Singh, Abhilash Singh, and Kumar Gaurav

Accurate characterization of soil moisture at subsurface depths is essential for hydrological modeling, agricultural management, and climate risk assessment. However, in-situ subsurface measurements remain sparse and often discontinuous due to logistical and operational constraints, especially in data-limited regions. This creates a pressing need for approaches that can reliably infer deeper soil moisture states from surface observations, which are more readily available from both remote sensing platforms and ground-based sensors. This study proposes a probabilistic, physics-aware denoising diffusion model designed to estimate soil moisture at subsurface depths using only surface moisture measurements. The model integrates smoothness and curvature regularization terms inspired by Fickian diffusion theory as weak physics to guide the learning process, without requiring explicit or site-specific physical parameters, thereby enhancing its practicality and ensuring broader applicability across diverse hydroclimatic conditions. The model is trained and evaluated across 20 global ISMN (International Soil Moisture Network) sites at 10, 20, and 40 cm depths with hourly observations spanning six distinct Köppen–Geiger climate classes, and at four high-resolution African stations with 10-minute data.
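The "weak physics" penalty idea can be sketched as finite-difference smoothness and curvature terms added to a standard loss. This illustrates only the regularizer, not the diffusion model itself, and the weights are arbitrary assumptions:

```python
# Minimal sketch of weak-physics terms: finite-difference smoothness and
# curvature penalties on a predicted depth profile (illustrative weights).
import torch

def physics_regularizer(theta, lam_smooth=1e-2, lam_curv=1e-3):
    """theta: (batch, n_depths) predicted soil moisture at increasing depths."""
    d1 = theta[:, 1:] - theta[:, :-1]          # first difference ~ d(theta)/dz
    d2 = d1[:, 1:] - d1[:, :-1]                # second difference ~ curvature
    return lam_smooth * d1.pow(2).mean() + lam_curv * d2.pow(2).mean()

pred = torch.rand(8, 3, requires_grad=True)    # e.g. depths 10, 20, 40 cm
loss = torch.nn.functional.mse_loss(pred, torch.rand(8, 3)) \
       + physics_regularizer(pred)
loss.backward()                                # gradients include the penalty
```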

Across global stations, the model demonstrated consistently high predictive skill (R² ranging from 0.91 to 0.99), with lower errors in climates characterized by stable seasonal patterns and comparatively higher uncertainty in regions affected by freeze-thaw dynamics or monsoonal variability. Benchmarking against 17 state-of-the-art algorithms using Dolan–Moré profiles showed strong and reliable performance across depths and metrics. A stochastic robustness analysis with 30 random seeds and varying ensemble sizes indicated that moderate-sized ensembles provide an effective balance between accuracy and stability. Sensitivity experiments with white, autocorrelated, and structured noise revealed that the 20 cm layer is most susceptible to surface-level perturbations, while deeper layers remain comparatively resilient. The model also showed strong performance on higher-resolution datasets, with prediction errors tightly centered around zero and exhibiting very low standard deviation. The generalisation of the proposed diffusion-based model across spatial, temporal, and climatic variability highlights its potential as a lightweight and transferable alternative for hydrological forecasting in data-scarce or operationally constrained environments.

How to cite: Singh, V., Singh, A., and Gaurav, K.: Diffusion-Based Physics-Aware Modeling of Subsurface Soil Moisture, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-1275, https://doi.org/10.5194/egusphere-egu26-1275, 2026.

The Yellow River Basin (YRB) is among the most water-scarce, sediment-laden, and anthropogenically impacted river basins worldwide. Rainfall–runoff and runoff–sediment relationships in the YRB have traditionally been investigated using process-based hydrological models, which are computationally demanding and difficult to apply at large spatial scales. Here, a physics-guided LSTM–GNN (Long Short-Term Memory and Graph Neural Network) framework is proposed to simulate coupled water–sediment processes across the YRB. Using sub-basin delineation and upstream–downstream connectivity derived from the physically based Geomorphology-Based Ecohydrological Model (GBEHM), the framework employs an LSTM to learn local runoff and sediment generation within individual sub-basins, and a GNN to represent topology-constrained routing along the river network. The coupled model generated monthly streamflow and sediment data for 718 sub-basins over the period 1982–2017. Compared with a baseline model that neglects the physical river-network topology (total NSE_flow = 0.78, NSE_sediment = 0.62; median NSE_flow = 0.09, NSE_sediment = 0.13), the proposed framework demonstrated significantly improved predictive performance (total NSE_flow = 0.89, NSE_sediment = 0.85; median NSE_flow = 0.42, NSE_sediment = 0.32) during the test period (2013–2017), especially at stations in large tributaries and the main stream with high connectivity and large catchment areas. These results show that the proposed LSTM–GNN framework can effectively serve as a surrogate for the process-based model with high accuracy, highlighting its potential for simulating upstream–downstream coupled hydrological processes in very large river basins.
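The topology-constrained routing step can be sketched as accumulation over a directed sub-basin graph in topological order (a toy five-node network; in the framework, the local inputs come from the LSTM and the graph from GBEHM's delineation):

```python
# Minimal sketch of topology-constrained routing on a sub-basin graph:
# local runoff (random here, standing in for LSTM output) is accumulated
# downstream in topological order. Graph and values are illustrative.
import numpy as np

downstream = {0: 2, 1: 2, 2: 4, 3: 4, 4: None}   # edge: sub-basin -> receiver
local_runoff = np.random.rand(5)                  # local runoff per sub-basin

routed = local_runoff.copy()
for node in [0, 1, 2, 3, 4]:                      # topological order
    dst = downstream[node]
    if dst is not None:
        routed[dst] += routed[node]               # pass accumulated flow down

print("outlet flow at sub-basin 4:", routed[4])
```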

How to cite: Li, S., Yang, H., Wang, T., and Yang, D.: Coupled Water–Sediment Modelling in the Yellow River Basin Using a Physics-Guided LSTM–GNN Framework Incorporating River Network Topology, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-2784, https://doi.org/10.5194/egusphere-egu26-2784, 2026.

Vegetation mapping is a key step in wetland monitoring, management, and conservation. Remote sensing image classification offers an excellent solution for vegetation mapping due to its high temporal and spatial resolution. In spite of these advantages, remote sensing classification of wetland vegetation is usually limited to a small number of target classes and lacks an explanation of input feature importance. To address this limitation, this study presents a detailed wetland vegetation classification, followed by an explainability analysis.

The study was conducted in the Biebrza wetlands in NE Poland, covering approximately 220 km². These wetlands are situated around the Biebrza River, which floods yearly, producing a characteristic vegetation zonation. The training and validation data for the vegetation classification came from a vegetation survey conducted in 2015 and kindly provided by the Biebrza National Park.

The input features for classification were obtained by fusing VIS-IR data from Sentinel-2, thermal data from Landsat-8, and Synthetic Aperture Radar (SAR) data from Sentinel-1. The Sentinel-2 data consisted of four images (one per season), each with eleven bands. The Landsat-8 data also comprised four images, with one thermal band per image. The Sentinel-1 data included 24 dual-polarization (VV+VH) images (one per month, varying by ascending and descending orbit). All image data were acquired within the 2014-2017 period and resampled to 10 m spatial resolution.

The "ranger" Random Forest implementation in R was used as the classifier. The classifier was trained on a stratified random 50% of the vegetation data points and validated on the remaining 50%. The built-in permutation feature importance algorithm was used to indicate the most important bands for the classification.
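An analogous sketch in Python (the study itself used R's "ranger"; the data here are synthetic and the settings illustrative) shows the same train/validate split and permutation-importance ranking:

```python
# Analogous Python sketch of the described workflow: random-forest
# classification with a stratified 50/50 split and permutation importance.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=1000, n_features=20, n_informative=8,
                           n_classes=4, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, train_size=0.5, stratify=y,
                                          random_state=0)

rf = RandomForestClassifier(n_estimators=500, random_state=0).fit(X_tr, y_tr)
imp = permutation_importance(rf, X_te, y_te, n_repeats=10, random_state=0)
ranking = imp.importances_mean.argsort()[::-1]
print("accuracy:", rf.score(X_te, y_te), "top bands:", ranking[:5])
```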

The classification-based vegetation map closely reflected the characteristic vegetation zonation of the Biebrza wetlands. The overall accuracy was 0.994 and the Kappa index was 0.993. The most important band for the classification was the Landsat-8 thermal image from the winter season; the thermal bands from the remaining seasons were relatively unimportant. The next most important bands were the Sentinel-2 VIS-IR images from the spring and fall seasons, particularly the red, red-edge, and SWIR bands. The SAR data from Sentinel-1 were the least important of all the data used; the highest-ranked Sentinel-1 band (19th position) was VH from September on the descending orbit.

How to cite: Berezowski, T.: Explainable machine learning for detailed wetland vegetation classification using remote sensing data, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-5240, https://doi.org/10.5194/egusphere-egu26-5240, 2026.

EGU26-5408 | Orals | HS3.6

Global estimation of the median annual maximum flood (QMED) using explainable machine learning  

Valeriya Filipova, David Leedal, and Sam Clayton

Reliable estimation of the median annual maximum flood (QMED) is central to flood risk assessment and the design of hydraulic infrastructure, particularly in ungauged basins. Traditional index-flood approaches typically delineate homogeneous regions and estimate QMED using linear regression on a small set of catchment descriptors. However, these assumptions are often violated in practice, leading to substantial prediction uncertainty. 

Here, we explore the potential of explainable machine-learning models to estimate QMED at large scale. Using data from approximately 8,500 catchments and more than 60 climatic, physiographic, and geomorphological descriptors, we train non-linear models (XGBoost and TabNet) to predict QMED for ungauged basins. To promote physically plausible behaviour, model training incorporates constraints on specific discharge alongside standard performance metrics. A key feature of the approach is the extensive use of DEM-derived terrain and river-network descriptors, which can be computed consistently from widely available global elevation datasets. 
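One simple way to encourage physically plausible behaviour in such a model, sketched below with synthetic descriptors, is a monotonic constraint on selected features; this is an illustrative device, not necessarily the specific-discharge constraint used in the study:

```python
# Simplified sketch: gradient-boosted QMED regression on log-transformed
# targets with monotonic constraints (requires xgboost >= 1.3). The synthetic
# relationship and descriptor set are illustrative assumptions.
import numpy as np
from xgboost import XGBRegressor

rng = np.random.default_rng(0)
area = rng.uniform(10, 5000, 2000)              # catchment area [km^2]
rain = rng.uniform(500, 2500, 2000)             # mean annual precipitation [mm]
slope = rng.uniform(0.5, 30, 2000)              # DEM-derived descriptor
qmed = 0.01 * area**0.85 * (rain / 1000) ** 1.5 * np.exp(0.02 * slope)

X = np.column_stack([area, rain, slope])
model = XGBRegressor(n_estimators=400, max_depth=4,
                     monotone_constraints=(1, 1, 0))  # QMED rises with area, rain
model.fit(X, np.log(qmed))                            # train in log space
print("predicted QMED [m3/s]:", np.exp(model.predict(X[:3])))
```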

Model interpretability is addressed using global and local explainability techniques, enabling identification of the dominant controls on QMED and how their importance varies spatially. Across independent test data, the models show strong predictive skill (R² > 0.8, median absolute percentage error ~30%). Notably, in many regions models trained on large, globally diverse datasets outperform those trained solely on local data, even where substantial local records are available. 

These results indicate that combining globally consistent physiographic information with interpretable, non-linear machine-learning models offers a promising alternative to traditional regional regression methods for QMED estimation, with potential benefits for flood risk assessment in data-sparse regions. 

How to cite: Filipova, V., Leedal, D., and Clayton, S.: Global estimation of the median annual maximum flood (QMED) using explainable machine learning , EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-5408, https://doi.org/10.5194/egusphere-egu26-5408, 2026.

EGU26-5540 | ECS | Posters on site | HS3.6

A Deep Ensemble Learning Framework with Interpretability for Long-Term Streamflow Forecasting under Multiple Uncertainties 

Xinyuan Qian, Ping-an Zhong, Bin Wang, Yu Han, Yukun Fan, Yiwen Wang, Sunyu Xu, Zixin Song, and Mengxue Ben

Accurate and reliable long-term streamflow forecasting plays a crucial role in sustainable water resource management and risk mitigation. However, forecast performance is often constrained by multiple sources of uncertainty and the limited interpretability of deep learning models. To address these challenges, this study proposes an explainable hierarchical optimisation framework for long-term streamflow forecasting based on ensemble learning. The proposed framework systematically integrates a Dempster–Shafer (DS) evidence theory-based predictor selection strategy to reduce input uncertainty, an improved loss function designed to enhance model sensitivity to extreme flow events, and a Stacking ensemble scheme that combines the complementary strengths of multiple deep learning models, thereby overcoming the limitations of individual models in complex hydrological systems. In addition, SHapley Additive exPlanations (SHAP) are employed to improve model interpretability and to quantify the contributions of different predictors.
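The stacking scheme can be sketched with out-of-fold base predictions feeding a meta-learner (synthetic data and simple base learners here; the study's base models are MLP, LSTM, and Transformer networks):

```python
# Minimal sketch of stacking: base learners are combined by a meta-learner
# trained on their out-of-fold predictions to avoid information leakage.
from sklearn.datasets import make_regression
from sklearn.ensemble import StackingRegressor, RandomForestRegressor
from sklearn.linear_model import Ridge
from sklearn.neural_network import MLPRegressor

X, y = make_regression(n_samples=600, n_features=8, noise=5.0, random_state=0)

stack = StackingRegressor(
    estimators=[("mlp", MLPRegressor(hidden_layer_sizes=(32,), max_iter=2000,
                                     random_state=0)),
                ("rf", RandomForestRegressor(n_estimators=200, random_state=0))],
    final_estimator=Ridge(),   # meta-learner combining base predictions
    cv=5)                      # out-of-fold predictions for the meta-learner
print("stacked R^2:", stack.fit(X, y).score(X, y))
```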

The effectiveness of the proposed framework is demonstrated through long-term streamflow forecasting at Hongze Lake. The results indicate that: (1) the DS-based predictor selection method substantially enhances both forecasting accuracy and stability, with Nash–Sutcliffe efficiency (NSE) values increasing by 0.10–0.18; (2) the improved loss function significantly strengthens model robustness under extreme high-flow conditions, reducing the mean absolute percentage error (MAPE) by 63.11%, 55.33%, and 23.6% for the MLP, LSTM, and Transformer models, respectively; (3) the Stacking ensemble model consistently outperforms individual base models by reducing forecast errors (RMSE decreased by 17–25%), improving the representation of large-scale variability (MAPE reduced by 21.6–26.8%), and more accurately capturing streamflow dynamics (NSE increased by 0.12–0.20), effectively mitigating multi-source uncertainties; and (4) SHAP-based interpretability analysis reveals pronounced monthly variations in predictor importance and confirms the dominant influence of antecedent streamflow on long-term forecasts. Overall, the proposed framework markedly improves the accuracy, robustness, and transparency of long-term streamflow forecasting and shows strong potential for application in other data-driven hydrological forecasting tasks.

How to cite: Qian, X., Zhong, P., Wang, B., Han, Y., Fan, Y., Wang, Y., Xu, S., Song, Z., and Ben, M.: A Deep Ensemble Learning Framework with Interpretability for Long-Term Streamflow Forecasting under Multiple Uncertainties, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-5540, https://doi.org/10.5194/egusphere-egu26-5540, 2026.

EGU26-5830 | ECS | Posters on site | HS3.6

Global groundwater recharge estimation through hybrid modeling 

Jiaxin Xie, Zavud Baghirov, Markus Reichstein, and Martin Jung

Groundwater provides drinking water for billions of people and supports nearly half of irrigated agriculture, yet global renewable groundwater availability—quantified as groundwater recharge—remains highly uncertain. Here, we simulate global groundwater recharge using a hybrid model that seamlessly integrates machine learning with physical processes. The hybrid model substitutes machine learning for poorly represented hydrological processes while retaining established physical constraints, such as the water balance. By leveraging diverse Earth system observations—including streamflow-derived groundwater discharge, satellite-retrieved terrestrial water storage anomalies, and flux tower evapotranspiration—the hybrid model effectively integrates process knowledge with multi-source data constraints to improve the accuracy of global groundwater recharge simulations. Such integration may also deepen our process understanding of groundwater recharge.
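A conceptual sketch of the hybrid principle is given below: a small network replaces one poorly known flux partitioning while mass balance holds by construction. The single-bucket closure and all names are illustrative assumptions, not the study's model:

```python
# Conceptual sketch of hybrid modeling: a neural network learns an uncertain
# flux partitioning; the water balance is closed exactly by construction.
import torch

net = torch.nn.Sequential(torch.nn.Linear(3, 16), torch.nn.Tanh(),
                          torch.nn.Linear(16, 1), torch.nn.Sigmoid())

def step(storage, precip, evap):
    drivers = torch.stack([storage, precip, evap], dim=-1)
    frac = net(drivers).squeeze(-1)                 # learned recharge fraction
    excess = torch.clamp(precip - evap, min=0.0)    # available excess water
    recharge = frac * excess
    runoff = excess - recharge
    storage_new = storage + precip - evap - runoff - recharge  # mass balance
    return storage_new, recharge

s = torch.ones(4)                                   # four example grid cells
s, r = step(s, torch.rand(4), 0.5 * torch.rand(4))
print("recharge:", r.detach().numpy())
```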

How to cite: Xie, J., Baghirov, Z., Reichstein, M., and Jung, M.: Global groundwater recharge estimation through hybrid modeling, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-5830, https://doi.org/10.5194/egusphere-egu26-5830, 2026.

Soil moisture is a fundamental hydrological variable that governs groundwater recharge and agricultural productivity. Accurate long-term forecasting is essential for water resource management, yet it remains challenging due to significant observational noise in sensor data and the error propagation inherent in traditional deep learning models. Physics-based models struggle with site-specific calibration, and Neural Ordinary Differential Equations (Neural ODEs) often fail to recover stable continuous dynamics from noisy, discretely sampled signals; there is therefore a clear need for a more robust forecasting framework.

In this work, we propose EulerNet, a pragmatic discrete-time framework designed for high-fidelity soil moisture prediction. Instead of attempting to reconstruct complex latent continuous-time vector fields, EulerNet explicitly models the fixed-step mapping required for operational forecasting. The architecture integrates an Euler-style residual update to parameterize one-step tendencies, ensuring numerical stability through its incremental integration form. To mitigate the impact of sensor noise, we incorporate a Random Synthesizer feature mixer. By employing input-independent alignment matrices rather than dynamic self-attention, the Random Synthesizer acts as an implicit regularizer, preventing the model from overfitting to spurious, noise-induced correlations.
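The Euler-style residual update can be sketched as follows (architecture details such as the Random Synthesizer mixer are omitted; dimensions and the rollout length are illustrative):

```python
# Minimal sketch of an Euler-style residual update: a network predicts a
# one-step tendency f(x_t) and the state advances as x_{t+1} = x_t + dt*f(x_t).
import torch

class EulerStep(torch.nn.Module):
    def __init__(self, dim, dt=1.0):
        super().__init__()
        self.f = torch.nn.Sequential(torch.nn.Linear(dim, 64), torch.nn.GELU(),
                                     torch.nn.Linear(64, dim))
        self.dt = dt

    def forward(self, x):
        return x + self.dt * self.f(x)       # incremental (residual) update

model = EulerStep(dim=1)
x = torch.tensor([[0.30]])                   # current soil moisture state
for _ in range(30):                          # one-month autoregressive rollout
    x = model(x)
print(x)
```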

We evaluated EulerNet using high-noise in-situ observations. In a one-month autoregressive rollout, the model achieved strong performance with R² = 0.7977, RMSE = 0.0039, and RMAE = 0.0083. These results demonstrate that, for fixed-step environmental forecasting, a specialized discrete-time formulation can effectively bypass the complexities of continuous-time modeling while maintaining high stability and accuracy under significant noise. Our findings provide a practical and efficient alternative for modeling complex Earth system dynamics from real-world observational data.

How to cite: Kang, W.: EulerNet: A Robust Discrete-Time Framework for Long-Term Soil Moisture Forecasting Under Significant Observational Noise , EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-8707, https://doi.org/10.5194/egusphere-egu26-8707, 2026.

Accurate rainfall-runoff analysis is vital for flood prediction, water resources management, and climate impact assessment. While data-driven hydrological models such as Long Short-Term Memory (LSTM) networks have shown promise, developing a globally applicable framework that is accurate, interpretable, and computationally efficient remains a grand challenge, primarily because most catchments worldwide are ungauged. We address this by employing HYdrologic Prediction with multi-model Ensemble and Reservoir computing (HYPER). This hybrid method combines Bayesian Model Averaging (BMA), a multi-model ensemble technique, with Reservoir Computing (RC), a type of machine learning model. The framework infers model weights for ungauged basins by linking catchment attributes to the model weights learned in gauged basins. While this model has previously demonstrated higher accuracy and lower uncertainty than LSTMs, particularly when training data are limited, its global applicability remained unassessed.

In this study, we therefore evaluate the global applicability of HYPER using a pseudo-ungauged approach, in which gauged basins are treated as ungauged for validation. We challenge the conventional assumption that more data is better by investigating whether selecting a strategic subset of gauged basins for training outperforms using the entire available dataset. Initial experiments revealed that prediction accuracy remained robust regardless of whether 90% or only 3% of available basins were used for training. Furthermore, training on basins from a single, hydrologically similar region often yielded higher accuracy than training on a diverse multi-regional dataset.

To identify the optimal training subset, we compared three data selection methods (see the sketch below): 1) Greedy selection, which identifies donor basins as the nearest neighbors in the static catchment attribute space; 2) Physics-Informed selection, which computes the distance between target and candidate basins while applying heavier penalty weights to slope and aridity to strictly enforce physical similarity; and 3) Meta-Learning, which uses a Random Forest to learn the relationship between attribute differences and model weight correlations, and then predicts the donor basins expected to have the highest weight correlation with the target. While all three methods outperformed the baseline of using all available data (Kling-Gupta Efficiency (KGE): 0.12), the Physics-Informed and Meta-Learning approaches achieved the highest consistency and accuracy. Even when only 5 out of 1,505 basins were used for training, these methods achieved KGE scores of 0.26 and 0.31, respectively, effectively bridging the performance gap toward fully gauged basins (KGE: 0.54). These findings demonstrate that, for global prediction in ungauged regions, data quality, especially the strategic selection of training basins, matters more than data quantity, marking a step towards robust, globally applicable runoff analysis.
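A minimal sketch of the first two donor-selection strategies, under the assumption of standardized attributes and with illustrative penalty weights:

```python
# Minimal sketch of attribute-space donor selection for a pseudo-ungauged
# basin: nearest neighbours ("greedy") versus a physics-informed variant
# that up-weights slope and aridity. Attributes and weights are illustrative.
import numpy as np

rng = np.random.default_rng(3)
attrs = rng.random((1505, 6))            # rows: basins; cols: standardized attrs
target = rng.random(6)                   # the pseudo-ungauged basin
SLOPE, ARIDITY = 2, 3                    # illustrative column indices

def donors(weights, k=5):
    d = np.sqrt((((attrs - target) ** 2) * weights).sum(axis=1))
    return np.argsort(d)[:k]             # the k most similar gauged basins

greedy = donors(np.ones(6))              # plain nearest neighbours
w = np.ones(6); w[[SLOPE, ARIDITY]] = 5  # heavier penalty on slope & aridity
physics = donors(w)
print("greedy donors:", greedy, "physics-informed donors:", physics)
```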

How to cite: Funato, M. and Sawada, Y.: Data Quality over Quantity: Optimized Data Selection for Data-driven Global Prediction in Ungauged Basins, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-9189, https://doi.org/10.5194/egusphere-egu26-9189, 2026.

EGU26-10856 | ECS | Orals | HS3.6

Calibration of a Long Short-Term Memory (LSTM) rainfall-runoff model using remote sensing soil water content estimations 

Tibor Rapai, Petra Baják, István Gábor Hatvani, András Lukács, and Balázs Székely

Long Short-Term Memory (LSTM) neural networks have proven their excellence in basin-level discharge prediction, provided there is an adequate amount of high-quality time series data available for training, including meteorological forcings and streamflow gauge measurements. Such data-driven black-box models can successfully learn the complex behavior of delayed hydraulic responses; however, they cannot yet be easily applied in water management practice, and model transfer attempts to ungauged catchments have not been entirely successful.

In our previous work, we explored an approach to characterizing near-surface flow regimes, starting from a full catchment model and then applying a single LSTM network layer within a semi-distributed subbasin setup reflecting downstream topography. Application to the Tarna River catchment area in Hungary (2,116 km2) showed that transfer learning from the full catchment model (achieving an NSE of 0.91 on the training set and 0.66 on an independent test set) to a downstream chain of gauged Hydrological Response Units (HRUs) is a powerful tool for investigating a semi-distributed HRU network. The entire setup, however, involves a much higher level of complexity, and the available detailed meteorological data and gauge measurements in only two-thirds of the subbasins did not provide sufficient information for the single LSTM model to fully predict the HRU network processes.

Because these models apply “virtual water amounts” stored in the hidden cells of the LSTM network for discharge estimation, their internal variables lack direct physical interpretability. In the present research, we investigate how data fusion during calibration with Gravity Recovery and Climate Experiment (GRACE) data, downscaled using soil water content and evapotranspiration products from the ECMWF Reanalysis (ERA5) database, can improve predictive performance, and help to verify our working hypothesis regarding the theoretical connection between Near Surface Water Content (NSWC) and LSTM cell states.

These results can also validate interpretations derived from our model concerning baseflow contributions and recharge-discharge classification of subbasins, while promising realistic transferability of the pre-trained lumped catchment model to all subbasins and broader general applicability of the proposed method. We hypothesize that the daily change dynamics of Terrestrial Water Storage (TWS) and NSWC – the latter playing a decisive role in gravitational flows within the Critical Zone – are strongly correlated.

Accordingly, we propose using downscaled TWS estimates to (1) introduce a new term into the loss function, based on our working hypothesis relating median LSTM cell-state values to the normalized dynamics of NSWC, and (2) add a new input dimension approximating total runoff as precipitation minus evapotranspiration and infiltration.
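Proposal (1) can be sketched as an auxiliary loss term; the correlation-based form below is our illustrative assumption of how relating median cell-state values to normalized NSWC dynamics might be encoded:

```python
# Illustrative sketch of an auxiliary loss penalizing disagreement between
# median LSTM cell-state dynamics and a normalized NSWC series (random here).
import torch

def cell_state_penalty(cell_states, nswc, weight=0.1):
    """cell_states: (time, hidden); nswc: (time,) normalized NSWC series."""
    c = cell_states.median(dim=1).values          # median over hidden units
    c = (c - c.mean()) / (c.std() + 1e-8)
    w = (nswc - nswc.mean()) / (nswc.std() + 1e-8)
    return weight * (1.0 - (c * w).mean())        # 1 - Pearson correlation

T, H = 100, 32
cell = torch.nn.LSTMCell(input_size=4, hidden_size=H)
x = torch.rand(T, 1, 4)                           # forcing time series
h, c = torch.zeros(1, H), torch.zeros(1, H)
states = []
for t in range(T):                                # unroll to expose cell states
    h, c = cell(x[t], (h, c))
    states.append(c)
loss = cell_state_penalty(torch.cat(states), torch.rand(T))
loss.backward()
```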

Furthermore, the current model extension, still based on 0.1° gridded input data, prepares the ground for future developments that incorporate high-spatial-resolution satellite remote sensing data, such as Sentinel-2 NDWI, to support local-scale hydrological applications efficiently. Integrating satellite data products with different temporal and spatial resolutions is not a straightforward calibration step for rainfall-runoff models, as pixel-wise normalization of measurements requires complex, physically based geostatistical methods compatible with the model logic to avoid performance deterioration.

 

How to cite: Rapai, T., Baják, P., Hatvani, I. G., Lukács, A., and Székely, B.: Calibration of a Long Short-Term Memory (LSTM) rainfall-runoff model using remote sensing soil water content estimations, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-10856, https://doi.org/10.5194/egusphere-egu26-10856, 2026.

EGU26-12259 | ECS | Posters on site | HS3.6

Global Rooting Depth Inferred based on Machine Learning 

Shekoofeh Haghdoost, Shujie Cheng, Oscar Baez-Villanueva, and Diego G. Miralles

Rooting depth (Zr) is a key variable controlling plant water uptake, soil–vegetation interactions, and land–atmosphere feedbacks. Despite its importance, global estimation of Zr remains challenging due to sparse in situ observations and strong spatial heterogeneity driven by climatic, edaphic, and vegetation controls. The interaction among these factors adds complexity, limiting the performance of traditional process-based models and leading to substantial uncertainty in large-scale applications. In this context, machine learning offers a data-driven alternative that can integrate heterogeneous datasets and capture nonlinear relationships and complex interactions among environmental variables, providing a flexible framework for improving large-scale estimates of rooting depth.

In this research, we investigate the environmental drivers of rooting depth at the global scale and develop a new spatially explicit Zr dataset using advanced machine learning methods. Our framework integrates multiple globally consistent datasets, including satellite-derived vegetation metrics (LAI, NDVI), land-surface temperature, and gridded climate variables (precipitation, radiation). These are complemented by soil hydraulic and physical attributes from global soil databases and detailed topographic information, providing a complete representation of environmental controls relevant to rooting depth. A Random Forest model is employed to capture the nonlinear relationships between the predictor set and observed rooting depths. Model interpretability is subsequently assessed using Shapley Additive exPlanations (SHAP), thereby quantifying the contribution of each environmental variable to model predictions.

The optimized model is subsequently applied at the global scale to generate a global Zr dataset using globally available plant, soil, and climate variables. By accounting for their combined effects, the model provides a spatially continuous representation of rooting depth across diverse regions. Model performance is evaluated using leave-one-out cross-validation (LOOCV), whereby each observation is iteratively excluded from the training dataset and used for independent validation. In addition, the resulting predictions are compared against existing global rooting depth datasets to evaluate large-scale consistency. The new Zr dataset enables improved drought monitoring capabilities through more realistic estimates of plant available water; it may enhance water resource assessments by refining infiltration and groundwater recharge estimates, and it helps reduce uncertainty in land surface and climate models by better representing soil-vegetation interactions. Overall, this work provides a robust data-driven approach for estimating Zr globally, independent of process-based assumptions, and relevant for diverse ecohydrological applications striving towards more accurate characterizations of terrestrial water and carbon cycling.
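The LOOCV protocol can be sketched as follows (synthetic predictors and target; the real model and predictor set are far richer):

```python
# Minimal sketch of leave-one-out evaluation: each observation is held out
# once and predicted by a model trained on all remaining samples.
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import LeaveOneOut, cross_val_predict

rng = np.random.default_rng(0)
X = rng.random((120, 6))                    # climate, soil, vegetation predictors
zr = 0.5 + 3.0 * X[:, 0] - 1.5 * X[:, 1] + 0.2 * rng.standard_normal(120)

rf = RandomForestRegressor(n_estimators=100, random_state=0)
pred = cross_val_predict(rf, X, zr, cv=LeaveOneOut())
rmse = np.sqrt(np.mean((pred - zr) ** 2))
print(f"LOOCV RMSE: {rmse:.3f} m")
```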

Keywords: rooting depth, machine learning, soil vegetation interactions, global hydrology, ecohydrology, Earth system modeling

How to cite: Haghdoost, S., Cheng, S., Baez-Villanueva, O., and G. Miralles, D.: Global Rooting Depth Inferred based on Machine Learning, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-12259, https://doi.org/10.5194/egusphere-egu26-12259, 2026.

EGU26-12581 | ECS | Posters on site | HS3.6 | Highlight

Causal Analysis for Model Evaluation in Large Sample Hydrology 

David Strahl, Urmi Ninad, Sebastian Gnann, Karoline Wiesner, and Thorsten Wagener

Hydrological and land surface models rely on strong prior assumptions about system functioning, including which processes are represented, their parametrization and how they are simplified across space and time. Model evaluation, however, is often based on measures of predictive performance that provide limited insights into whether models capture underlying processes correctly. Causal discovery methods offer a complementary perspective by learning causal interaction networks directly from time series data to reveal how system components influence each other. Here, we apply the PCMCI+ algorithm for causal discovery in combination with a causal effect estimation to hydrometeorological observations and model simulations from 671 U.S. catchments to infer monthly causal interaction networks and associated effect strengths. We show that inferred interaction strengths vary systematically across gradients of water and energy availability and reflect structural differences in how three hydrological models represent key processes of snow and evapotranspiration dynamics. Our results illustrate how causal inference can complement traditional model evaluation approaches in complex environmental systems by providing process-level insights that help bridge theory, observations, and models across disciplines.
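For readers unfamiliar with PCMCI+, a minimal sketch using the tigramite package (which implements the algorithm) is given below; the three-variable dataset is synthetic, and the ParCorr import path differs between tigramite versions.

    import numpy as np
    from tigramite import data_processing as pp
    from tigramite.pcmci import PCMCI
    from tigramite.independence_tests.parcorr import ParCorr  # tigramite >= 5 layout

    rng = np.random.default_rng(0)
    data = rng.standard_normal((500, 3))
    data[1:, 1] += 0.6 * data[:-1, 0]            # variable 0 drives variable 1 at lag 1

    frame = pp.DataFrame(data, var_names=["P", "Q", "T"])
    pcmci = PCMCI(dataframe=frame, cond_ind_test=ParCorr())
    results = pcmci.run_pcmciplus(tau_min=0, tau_max=3, pc_alpha=0.05)
    # results["graph"] holds the lagged link structure, results["val_matrix"] the strengths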

How to cite: Strahl, D., Ninad, U., Gnann, S., Wiesner, K., and Wagener, T.: Causal Analysis for Model Evaluation in Large Sample Hydrology, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-12581, https://doi.org/10.5194/egusphere-egu26-12581, 2026.

EGU26-17878 | Posters on site | HS3.6

Estimating the timing of the peak snowmelt floods in unregulated boreal catchments using machine learning techniques. 

Sadegh Kaboli, Ville Kankare, Cintia Bertacchi Uvo, Petteri Alho, Ali Torabi Haghighi, and Elina Kasvi

The timing of peak snowmelt floods in boreal environments has undergone significant changes, characterized by nonlinear and complex patterns. This timing determines when coastal areas of boreal rivers experience the greatest inundation during the spring season. It is highly sensitive to climate change and directly influences local fauna and flora. Despite its critical role in flood risk management, the prediction of spring flood timing, along with the identification of its key drivers and most influential factors, remains insufficiently studied in boreal regions.

In this study, we investigate the potential for predicting the timing of annual maximum snowmelt floods by applying a thermal definition of the spring season, along with various climatological and hydrological indices. The analysis is based on comprehensive daily datasets with record lengths of at least 50 years, beginning in the early 1960s and extending to 2023, across multiple unregulated Finnish catchments. Among the most important dynamic features are daily discharge records, high-resolution gridded temperature data, and atmospheric teleconnection indices. In addition, key static catchment characteristics, such as area, slope, and geographical position, are incorporated into the modeling process, along with other relevant variables.

Machine learning methods, combining Random Forest models with SHAP (SHapley Additive exPlanations) values for feature importance, are applied to identify the most influential factors shaping the timing of annual maximum snowmelt floods and to assess the overall predictability of these events across multiple catchments. The study introduces a novel approach using a thermal definition of spring. The findings provide new indices and actionable thresholds that can help identify areas where adaptation measures should be prioritized.

How to cite: Kaboli, S., Kankare, V., Bertacchi Uvo, C., Alho, P., Torabi Haghighi, A., and Kasvi, E.: Estimating the timing of the peak snowmelt floods in unregulated boreal catchments using machine learning techniques., EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-17878, https://doi.org/10.5194/egusphere-egu26-17878, 2026.

EGU26-17916 | ECS | Orals | HS3.6

Combining LSTMs with a Single-Model Large Ensemble for Runoff and Water Temperature Projections in Bavaria 

Alexander Sasse, Ralf Ludwig, Julius Weiß, and Kerstin Schütz

Both river runoff and river water temperature are undergoing highly dynamic alterations, posing a serious threat to aquatic ecosystems and water resources management under climate change. Data-driven models such as Long Short-Term Memory (LSTM) networks have demonstrated remarkable skill in hydrological prediction, yet their application under non-stationary climate conditions remains challenging due to limited generalization to unseen catchments and conditions beyond the training distribution. We address these challenges by combining LSTM architectures with single-model initial-condition large ensemble (SMILE) climate projections to assess non-stationary, non-linear hydrological responses considering the full range of internal climate variability and climate change, enabling robust assessment of rare and extreme events in Bavaria, Germany.

Our study builds on the ClimEx project, which provides a 50-member ensemble of climate simulations (1950–2099, RCP8.5 emission scenario) at 12 km resolution over Europe using the Canadian Regional Climate Model CRCM5.

We present two complementary application cases operating at daily and 3-hourly temporal resolution: i) For discharge prediction, we train an LSTM on observed runoff across 98 Bavarian catchments, validated against simulations from the process-based Water balance Simulation Model (WaSiM). The architecture processes dynamic meteorological forcings through stacked LSTM layers while incorporating static catchment attributes, using a composite loss function that balances performance across high and low flows. The trained model is then driven by the ClimEx ensemble to generate probabilistic discharge projections for future climate. ii) For water temperature (Tw) prediction, we developed an Entity-Aware LSTM (EA-LSTM) framework trained on observations from 44 Bavarian gauging stations, a subset of the 98 catchments constrained by Tw data availability, extended with nine French river basins to broaden the climatic gradient encountered during training. The EA-LSTM architecture explicitly separates static catchment attributes (elevation, slope, upstream river length) from dynamic meteorological forcings, using static features to parameterize the input gate rather than concatenating them at every timestep. This allows the network to learn site-specific temporal dynamics without overfitting individual locations.
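The entity-aware gating idea, with static attributes parameterizing the input gate while the other gates see the dynamic forcings, can be illustrated with a minimal PyTorch sketch; the single-layer cell and dimensions are illustrative assumptions, not the authors' implementation.

    import torch
    import torch.nn as nn

    class EALSTM(nn.Module):
        """Minimal entity-aware LSTM: statics control the input gate."""
        def __init__(self, dyn_size, stat_size, hidden):
            super().__init__()
            self.hidden = hidden
            self.input_gate = nn.Linear(stat_size, hidden)             # i from statics only
            self.dyn_gates = nn.Linear(dyn_size + hidden, 3 * hidden)  # f, o, g from dynamics

        def forward(self, x_dyn, x_stat):
            # x_dyn: (batch, time, dyn_size); x_stat: (batch, stat_size)
            b, t, _ = x_dyn.shape
            h = x_dyn.new_zeros(b, self.hidden)
            c = x_dyn.new_zeros(b, self.hidden)
            i = torch.sigmoid(self.input_gate(x_stat))  # fixed per catchment, not per timestep
            for step in range(t):
                z = self.dyn_gates(torch.cat([x_dyn[:, step], h], dim=1))
                f, o, g = z.chunk(3, dim=1)
                c = torch.sigmoid(f) * c + i * torch.tanh(g)
                h = torch.sigmoid(o) * torch.tanh(c)
            return h                                    # feed into a linear head for Tw

    # usage: EALSTM(dyn_size=5, stat_size=3, hidden=32)(torch.randn(4, 10, 5), torch.randn(4, 3))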

To enhance model interpretability, we apply explainable AI (XAI) techniques including permutation-based feature importance analysis. Results reveal that air temperature and radiation dominate Tw predictions overall, while topographic attributes gain importance under thermal extremes, indicating the model captures physically meaningful process controls. Additionally, robustness tests with perturbed static inputs confirm smooth performance degradation rather than abrupt collapse, suggesting the EA-LSTM learns generalizable attribute-response relationships rather than memorizing site identities.
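Permutation-based feature importance of the kind applied here is model-agnostic; a minimal numpy sketch is given below, measuring the RMSE increase when one input column at a time is shuffled. The predict argument can wrap any trained model.

    import numpy as np

    def permutation_importance(predict, X, y, n_repeats=10, seed=0):
        """Mean RMSE increase per feature when that column is shuffled."""
        rng = np.random.default_rng(seed)
        base = np.sqrt(np.mean((predict(X) - y) ** 2))
        scores = np.zeros(X.shape[1])
        for j in range(X.shape[1]):
            for _ in range(n_repeats):
                Xp = X.copy()
                rng.shuffle(Xp[:, j])          # break the feature-target association
                scores[j] += np.sqrt(np.mean((predict(Xp) - y) ** 2)) - base
        return scores / n_repeats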

Both cases demonstrate how combining diverse training data with ensemble-based climate projections enables more robust predictions of hydrological extremes under climate change, while XAI methods provide transparency into learned representations.

How to cite: Sasse, A., Ludwig, R., Weiß, J., and Schütz, K.: Combining LSTMs with a Single-Model Large Ensemble for Runoff and Water Temperature Projections in Bavaria, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-17916, https://doi.org/10.5194/egusphere-egu26-17916, 2026.

EGU26-18384 | Orals | HS3.6

Physics-constrained or physics-ignored? An entropy-based approach to diagnose if your hybrid model effectively skips conceptual constraints 

Anneli Guthke, Manuel Álvarez Chaves, Eduardo Acuna Espinoza, and Uwe Ehret

Despite the great success of deep learning models in many applications of hydrological prediction, they still face limitations in predicting extreme events or in generalizing to unseen conditions, which raises questions about their fidelity and applicability beyond purely operational purposes. Physics-informed hybrid modelling is often proposed as a way to instill interpretability and enable trustworthy data-driven predictions that are in agreement with theoretical knowledge. Yet, the community is still in search of best practices for how to construct physics-informed machine learning models – several “entry points” for physics knowledge exist, i.e., the loss function, the model inputs, or the architecture. Here, we focus on the latter, and on arguably the most “constrained” form of bringing physics into a hybrid model: a traditional, process-based (conceptual) hydrological model is combined with a data-driven component (here: a long short-term memory network, LSTM) that modifies its parameters over time, as learned by training on observed discharge values. For this apparently well-constrained scenario of hybrid modelling, we raise the question of whether it can faithfully be called “physics-constrained”, or whether the data-driven component is able to overwrite these constraints for the sake of increased performance.

To objectively address this question, we introduce an entropy-based method to quantify the “activity” of the data-driven component in acting against the conceptual constraints. This metric is complemented with a diagnostic workflow to better understand the internal functioning of the resulting, effective hybrid model structure in predicting discharge. Through didactic examples, inspired by real-world case studies, we present the method and build an intuition of what our entropy-based metric represents. Further, we discuss selected results from a large-sample case study on CAMELS-GB to illustrate the variety of findings and insights we gained: (1) Performance heavily relies on the data-driven component, and the physics constraints often even make the prediction problem harder instead of adding helpful information; (2) the data-driven component tends to overwrite the constrained architecture “silently”, but this can be detected with our proposed workflow; (3) even nonsensical-at-first-sight constraints can in fact increase performance, as the hybrid model is transformed into a new structure that is parsimonious and efficient; (4) claiming interpretability on the basis of prescribed constraints is risky at best – before calling a hybrid model of this type interpretable, we should carefully check what’s happening inside. Overall, these findings provide fundamental guidance towards (hybrid) model building and will help us find better ways to reconcile knowledge and information in data for trustworthy models.
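The abstract does not specify the estimator, so the following is only a plausible numpy sketch of the core idea: the Shannon entropy of a time-varying parameter trajectory as a proxy for how actively the data-driven component rewrites a nominally fixed conceptual parameter.

    import numpy as np

    def trajectory_entropy(theta_t, bins=20):
        """Shannon entropy (nats) of a parameter trajectory. A truly static
        parameter collapses into one bin (entropy 0); a parameter the LSTM
        rewrites freely spreads over many bins (high entropy)."""
        hist, _ = np.histogram(theta_t, bins=bins)
        p = hist / hist.sum()
        p = p[p > 0]
        return float(-(p * np.log(p)).sum())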

How to cite: Guthke, A., Álvarez Chaves, M., Acuna Espinoza, E., and Ehret, U.: Physics-constrained or physics-ignored? An entropy-based approach to diagnose if your hybrid model effectively skips conceptual constraints, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-18384, https://doi.org/10.5194/egusphere-egu26-18384, 2026.

Understanding how rainfall is transformed into streamflow is a cornerstone of hydrological science. Despite decades of progress, it remains uncertain whether physical or semi-empirical process equations formulated at the field scale can be transferred to the catchment scale without loss of realism. We hypothesize that this scale mismatch is a key reason why conventional conceptual/process-based models often fail to achieve simulation accuracy comparable to purely data-driven deep learning models. Motivated by ensemble rainfall–runoff analysis (ERRA), which suggests that streamflow can be expressed as a convolution between precipitation and a nonlinear catchment response function, we develop an LSTM-based framework to learn catchment-scale response functions for each hydrological process directly from data while retaining a physically consistent structure.

The proposed framework couples a generic bucket model architecture with an LSTM that acts as a nexus optimizer. Physical consistency is enforced through residual-style loss regularization, embedding mass-conservation constraints within the training objective. Within this setting, key processes, including canopy interception, infiltration, evapotranspiration, river routing, and groundwater recharge, emerge as extractable functions of meteorological forcing sequences rather than being prescribed a priori. We found that the learned catchment-scale response functions exhibit pronounced nonlinearity and memory effects. Our results further indicate that catchment-scale process representations effectively mix field-scale empirical relationships with the spatiotemporal heterogeneity of precipitation, and that the deformation of the response function from the field to the catchment scale is strongly driven by the spatial heterogeneity of precipitation intensity. By restructuring the learning pathway to reduce recurrent dependencies, the framework supports efficient parallel training while maintaining physical consistency. The approach aims to simultaneously simulate streamflow and induce catchment-scale response functions, offering a pathway to diagnose why conventional models fail and to advance process discovery via data-driven induction.
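The residual-style, mass-conservation-aware objective described above might look like the following PyTorch sketch; the budget form P - ET - Q - dS and the weight lam are illustrative assumptions.

    import torch

    def water_balance_loss(q_pred, q_obs, p, et, dstorage, lam=0.1):
        """Data misfit plus a penalty on the catchment water-budget residual."""
        misfit = torch.mean((q_pred - q_obs) ** 2)
        residual = p - et - q_pred - dstorage      # all terms in mm per timestep
        return misfit + lam * torch.mean(residual ** 2)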

How to cite: Liu, C.-Y. and Hsu, S.-Y.: Deep Learning as a Nexus Optimizer: Extracting Hydrological Response functions for Rainfall-Runoff Simulation, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-21276, https://doi.org/10.5194/egusphere-egu26-21276, 2026.

Small watersheds play a crucial role in sustaining river hydrology, ecological flows and local water security. However, they are increasingly threatened by climate change, rapid transformations of land use and escalating anthropogenic pressures. These problems are compounded in data-scarce areas, where few hydrological observations, sparse monitoring networks, and inconsistent long-term datasets hinder accurate vulnerability assessment and planning. To address this critical gap, this study introduces a unique and data-efficient Criteria Importance Through Intercriteria Correlation–Group Method of Data Handling (CRITIC-GMDH) hybrid framework, specifically developed to accurately assess watershed vulnerability in regions where large, continuous, or high-resolution datasets are unavailable. This interpretable decision-support approach integrates CRITIC for objective indicator weighting with the nonlinear modelling capability of the GMDH, enabling robust vulnerability prediction under constrained data conditions and overcoming key limitations of conventional hydrological models and black-box machine learning techniques. The framework incorporates eleven hydro-meteorological, geomorphological, and socio-economic parameters: rainfall, temperature, runoff, watershed area, watershed length, water quality index, average slope, forest area, impervious area, population density, and highest flood level. The approach is demonstrated across four major river basins in Northeast India, namely the Gomati, Haora, Khowai, and Manu, which represent highly sensitive and partially transboundary catchments. Future climate projections from CMIP6 SSP1-2.6 and SSP5-8.5 scenarios were used to compute the Vulnerability Index across decadal periods (2005–2065). Results show a significant escalation in vulnerability, particularly under SSP5-8.5, with Haora and Gomati exhibiting Vulnerability Index > 0.85, indicating extreme exposure to climate extremes and urbanization stress. Sensitivity analysis identifies rainfall, runoff, and temperature as dominant controlling parameters, and validation through the Falkenmark indicator and green-blue water stress indices confirms emerging scarcity risks. The study provides a scientifically grounded pathway for watershed prioritization and climate-resilient planning, offering an adaptable methodological foundation for sustainable management of small river systems in data-scarce regions.
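The CRITIC weighting step is compact enough to sketch directly; the decision matrix below is a hypothetical stand-in for the eleven normalized indicators.

    import numpy as np

    def critic_weights(X):
        """CRITIC: weight = contrast intensity x conflict with other criteria.
        X: (alternatives, criteria), all criteria oriented so larger = more
        vulnerable, and assumed non-constant so min-max normalization is defined."""
        Xn = (X - X.min(0)) / (X.max(0) - X.min(0))   # min-max normalize each criterion
        sigma = Xn.std(0, ddof=1)                     # contrast intensity
        R = np.corrcoef(Xn, rowvar=False)             # inter-criteria correlation
        C = sigma * (1.0 - R).sum(0)                  # information content
        return C / C.sum()

    w = critic_weights(np.random.default_rng(0).random((30, 11)))  # hypothetical matrix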

How to cite: Rudra Paul, A. and Kumar Roy, P.: Climate-Induced Vulnerability Assessment of Small Watersheds Using a CRITIC–GMDH Hybrid Model: A Methodology Tailored for Data-Scarce Regions, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-4175, https://doi.org/10.5194/egusphere-egu26-4175, 2026.

Urban roads in fast-growing cities deteriorate quickly, and the impacts are felt widely: traffic slows, accidents increase, and the local economy suffers. The traditional way of checking roads, sending inspectors out on foot, is no longer adequate: it is slow, expensive, and puts workers in harm’s way. We have therefore built an automated alternative that uses drones and AI to monitor road conditions.

The system works as follows. Drones fly over city streets, capturing high-resolution images that record everything from large potholes to fine cracks. These images are run through our analytics pipeline, which first applies classical machine learning to screen out stretches of road that are still in good shape, so the system does not waste effort on areas that need no attention.

Next, a deep learning model based on YOLO (“You Only Look Once”) detects and labels the actual problem spots. The model is trained on annotated drone photos, so it can handle difficult lighting and unusual road surfaces. It does not just spot the defects: it also determines where they are, how large they have grown, and how severe the damage is.
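A detection step of this kind could be run, for instance, with the ultralytics package; the weight file and image path below are placeholders, and a deployed system would use weights fine-tuned on the annotated drone imagery.

    from ultralytics import YOLO

    model = YOLO("yolov8n.pt")                     # pretrained checkpoint as a stand-in
    results = model.predict("drone_frame.jpg", conf=0.25)

    for box in results[0].boxes:                   # one entry per detected defect
        label = results[0].names[int(box.cls)]
        x1, y1, x2, y2 = box.xyxy[0].tolist()      # pixel bounding box
        print(label, float(box.conf), (x1, y1, x2, y2))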

Detecting problems is not enough, however; city agencies need to see this information and act on it quickly. We have therefore built a web portal using OpenLayers and PostGIS that maps every defect. Maintenance crews can sort issues by type or severity, pull up interactive maps, and generate reports to plan repairs.

The system is practical, affordable, and scales easily to any city that wants to take road maintenance seriously. By bringing together drones, AI, and smart mapping, it gives city managers the reliable, up-to-date data they need to keep roads safe and traffic moving, and it can support smarter decisions about road maintenance and urban development in any city.

How to cite: Manu, H. and Bhoopathi, S.: UAV-Based Road Defect Detection Using Hybrid Machine Learning Approach with Web GIS Visualization, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-6705, https://doi.org/10.5194/egusphere-egu26-6705, 2026.

EGU26-10741 | ECS | Orals | GI2.4

Machine Learning-Based Root-Zone Soil Moisture Estimation Using Satellite-Derived Surface Soil Moisture 

Siddaling Bakka and Sudardeva Narayanan

Root-zone Soil Moisture (RZSM; 10–102 cm) is a critical variable for land–atmosphere interactions, plant water availability, groundwater recharge, and hydrological extremes; however, its reliable estimation at deeper layers over large spatial scales remains challenging. Ground-based monitoring networks such as the International Soil Moisture Network (ISMN) provide accurate multi-depth soil moisture observations, but their utility is constrained by sparse station distribution, high installation and maintenance costs, and limited spatial coverage (Dorigo et al. 2011). In contrast, microwave remote-sensing satellite missions, including Soil Moisture Active Passive (SMAP), Soil Moisture and Ocean Salinity (SMOS), and Sentinel-1, offer frequent and spatially continuous SM observations but are sensitive only to near-surface conditions (top ~5 cm), leaving deeper soil layers unobserved. This disparity between depth-limited in-situ observations and surface-focused satellite measurements motivates the present study to develop a machine-learning-based framework to estimate RZSM from satellite-derived surface SM by incorporating temporal memory and forcing. This approach effectively captures persistence effects and vertical moisture transfer, which are essential for accurate prediction of deeper SM layers (Pal & Maity, 2019). Multi-depth SM observations from 5 to 102 cm, obtained from ISMN stations and categorized according to USDA Hydrologic Soil Groups (HSG A–D; four stations per HSG), are used to account for differences in soil water movement and retention behaviour (Ross et al. 2018). For each soil group, Support Vector Regression (SVR) and Random Forest (RF) models were trained using a sequential, depth-wise prediction strategy comprising four depth transitions: 5–10 cm, 10–20 cm, 20–51 cm, and 51–102 cm. Model evaluation demonstrates strong predictive performance across all depth intervals (R² = 0.85–0.95 for RF and 0.63–0.95 for SVR at validation sites), indicating that HSG classification effectively captures soil-specific SM dynamics. The trained models successfully generate comprehensive RZSM profiles using satellite-derived SM from the SMAP mission. These profiles are rigorously validated against ground-based observations and demonstrate strong applicability across diverse landscapes lacking direct subsurface measurements.
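The sequential, depth-wise strategy can be sketched as a cascade of regressors, each fed by the layer above; the data below are synthetic and the forcing columns are placeholders.

    import numpy as np
    from sklearn.ensemble import RandomForestRegressor

    rng = np.random.default_rng(0)
    n = 1000
    sm = {d: rng.uniform(0.05, 0.45, n) for d in (5, 10, 20, 51, 102)}  # synthetic multi-depth SM
    forcing = rng.random((n, 2))                   # e.g. precipitation and temperature anomalies

    # one model per depth transition, trained on the shallower layer plus forcing
    models = {}
    for src, dst in [(5, 10), (10, 20), (20, 51), (51, 102)]:
        X = np.column_stack([sm[src], forcing])
        models[dst] = RandomForestRegressor(n_estimators=200, random_state=0).fit(X, sm[dst])

    # inference cascade: satellite surface SM feeds 10 cm, whose output feeds 20 cm, ...
    profile = {5: rng.uniform(0.05, 0.45, 5)}      # satellite-derived surface SM stand-in
    f_new = rng.random((5, 2))
    for src, dst in [(5, 10), (10, 20), (20, 51), (51, 102)]:
        profile[dst] = models[dst].predict(np.column_stack([profile[src], f_new]))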

How to cite: Bakka, S. and Narayanan, S.: Machine Learning-Based Root-Zone Soil Moisture Estimation Using Satellite-Derived Surface Soil Moisture, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-10741, https://doi.org/10.5194/egusphere-egu26-10741, 2026.

EGU26-11280 | Orals | GI2.4

Combining different hydraulic methods to estimate the discharge from Combined Sewer Overflows (CSO) into streams. 

Michael Robdrup Rasmussen, Mathias Ulsted Jackerott, Janni Mosekær Nielsen, Ida Kemppinen Vestergaard, and Jesper Ellerbæk Nielsen

Combined Sewer Overflows (CSOs) in cities can play a significant role in the morphology and hydraulic performance of streams near urban areas. A complete urban drainage system is often modelled by a dedicated hydrological/hydraulic model (e.g., SWMM or Mike+). However, these models must be calibrated against observations, and the flow from CSOs is especially difficult to estimate. The quality of the data and the drainage models depends on the accuracy of the overall mass balance of the drainage system. If the discharge from, for example, a CSO cannot be estimated, the results from other parts of the system become unreliable.

This research evaluates flow dynamics through a multi-methodological approach in which the CSO is evaluated by theoretical models, CFD models, experimental work in a laboratory, and a novel method in which the noise from a CSO is analyzed. The sound is analyzed both directly and by training a machine learning model on the laboratory experiments. The result is a hybrid model that filters all the estimates into a single flow estimate. CFD has been used to model the specific CSO to take its Q-h relationship into account and to generate a so-called catalog method: multiple variations of geometry are simulated in a free-surface CFD model to cover many different geometries, and general equations are extracted from these simulations.
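The catalog idea, distilling CFD runs into general equations, can be illustrated by fitting a weir-style rating Q = C * h^n to simulated points; the numbers below are synthetic, not the project's CFD results.

    import numpy as np

    h = np.array([0.05, 0.10, 0.20, 0.35, 0.50])        # overflow head [m]
    q = np.array([0.004, 0.012, 0.035, 0.085, 0.150])   # CFD-simulated discharge [m3/s]

    n, log_c = np.polyfit(np.log(h), np.log(q), 1)      # linear fit in log-log space
    c = float(np.exp(log_c))
    estimate = lambda head: c * head ** n               # rating equation for this geometry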

The hybrid approach opens the door to a new way of estimating interactions between the urban water cycle and the receiving waters. Applying edge processing makes it possible to continuously adapt to local conditions that were not present during the calibration and validation of the model. Edge processing involves signal processing and modeling at the measuring point, where the maximum bandwidth of the sensor data is available and can be used for the most accurate data estimation.

How to cite: Rasmussen, M. R., Jackerott, M. U., Nielsen, J. M., Vestergaard, I. K., and Nielsen, J. E.: Combining different hydraulic methods to estimate the discharge from Combined Sewer Overflows (CSO) into streams., EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-11280, https://doi.org/10.5194/egusphere-egu26-11280, 2026.

EGU26-12001 | Orals | GI2.4

Evaluating Climate Change Impacts and Adaptation Options for Paddy Yield Using Data-Curated Modelling in Goa, India 

Ankit Balvanshi, Jayakumar Kv, and Venkappayya r Desai

This study investigates the coastal-region impacts of climate change on rice yield in Goa, India, a monsoon-driven agroecosystem highly dependent on paddy cultivation and vulnerable to rainfall variability, salinity intrusion, and rising temperatures. The study aims to (i) estimate future crop evapotranspiration (ETc) and rice yield projections under different Shared Socioeconomic Pathways (SSP1-2.6, SSP2-4.5, and SSP5-8.5), and (ii) assess the effectiveness of adjusting planting dates, along with the integration of drought-resilient cultivars, alternate wetting and drying (AWD) irrigation, and soil management practices, as adaptation strategies to mitigate yield reductions. To achieve these objectives, the CropWat and AquaCrop models were employed, using statistically downscaled CMIP6 CESM2 climate data.

The AquaCrop model was calibrated using data from 1994 to 2004 and validated for the period 2005–2014, demonstrating strong performance metrics (Nash–Sutcliffe Efficiency = 0.86, RMSE = 278.5, r² = 0.93). Our findings indicate that projected climatic changes pose a significant threat to rice yield stability in the region. Rising temperatures and shifting monsoon patterns are expected to elevate evapotranspiration demand by 10–14%, thereby intensifying irrigation requirements even in high-rainfall areas.

In response, adjusting planting dates emerged as a promising adaptation strategy. Specifically, delaying planting by 5 days until 2070 and by 10 days from 2071 to 2099 significantly mitigated yield declines across all SSP scenarios. An optimum 10-day delay in planting was found to recover up to 17% of yield losses under SSP1-2.6 and SSP2-4.5. Furthermore, compound strategies, including drought-tolerant rice cultivars, AWD irrigation, and improved soil management, provided up to 25% additional yield gains. These integrated approaches not only improved crop water productivity but also stabilized yields under moderate emission pathways. However, under the high-emission SSP5-8.5 scenario, yield reductions remained substantial (up to 20%) due to increased temperature stress and shortened grain-filling duration, underscoring the limits of adaptation under extreme climate conditions.

The results highlight the importance of temporally optimized sowing schedules, integrated irrigation management, and improved soil practices for enhancing the resilience of coastal rice systems. This study further demonstrates that reliable data curation, model calibration, and parameter selection are essential to improving predictive accuracy in agro-hydrologic modelling. The findings emphasize the need for consistent methodological frameworks that couple climate projections with process-based crop models to assess adaptation effectiveness under uncertain future conditions.

Overall, the study provides actionable insights for strengthening the accuracy and reliability of water- and climate-based agricultural modelling frameworks. The outcomes contribute to developing climate-resilient strategies for paddy cultivation in coastal India, reinforcing the broader understanding of model validation, uncertainty reduction, and data-driven adaptation in hydrologic and agricultural research.

How to cite: Balvanshi, A., Kv, J., and Desai, V. R.: Evaluating Climate Change Impacts and Adaptation Options for Paddy Yield Using Data-Curated Modelling in Goa, India, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-12001, https://doi.org/10.5194/egusphere-egu26-12001, 2026.

EGU26-12042 | ECS | Posters on site | GI2.4

Strategies for spatial leave-one-out cross-validation 

Cristina Olimpia Chavez Chong, Cécile Hardouin, and Ana Karina Fermin Rodriguez

The purpose of the talk is to discuss spatially adapted cross-validation methods that maintain sufficient separation between training and validation sets, thus providing more accurate estimates of model risk. We begin by reviewing various spatial cross-validation techniques, including spatial blocked cross-validation and spatial leave-one-out, under scenarios of low to strong spatial dependence. We then propose a practical framework for determining an optimal “buffer size” for spatial leave-one-out that reduces autocorrelation between training and validation subsets. This framework is further enhanced by a parametric bootstrap approach designed to approximate the true risk in single-realization settings. Simulation experiments confirm that these methods effectively capture the underlying spatial structure, leading to more reliable risk estimation.
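A buffered spatial leave-one-out split of the kind discussed can be sketched in a few lines; the buffer is expressed in the units of the coordinates, and the generator plugs into any scikit-learn-style fitting loop.

    import numpy as np

    def buffered_spatial_loo(coords, buffer):
        """For each held-out point, drop all training points closer than
        `buffer`, reducing autocorrelation leakage into the training set."""
        coords = np.asarray(coords, float)
        for i in range(len(coords)):
            dist = np.linalg.norm(coords - coords[i], axis=1)
            yield np.flatnonzero(dist > buffer), np.array([i])

    # e.g.: for train_idx, test_idx in buffered_spatial_loo(xy, buffer=5_000.0): ...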

How to cite: Chavez Chong, C. O., Hardouin, C., and Fermin Rodriguez, A. K.: Strategies for spatial leave-one-out cross-validation, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-12042, https://doi.org/10.5194/egusphere-egu26-12042, 2026.

EGU26-13261 | ECS | Orals | GI2.4

Flow Modulation and Wave Impact Reduction by Retreated Crown Walls in Vertical Breakwaters 

Shaik Firoj and Mohammad Saud Afzal

This study investigates wave-induced flow behaviour around vertical breakwaters with retreated crown wall using numerical simulations. Previous experimental work has shown that moving the crown wall landward can reduce wave forces, moments, and overtopping. However, the associated flow mechanisms near the wall and trunk region have not been examined in detail. In this work, the open-source CFD model REEF3D is used to simulate regular wave interaction for crown wall retreat configuration. The model solves the Reynolds-averaged Navier–Stokes equations, with a level set method for free-surface tracking and a k–ω turbulence closure. The numerical results are first validated against published experimental data to ensure accuracy. The simulations provide detailed information on velocity fields, vortex formation, and flow separation during wave impact and overtopping. The results show that retreating the crown wall modifies the local flow structure, leading to a redistribution of momentum and a reduction in direct wave impact on the wall. These findings help to clarify the hydrodynamic role of retreated crown wall in vertical breakwater design.

How to cite: Firoj, S. and Afzal, M. S.: Flow Modulation and Wave Impact Reduction by Retreated Crown Walls in Vertical Breakwaters, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-13261, https://doi.org/10.5194/egusphere-egu26-13261, 2026.

Watershed hydrodynamics is governed by various hydrological flow processes that occur at different spatiotemporal scales. Most hydrological models couple the surface flow solver with the standard empirical infiltration models for flood propagation modeling. However, the empirical infiltration models are not applicable for heterogeneous and anisotropic soils and shallow groundwater tables, which are most vulnerable to waterlogging problems. Hence, simultaneous and integrated modeling of the surface and subsurface flow processes is essential for the continuous monitoring of watershed hydrodynamics. A physically based unified multi-region, multi-process watershed model integrates the various hydrological flow components in different regions through unique coupling mechanisms at the interfaces. The current work presents a Finite Volume (FV) method-based watershed flow model developed using the OpenFOAM® framework [1]. The developed model framework utilizes the ‘multi-region’ structure from the OpenFOAM® library to integrate the OpenFOAM®-based solvers for the individual processes of surface overland flow [2,3] and saturated-unsaturated subsurface flow [4] through the imposition of appropriate interface boundary conditions or addition of source/sink terms at the interfaces of the flow regions. The surface flow component is modeled using the diffusive wave or the zero-inertia (ZI) approximation of the two-dimensional (2D) depth-averaged shallow water equations (SWE). On the other hand, the flow through the variably saturated subsurface media is modeled using the ‘mixed form’ of the 3D modified Richards Equation. The flux exchange between the surface and subsurface regions (infiltration or exfiltration rate) is modeled using a switching algorithm to impose the boundary condition on the interface between the two regions. The algorithm changes the interface to a Dirichlet or a Neumann type boundary condition based on the rainfall intensity and the saturated hydraulic conductivity of the ground surface. A stabilized and adaptive time-stepping algorithm has been implemented to ensure smooth convergence of the iterative technique used for linearizing the nonlinear governing equations. The developed model is equipped with parallelization strategies to be run on multi-core processors, which is essential for increased computational efficiency while solving regional-scale watershed flow problems. The developed watershed model has been verified and validated against the standard benchmark problems on saturation excess and infiltration excess from the literature. Moreover, the applicability of the developed model has been extended to solve complex hydrological problems on exfiltration occurring over natural catchments, yielding satisfactory results.
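A highly simplified reading of the described interface-switching algorithm is sketched below in Python for orientation only; the actual model is implemented in OpenFOAM®'s C++ framework and handles the full coupled solution.

    def interface_condition(rain_rate, k_sat, ponded_head=0.0):
        """While the surface can absorb the rain (rate <= saturated hydraulic
        conductivity), impose the rain rate as a Neumann flux; once it cannot,
        switch to a Dirichlet head at the surface."""
        if rain_rate <= k_sat:
            return "neumann", rain_rate        # flux-type interface condition
        return "dirichlet", ponded_head        # head-type interface condition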

References

[1] Jasak, H., A. Jemcov, Z. Tukovic. (2007). OpenFOAM: A C++ library for complex physics simulations. In Vol. 1000 of Proc., Int. Workshop on Coupled Methods in Numerical Dynamics, 1–20. Dubrovnik, Croatia: Inter-University Center.

[2] Dey, S., Dhar, A. (2024). Applicability of Zero-Inertia Approximation for Overland Flow Using a Generalized Mass-Conservative Implicit Finite Volume Framework. Journal of Hydrologic Engineering, 29(1), 04023042.

[3] Dey, S. (2025). zeroInertiaFlowFOAM – an OpenFOAM®-based computationally efficient, mass-conservative, implicit zero-inertia flow model for flood inundation problems on collocated grid-systems (No. EGU25-17402). Copernicus Meetings.

[4] Dey, S., & Dhar, A. (2022). Generalized mass-conservative finite volume framework for unified saturated–unsaturated subsurface flow. Journal of Hydrology, 605, 127309.

How to cite: Dey, S. and Dhar, A.: An OpenFOAM®-based coupled surface-subsurface flow model for simulating watershed hydrodynamics, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-13540, https://doi.org/10.5194/egusphere-egu26-13540, 2026.

EGU26-16374 | ECS | Orals | GI2.4 | Highlight

Reorganization of Heatwave Day Regimes across India under Recent and Near Future Warming 

Srikanth Bhoopathi and Manali Pal

Heatwaves are among the most rapidly intensifying climate extremes over India, yet their evolving spatial characteristics under recent and near future climate change remain inadequately quantified. This study examines the spatio-temporal variability of Heatwave Days (HWDs) across India using daily maximum temperature from the India Meteorological Department (IMD) gridded dataset for the historical period 1975-2024 and extends the analysis to the near future (2025-2044) using CMIP6 climate projections. Heatwave days are identified at each grid point using a calendar-day-based percentile approach, where daily maximum temperature exceeding the local 95th percentile threshold for the same calendar day, computed over a fixed reference period of 1981-2010, is classified as a heatwave day. Grid-wise cumulative and decadal HWDs are analysed to assess long-term exposure and spatial redistribution. To objectively identify dominant heatwave regimes, Self-Organizing Maps (SOMs) are employed using multiple HWD metrics, enabling classification of regions with distinct heatwave characteristics and temporal evolution. Observational results indicate a clear reorganization of heatwave patterns over India. During the late 20th century (1975-1994), HWD accumulation is largely limited to north-western and parts of central India, typically ranging from 26 to 50 days per decade, with most eastern and peninsular regions experiencing fewer than 25 HWDs. From the mid-1990s onward, a pronounced intensification and spatial expansion is evident. By 2005-2014, large parts of central and eastern India exhibit decadal HWDs in the range of 51 to 100 days. The most recent decade (2015-2024) shows widespread moderate to high HWD accumulation across the country, with several regions of central, eastern, and peninsular India experiencing 101 to 150 HWDs, and localized hotspots exceeding 150 days per decade. Future HWDs for 2025-2044 are derived from daily maximum temperature projections of the MPI-ESM1-2-HR model under the SSP2-4.5 scenario. The near-future decadal projections (2025-2034 and 2035-2044) indicate a continued intensification and spatial expansion of HWDs, with extensive areas of north-western, central, and peninsular India experiencing 151 to 250 HWDs per decade, and emerging hotspots exceeding 250 to 350 days, particularly over parts of north-western and southern India. Eastern India also shows a marked transition toward higher HWD classes, indicating increasing regional vulnerability. Overall, the combined observational and CMIP6-based analysis demonstrates a transition toward widespread and persistent heatwave exposure across India in both recent decades and the near future. The integration of a grid-specific, calendar-day-based percentile definition with SOM-based classification provides a robust framework for identifying evolving heatwave regimes and supports improved heat risk assessment, climate adaptation planning, and early warning strategies under continued warming.
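The calendar-day percentile definition can be sketched as follows with numpy; real implementations typically pool a window of days around each calendar day when computing the threshold, which is omitted here for brevity.

    import numpy as np

    def heatwave_days(tmax, doy, ref_mask, q=95):
        """Count days on which tmax exceeds its calendar-day q-th percentile,
        with thresholds computed over a reference period (boolean ref_mask)."""
        thr = {d: np.percentile(tmax[ref_mask & (doy == d)], q) for d in np.unique(doy)}
        exceed = tmax > np.array([thr[d] for d in doy])
        return int(exceed.sum())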

How to cite: Bhoopathi, S. and Pal, M.: Reorganization of Heatwave Day Regimes across India under Recent and Near Future Warming, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-16374, https://doi.org/10.5194/egusphere-egu26-16374, 2026.

Considering the dearth of gauge-based rainfall observations at desirable resolution, it becomes immensely challenging to quantify and monitor droughts, especially over developing countries. This can be circumvented by utilizing high-resolution open-access rainfall products. This study aims to assess the spatiotemporal variation of meteorological droughts over the Bundelkhand region, India. The multi-source weighted-ensemble precipitation (MSWEP), a blended product of global gauge-based, satellite-based and reanalysis precipitation datasets, is utilized for a period of 44 years (1980-2023). The MSWEP rainfall is bias-corrected with respect to the India Meteorological Department (IMD) gridded observation dataset for the 14 districts in the region. Using the corrected rainfall product, the droughts over each district are characterized by the Standardized Precipitation Index (SPI) at three different timescales, i.e., SPI-3, SPI-6 and SPI-12 are used to model short-term, intermediate-term and long-term droughts, respectively. A drought severity index (DSI) is proposed considering the probability of droughts in different severity classes (i.e., near-normal, moderate, severe and extreme). Further, the trend analysis of SPI at different timescales is carried out using the Modified Mann-Kendall (MMK) test. The results reveal the raw MSWEP dataset’s difficulty in capturing higher quantiles, which affects the probabilistic distribution used for quantifying drought events. However, the bias-corrected MSWEP product showed an excellent match with the IMD gridded data, thereby substantiating its applicability over the Bundelkhand region. The region is found to be prone to droughts with an increasing trend of dryness. The novel DSI approach is found to distinguish drought severity levels at the district scale, which can be helpful for drought planning and management. Overall, this study provides critical insights into drought characterization using state-of-the-art datasets and innovative approaches, which can also be extended to other drought-prone regions of the world.
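For orientation, the SPI computation follows a standard recipe; the sketch below, assuming scipy, fits a gamma distribution to aggregated precipitation (with a mixed treatment of zero months) and maps cumulative probabilities to standard-normal quantiles. SPI-3/6/12 follow by first forming 3-, 6- and 12-month rolling sums.

    import numpy as np
    from scipy import stats

    def spi(precip_agg):
        """Standardized Precipitation Index from aggregated precipitation."""
        x = np.asarray(precip_agg, float)
        a, loc, scale = stats.gamma.fit(x[x > 0], floc=0)   # gamma fit to wet months
        p_zero = float((x == 0).mean())                     # probability mass at zero
        cdf = p_zero + (1 - p_zero) * stats.gamma.cdf(x, a, loc=loc, scale=scale)
        return stats.norm.ppf(np.clip(cdf, 1e-6, 1 - 1e-6))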


Keywords: Bias-correction; Bundelkhand; DSI; MSWEP; MMK; SPI

How to cite: Swain, Dr. S.: A statistical approach of mapping drought severity using bias-corrected blended dataset over a semi-arid region, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-18814, https://doi.org/10.5194/egusphere-egu26-18814, 2026.

EGU26-18906 | Orals | GI2.4

Passive Acoustic Characterization of Marine Bedload Transport Based on Interparticle Collision Dynamics 

Debasish Dutta, Armelle Jarno, Hugues Besnard, Bruno Morvan, and Francois Marin

Marine sediments are essential for keeping the coast stable and protecting the shoreline naturally. However, anthropogenic activities can greatly change how sediment moves, making accurate monitoring essential. In marine settings, understanding bedload sediment transport is challenging because conventional methods relying on visual observations or direct sediment sampling tend to be intrusive, spatially constrained, and inadequate for long-term or continuous monitoring. In this situation, passive underwater acoustics is a promising non-intrusive option that can provide continuous monitoring with high temporal resolution. This study investigates the acoustic signatures related to marine bedload transport, focusing particularly on the sounds generated by interparticle collisions of mobile sediments. A series of controlled laboratory experiments is performed using simplified experimental arrangements in which artificial sediments (spherical glass beads) are mobilised under oscillatory motion that simulates wave-induced seabed forcing. We use glass beads of different sizes to create idealised bedload conditions, and an oscillating plate to control the movement of the particles. Hydrophones placed close to the sediment bed record acoustic pressure signals. The recorded acoustic signals are analyzed in both the time and frequency domains. Individual particle impacts are characterised by short transient acoustic events, and spectral analyses show clear peak frequencies that are linked to sediment motion. The results indicate that the peak frequency of the acoustic spectrum is predominantly determined by particle diameter and is additionally influenced by the amplitude and frequency of the applied oscillatory motion. These observations align with theoretical models, such as those suggested by Thorne (1985), that explain the generation of pressure waves during underwater particle collisions. To further explore the mechanisms of sound generation, experiments are conducted with both smooth and rough beds below the bead layers. The analysis reveals the existence of sediment-specific acoustic signatures, facilitating the differentiation of particle sizes according to their spectral characteristics. This study illustrates the significant potential of passive acoustic methods for the remote monitoring of marine bedload transport. The study offers novel insights into sound generation mechanisms linked to sediment motion across various particle sizes, motion amplitudes, and bed configurations, utilising a combination of laboratory experiments, theoretical frameworks, and comprehensive spectral analysis, with direct implications for complex coastal and offshore environments.
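The spectral analysis step can be illustrated with scipy's Welch estimator; the sampling rate and the 12 kHz synthetic tone below are placeholders for a recorded hydrophone signal.

    import numpy as np
    from scipy import signal

    fs = 96_000                                    # assumed hydrophone sampling rate [Hz]
    rng = np.random.default_rng(0)
    t = np.arange(fs) / fs                         # one second of synthetic signal
    x = np.sin(2 * np.pi * 12_000 * t) + 0.3 * rng.standard_normal(fs)

    f, pxx = signal.welch(x, fs=fs, nperseg=4096)  # averaged power spectral density
    peak_hz = float(f[np.argmax(pxx)])             # dominant collision frequency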

How to cite: Dutta, D., Jarno, A., Besnard, H., Morvan, B., and Marin, F.: Passive Acoustic Characterization of Marine Bedload Transport Based on Interparticle Collision Dynamics, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-18906, https://doi.org/10.5194/egusphere-egu26-18906, 2026.

EGU26-95 | ECS | Orals | HS2.1.3

Identification of Groundwater Potential Zones Using GIS and Multi-Criteria Decision-Making Techniques: A Case Study of the Shabelle River Basin (Somalia) 

Ismail Mohamoud Ali Alasow, Mahad Abdullahi Hussein, Sanjay Kumar Tiwari, and Rajeev Bhatla

In addition to supplying the water that people need daily, groundwater also affects agricultural practices, preserves natural balance, and promotes industrial development. The 108,300 km2 Shabelle River Basin served as the site of the current study. Monitoring, evaluating, and conserving groundwater supplies for water resource management and development is made possible by the effective integration of remote sensing data and GIS in hydro-geological research. The groundwater potential zones of the Shabelle basin area were delineated by combining seven thematic layers (geology, land use/land cover, drainage density, slope, lineament density, rainfall distribution, and soil) in a GIS platform using the spatial analyst tool in ArcGIS 10.8. The analytical hierarchy process (AHP) technique is used to derive the weights for each parameter and its sub-parameters based on the relative importance of the influencing elements for groundwater recharge. Four groups were identified on the final groundwater potential zonation map of the study area: low potential zones of 1,548.7 km2 (1.43%), moderate potential zones of 25,786.23 km2 (23.81%), high potential zones of 22,353.12 km2 (20.64%), and very high potential zones of 55,341.3 km2 (54.10%). High and very high groundwater potential zones thus dominate, covering about 75% of the entire studied region. These zones are found in the basin's northern and central regions, where low slopes, fractured geological formations, and porous soils are present. In contrast, owing to their steep slopes, resistant geological formations, and low rainfall, the southern and southwestern regions of the basin have low potential. When well data were used to validate the map, there was a high degree of agreement between predicted and observed well performance. The Shabelle River Basin's water management policies, effective use of natural resources, physical planning, and sustainable groundwater development should all benefit greatly from the findings, particularly as the adverse effects of climate change on human life become more pronounced. The methodology can be applied anywhere else in the world, and the findings of this study can inform future research on agriculture, basin management, sustainable groundwater, and the interaction between groundwater and climate change.
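The AHP weighting used here follows Saaty's principal-eigenvector recipe, which is compact enough to sketch; the pairwise matrix in the example is hypothetical.

    import numpy as np

    def ahp_weights(pairwise):
        """Weights = principal eigenvector of the pairwise comparison matrix;
        the consistency ratio (CR) checks the coherence of the judgments."""
        a = np.asarray(pairwise, float)
        vals, vecs = np.linalg.eig(a)
        k = int(np.argmax(vals.real))
        w = np.abs(vecs[:, k].real)
        w /= w.sum()
        n = a.shape[0]
        ci = (vals.real[k] - n) / (n - 1)                   # consistency index
        ri = {3: 0.58, 4: 0.90, 5: 1.12, 6: 1.24, 7: 1.32}  # Saaty's random index
        return w, ci / ri.get(n, 1.45)

    w, cr = ahp_weights([[1, 3, 5], [1/3, 1, 3], [1/5, 1/3, 1]])  # CR should stay below 0.1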


How to cite: Alasow, I. M. A., Hussein, M. A., Tiwari, S. K., and Bhatla, R.: Identification of Groundwater Potential Zones Using GIS and Multi-Criteria Decision-Making Techniques: A Case Study of the Shabelle River Basin (Somalia), EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-95, https://doi.org/10.5194/egusphere-egu26-95, 2026.

EGU26-841 | ECS | Orals | HS2.1.3

Transfer Learning for Hydrological Modelling and XAI-Based Physical Consistency Assessment in Reconstructing Streamflow Time Series in Data-Scarce Regions 

André Rodrigues, Tais Maia, Matheus Macedo, Rodrigo Perdigão, Julian Eleutério, and Bruno Brentan

Accurate streamflow monitoring is essential for water resources management, yet many Brazilian watersheds lack sufficiently long historical records to support effective decision-making. This challenge is particularly critical in the Metropolitan Region of Belo Horizonte (RMBH), which depends on major reservoirs located within its territory – such as Rio Manso, Serra Azul, Vargem das Flores, and the Ibirité (REGAP) reservoir – for industrial and domestic water supply. Several of these strategic systems suffer from limited or inconsistent hydrological monitoring, complicating operational planning and increasing the risk of water shortages and of compromising reservoir outflow capacity. Transfer Learning (TL) with Long Short-Term Memory (LSTM) networks emerges as a promising strategy to overcome this limitation, enabling the development of hydrological models in watersheds with little or no historical data. This study investigates the application of TL to enhance daily streamflow prediction in data-scarce basins of the RMBH, while assessing the optimal length of local streamflow records required to improve hydrological modelling through fine-tuning of a regional TL model. For this purpose, 23 watersheds with similar hydrological behaviour and geomorphological characteristics were selected in the RMBH to evaluate the feasibility of reconstructing streamflow time series in data-scarce regions. Satellite-derived products and reanalysis datasets were employed as inputs to overcome limitations in hydrometeorological data availability. Furthermore, eXplainable Artificial Intelligence (XAI) methods are employed to explore the physical feasibility of knowledge transfer, with the potential to identify which watershed attributes – such as drainage area, elevation, soil-moisture dynamics, land-use composition, and climatic seasonality – most strongly influence whether hydrological behaviour learned in source basins can be meaningfully transferred to target basins. Significant performance gains can be achieved with only one to two years of local data, allowing accurate models to be developed rapidly even in newly monitored watersheds. This considerably improves decision-making in data-scarce regions, particularly those affected by water conflicts. XAI analyses confirmed the physical soundness of the predictions, supporting more reliable streamflow reconstruction. However, further methodological improvements are required, as some watersheds were unable to benefit from transfer learning. Overall, TL represents a powerful direction for streamflow modelling in regions with limited monitoring, while XAI provides a framework to understand the physical consistency of the transferred knowledge and to determine the minimum monitoring effort required to build reliable local models.
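The fine-tuning setup described, a regional model adapted with one to two years of local records, commonly amounts to freezing the pretrained recurrent layers and retraining the output head; the PyTorch sketch below is illustrative, with hypothetical dimensions and checkpoint name.

    import torch
    import torch.nn as nn

    class RunoffLSTM(nn.Module):
        """Illustrative regional model: LSTM over meteorological forcings."""
        def __init__(self, n_inputs=8, hidden=64):
            super().__init__()
            self.lstm = nn.LSTM(n_inputs, hidden, batch_first=True)
            self.head = nn.Linear(hidden, 1)

        def forward(self, x):                      # x: (batch, time, n_inputs)
            out, _ = self.lstm(x)
            return self.head(out[:, -1])

    model = RunoffLSTM()
    # model.load_state_dict(torch.load("regional_model.pt"))  # hypothetical checkpoint
    for p in model.lstm.parameters():              # freeze the regionally learned dynamics
        p.requires_grad = False
    optimizer = torch.optim.Adam(model.head.parameters(), lr=1e-3)
    # ...then fine-tune the head on the short local record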

How to cite: Rodrigues, A., Maia, T., Macedo, M., Perdigão, R., Eleutério, J., and Brentan, B.: Transfer Learning for Hydrological Modelling and XAI-Based Physical Consistency Assessment in Reconstructing Streamflow Time Series in Data-Scarce Regions, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-841, https://doi.org/10.5194/egusphere-egu26-841, 2026.

Physics-Informed Neural Networks (PINNs) offer a promising framework for groundwater modeling in regions where hydrogeological data are limited. However, their performance depends significantly on the choice of constraint weights associated with governing equations and derivative-based regularizations. In this study, we develop a constraint-weight selection strategy for PINNs to simulate groundwater head dynamics in data-sparse environments where aquifer properties such as hydraulic conductivity (K) and specific yield/storativity (S) are unavailable. The proposed formulation incorporates first-, second-, and third-order spatial and temporal derivatives of hydraulic head and aquifer properties into the PINN loss function, enabling the model to capture fine-scale spatiotemporal variations without explicit knowledge of subsurface parameters. The approach is applied to a small section of the Varuna River Basin, using groundwater-level observations collected from 37 monitoring stations between 2022 and 2024. The dataset contains several missing values that the PINN framework handles seamlessly, unlike conventional simulation models such as MODFLOW, which require complete and continuous input fields for stable execution. An iterative optimization scheme is employed to balance data fidelity, physical constraints, and derivative-based regularization during training. The proposed method achieves a training R² of 0.986 and a testing R² of 0.947, with corresponding RMSE values of 0.721 and 1.416 meters, respectively. These results demonstrate that adaptive constraint weighting significantly improves prediction accuracy, robustness, and convergence compared to fixed-weight PINN formulations. Overall, the study highlights the potential of derivative-enhanced PINNs for groundwater modeling in data-sparse aquifers and provides a generalized framework for physics-guided learning under missing or incomplete observations.
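A weighted, derivative-regularized PINN objective of the kind described could look like the sketch below; the aggregate derivative penalties and the weights w are illustrative assumptions, not the authors' formulation.

    import torch

    def pinn_loss(model, xyt_obs, h_obs, xyt_col, w=(1.0, 0.1, 0.01)):
        """Observation misfit plus derivative-based regularizers at collocation
        points; model maps (x, y, t) coordinates to hydraulic head."""
        data = torch.mean((model(xyt_obs) - h_obs) ** 2)

        xyt = xyt_col.clone().requires_grad_(True)
        h = model(xyt)
        g1 = torch.autograd.grad(h.sum(), xyt, create_graph=True)[0]   # dh/d(x,y,t)
        g2 = torch.autograd.grad(g1.sum(), xyt, create_graph=True)[0]  # aggregated 2nd derivatives
        return w[0] * data + w[1] * g1.pow(2).mean() + w[2] * g2.pow(2).mean()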

How to cite: Bajpai, M., Gaur, S., and Singh, K.: Derivative-Enhanced Constraint Weights for PINNs in Groundwater Flow Modeling Under Unknown Aquifer Properties, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-896, https://doi.org/10.5194/egusphere-egu26-896, 2026.

EGU26-1536 | Posters on site | HS2.1.3

Large-scale streamflow regionalization in ungauged West African catchments: How do classical and deep learning approaches compare? 

Yves Tramblay, Serigne Bassirou Diop, Fadilath Kate, Issam Souassi, Bastien Dieppois, Ansoumana Bodian, Joris Guerin, Renaud Hostache, Anne Johannet, Frederik Kratzert, Ludovic Oudin, Vianney Sivelle, and Kalil Traoré

In West Africa, limited access to hydrometric data remains a major challenge for advancing surface water research and improving water management. Since the early 1980s, many gauging stations have been decommissioned, leaving gaps in reliable streamflow records across numerous catchments. Parameter regionalization of hydrological models is commonly employed to enable runoff prediction in ungauged catchments. This study presents an assessment of rainfall-runoff model regionalization across West Africa. We used an unprecedented dataset of 189 near-natural catchments to compare two contrasting approaches: (i) a benchmark conceptual modeling framework using the GR4J model, regionalized with three parameter-transfer techniques (spatial proximity, physiographic similarity, and Random Forest), and (ii) a data-driven framework based on Long Short-Term Memory (LSTM) neural networks. Using a leave-one-out resampling approach, regionalization approaches were evaluated using different performance metrics: (i) the Kling-Gupta Efficiency (KGE), calculated between simulated and observed streamflows, (ii) the relative bias (rBias) on several hydrological signatures computed with observed or simulated discharge and (iii) the difference between observed and simulated flood quantiles. Results show that the conceptual modeling approach with traditional parameter-transfer techniques consistently underperforms compared to the LSTM, failing to reproduce key hydrological signatures. In contrast, the LSTM model showed better generalization performance, accurately simulating streamflow with a median KGE of 0.67 and reliably capturing hydrological signatures and flood quantiles across West Africa’s diverse climates and landscapes with lower biases. These findings highlight the potential of data-driven approaches to enhance hydrological prediction in data-scarce regions, supporting more effective flood risk management and water resource planning.
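For reference, the KGE used as the headline metric combines correlation, variability ratio and bias ratio; a perfect simulation scores 1.

    import numpy as np

    def kge(sim, obs):
        """Kling-Gupta Efficiency of simulated vs. observed streamflow."""
        sim, obs = np.asarray(sim, float), np.asarray(obs, float)
        r = np.corrcoef(sim, obs)[0, 1]            # linear correlation
        alpha = sim.std() / obs.std()              # variability ratio
        beta = sim.mean() / obs.mean()             # bias ratio
        return 1.0 - np.sqrt((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2)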

How to cite: Tramblay, Y., Diop, S. B., Kate, F., Souassi, I., Dieppois, B., Bodian, A., Guerin, J., Hostache, R., Johannet, A., Kratzert, F., Oudin, L., Sivelle, V., and Traoré, K.: Large-scale streamflow regionalization in ungauged West African catchments: How do classical and deep learning approaches compare?, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-1536, https://doi.org/10.5194/egusphere-egu26-1536, 2026.

EGU26-4669 | ECS | Orals | HS2.1.3

Comparison of expert-knowledge and machine learning approaches for mapping groundwater-dependent ecosystems in a regional setting in Central Mexico 

A. Camila Salgado-Albiter, Selene Olea-Olea, Nelly L. Ramírez-Serrato, Eric Morales-Casique, Lorena Ramírez-González, and Aurora G. Llanos-Solis

Intensive groundwater abstraction, land-use changes, and climate variability have significantly altered natural discharge and flow patterns within groundwater systems, threatening long-term groundwater sustainability. These disruptions increase the risk of degradation in ecosystems that rely directly or indirectly on groundwater discharge, i.e., groundwater-dependent ecosystems (GDEs).

Mexico is particularly vulnerable to declining water table levels, a situation accelerated by gaps in groundwater management that fail to incorporate GDEs into decision-making processes. This issue is especially critical in northeastern Michoacán, home to two of the country’s largest lakes, Pátzcuaro and Cuitzeo, which represent a key area for studying the growing threats to GDEs posed by pollution, climate change, and intensive groundwater abstraction. To preserve GDEs, along with their associated biodiversity and ecosystem services, accurate mapping is essential to secure their future integration into groundwater sustainability policies and conservation initiatives.

To address this issue, we compared four methods commonly used in geospatial mapping: the Analytical Hierarchy Process (AHP), Weights of Evidence (WoE), and two machine learning models, Logistic Regression (LR) and Random Forest (RF), using environmental variables associated with GDE presence obtained from geospatial data and remote sensing products.

Model performance was evaluated using a validation dataset derived from local inventories and fieldwork conducted in 2024, applying Receiver Operating Characteristic (ROC) curves and the Area Under the Curve (AUC) metric. Results showed that RF (AUC = 0.82) and LR (AUC = 0.70) outperformed WoE (AUC = 0.61) and AHP (AUC = 0.59), with RF demonstrating the highest predictive accuracy and best performance in cross-validation folds.
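As a schematic of this evaluation step, a minimal scikit-learn sketch comparing RF and LR by AUC; the synthetic data stand in for the study's environmental variables and validation labels:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score

# Synthetic stand-in for the GDE presence/absence dataset
X, y = make_classification(n_samples=500, n_features=8, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

for model in (RandomForestClassifier(random_state=0),
              LogisticRegression(max_iter=1000)):
    model.fit(X_tr, y_tr)
    auc = roc_auc_score(y_te, model.predict_proba(X_te)[:, 1])
    print(type(model).__name__, round(auc, 2))
```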

The GDE prediction map derived from RF highlights areas primarily along the shores of both lakes, where volcanic lithology is in contact with lacustrine deposits, inducing groundwater discharge through springs that sustain wetlands. Additional GDE areas occur along fault zones that enhance discharge within volcanic lithology near Morelia City and in perennial streams located at intermediate elevations.

The study faces limitations related to varying spatial resolutions, independent errors in geospatial datasets, and uneven data quality across local zones within the study area. Furthermore, the absence of direct field verification for areas with the highest predicted GDE potential constrains the overall impact of the study. Nevertheless, this research provides significant evidence of the advantages of using machine learning approaches in regions lacking detailed hydrogeological information, supporting the integration of GDEs into groundwater sustainability management.

 

How to cite: Salgado-Albiter, A. C., Olea-Olea, S., Ramírez-Serrato, N. L., Morales-Casique, E., Ramírez-González, L., and Llanos-Solis, A. G.: Comparison of expert-knowledge and machine learning approaches for mapping groundwater-dependent ecosystems in a regional setting in Central Mexico, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-4669, https://doi.org/10.5194/egusphere-egu26-4669, 2026.

EGU26-4811 | Posters on site | HS2.1.3

Graph-based machine learning approach for river water quality prediction under data limitations 

Sueryun Choi, Eun-hee Jung, Hyeong-Soon Shin, Jin-Ho Song, Hanjo You, HaeJun Son, Intae Choi, Jihoon Yang, and Hee-Cheon Moon

Accurate prediction of river water quality is essential for effective watershed management, yet it is often hindered by practical monitoring constraints, including infrequent grab sampling (e.g., monthly observations) and the lack of reliable streamflow data. These limitations restrict the applicability of conventional process-based water-quality models and necessitate alternative analytical tools. In this study, we propose a graph-based machine learning framework that integrates prediction and diagnostic analyses of river water quality, with chromaticity prediction in the Hantan River Basin, Republic of Korea, as a case study. Graph-based models outperformed purely temporal baselines, with the Graph Sample-and-Aggregate (GraphSAGE) model achieving a test R² of 0.82. Its sampling-based spatial aggregation integrates localized and distributed upstream information across the river network, allowing the model to capture nonlinear relationships mediated by implicit flow connectivity. Graph explanation analyses using PGExplainer identify the SC sub-watershed as the dominant pollution source and primary intervention area. In addition, feature attribution analyses distinguish persistent long-term drivers (e.g., TOC associated with major wastewater treatment plant discharges) from short-term episodic influences linked to facility-specific effluent spikes. Overall, these results demonstrate that graph-based machine learning can serve as a useful framework for both prediction and diagnostic interpretation of key water-quality drivers in data-limited river systems.
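To make the spatial-aggregation idea concrete, below is a minimal PyTorch Geometric sketch of a two-layer GraphSAGE regressor on a toy river network; the architecture, feature counts, and edge layout are illustrative assumptions, not the authors' configuration:

```python
import torch
from torch_geometric.nn import SAGEConv

class SageRegressor(torch.nn.Module):
    """Two-layer GraphSAGE that regresses a water-quality target per node."""
    def __init__(self, in_dim, hidden=32):
        super().__init__()
        self.conv1 = SAGEConv(in_dim, hidden)
        self.conv2 = SAGEConv(hidden, hidden)
        self.head = torch.nn.Linear(hidden, 1)

    def forward(self, x, edge_index):
        h = torch.relu(self.conv1(x, edge_index))   # aggregate sampled neighbours
        h = torch.relu(self.conv2(h, edge_index))   # widen the receptive field upstream
        return self.head(h).squeeze(-1)

# Toy river network: 4 monitoring nodes, edges directed downstream (0->1->2->3)
x = torch.randn(4, 6)                               # 6 features per node
edge_index = torch.tensor([[0, 1, 2], [1, 2, 3]])
pred = SageRegressor(in_dim=6)(x, edge_index)
```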

How to cite: Choi, S., Jung, E., Shin, H.-S., Song, J.-H., You, H., Son, H., Choi, I., Yang, J., and Moon, H.-C.: Graph-based machine learning approach for river water quality prediction under data limitations, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-4811, https://doi.org/10.5194/egusphere-egu26-4811, 2026.

EGU26-5433 | ECS | Posters on site | HS2.1.3

Proposing a Deep Learning based Regional Goodness-of-Fit test for identification of regional distribution  

Sukhsehaj Kaur and Sagar Rohidas Chavan

Regional frequency analysis relies heavily on robust goodness-of-fit (GOF) testing for selecting an appropriate probability distribution, which directly influences the accuracy of estimated quantiles. However, existing statistical approaches often involve strong assumptions and computational overheads that limit their effectiveness, particularly for large regional datasets. The widely used L-moment-based approach requires scaling each site’s data by its own mean, which raises concerns about potential distortion of the original distributional characteristics. To overcome this limitation, the present study proposes a novel Deep Learning (DL)-based GOF test that identifies the regional distribution without performing mean-based scaling. The proposed methodology employs a Deep Neural Network (DNN) trained to classify regional distributions based on the distinctive behavior of Generalized Extreme Value, Generalized Pareto, Generalized Logistic, Generalized Normal, and Pearson Type III distributions under specific mathematical transformations. These transformations yield distribution-specific signatures that form the basis of the DNN training process. For a given dataset, the transformations are applied, and kernel density estimates derived from the transformed data are used as inputs to a pre-trained DNN model to identify the most suitable regional distribution. The DNN classifier achieved an accuracy of 95.09% on the training dataset and 94.86% on the test dataset. A comprehensive simulation study was conducted for multiple regional configurations to assess the performance of the proposed DL-based GOF test. The results were compared against the conventional L-moment-based GOF approach. The proposed method demonstrated comparable classification accuracy for smaller region sizes and marginally improved accuracy for larger datasets. The proposed DL-based GOF framework shows significant promise, particularly due to its substantially lower computational cost compared to the conventional L-moment methodology. The findings suggest that this approach can facilitate accurate and efficient estimation of quantiles, thereby supporting informed decision-making in planning, management, and risk assessment.
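As a schematic of the classification idea (omitting the authors' distribution-specific transformations), a minimal sketch that trains a neural network on kernel density signatures of samples drawn from the five candidate families; all shape parameters, grid, and sizes are illustrative:

```python
import numpy as np
from scipy import stats
from sklearn.neural_network import MLPClassifier

families = {0: stats.genextreme(c=-0.1), 1: stats.genpareto(c=0.1),
            2: stats.genlogistic(c=1.0), 3: stats.gennorm(beta=2.0),
            4: stats.pearson3(skew=1.0)}
grid = np.linspace(-3, 6, 64)

def kde_signature(sample):
    """Kernel density evaluated on a fixed grid -> fixed-length feature vector."""
    return stats.gaussian_kde(sample)(grid)

X, y = [], []
for label, dist in families.items():
    for _ in range(200):                      # 200 synthetic "regions" per family
        X.append(kde_signature(dist.rvs(size=300)))
        y.append(label)

clf = MLPClassifier(hidden_layer_sizes=(64, 32), max_iter=500, random_state=0)
clf.fit(np.array(X), y)                       # predicts the regional distribution family
```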

How to cite: Kaur, S. and Chavan, S. R.: Proposing a Deep Learning based Regional Goodness-of-Fit test for identification of regional distribution , EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-5433, https://doi.org/10.5194/egusphere-egu26-5433, 2026.

Lakes are essential assets for the inhabitants of our planet, since they are vital sources of water. They become even more crucial in regions where water is not easily available, such as the Himalayas and drought-prone or arid regions. However, climate change has given rise to two problems at once: water scarcity in arid or drought-prone regions due to the rapid disappearance of some lakes, and flood devastation in the Himalayas due to overtopping of vulnerable lakes. Climate extremes cannot alone be blamed for the extinction of these lakes; overexploitation, improper maintenance, and a lack of civic responsibility have also exacerbated the process. Meanwhile, catastrophic events from these lakes, called Glacial Lake Outburst Floods (GLOFs), mostly occur due to extreme rainfall events causing repeated expansion and contraction of the lakes. These extreme events are becoming more intense and frequent under climate change and are expected to increase in the future, making the lakes more vulnerable and more likely to trigger such events. It is essential to monitor lake water dynamics not only for sustainable water resources management but also for mitigating the risk of future catastrophic events. Yet monitoring lakes is not always easy, either because of data scarcity in the catchments or because in-situ measurements are impossible in inaccessible terrain such as the Himalayas. The availability and accessibility of advanced satellite remote sensing data, such as altimetry and space-borne Light Detection and Ranging (LiDAR), have enabled lake monitoring, although their processing demands modern approaches. Hence, the present study aims to develop a machine learning model integrated with a geospatial approach to process these advanced remote sensing data for spatio-temporal monitoring of lake water dynamics. The study utilizes ICESat-2 as space-borne LiDAR and Surface Water and Ocean Topography (SWOT) as wide-swath altimeter data. It provides reliable and precise remote-sensing-derived Water Surface Elevation (WSE) for lakes at spatial and temporal scales. The derived WSE would help identify vulnerable lakes and inform robust policies addressing both lake problems: water scarcity in drought-prone or arid regions, and catastrophic events from glacial lakes in regions such as the Himalayas. Furthermore, the developed model would be readily applicable to any lake, although finer adjustments may be required for different topographic conditions.

Keywords: Lake water dynamics, Space-borne LiDAR, Altimeter, Machine Learning, and Geospatial.

How to cite: Ranjan, R., Rai, A. K., Dhote, P. R., and Keshari, A. K.: Leveraging Advanced Remote Sensing with Machine Learning and Geospatial Techniques for Spatio-Temporal Monitoring of Lake Water Dynamics in Inaccessible and Data-Scarce Catchments , EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-6020, https://doi.org/10.5194/egusphere-egu26-6020, 2026.

Water quality monitoring in subsurface environments is often limited by sparse, irregular, and uncertain measurements, complicating the calibration and reliability of transport models. In this study, we propose a Finite Volume (FV) residual Physics-Informed Neural Network (PINN) framework for contaminant transport through subsurface media governed by the advection-dispersion equation (ADE), with a focus on generating predictions under parametric uncertainty for data-scarce environments. The core idea is to replace the strong-form PDE residual typically used in PINNs with a control-volume conservation imbalance derived from a discrete FV balance. Neural network predictions are used to evaluate advective and dispersive numerical fluxes at cell faces, and training minimizes the resulting cell-wise flux imbalance while enforcing initial and boundary conditions. This conservative formulation enables transport-specific numerical flux treatments (e.g., upwind/TVD advection and consistent boundary fluxes), and we assess performance for advection-dominated systems with sharp concentration fronts.
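A minimal PyTorch sketch of the core idea, penalizing the cell-wise flux imbalance of a 1-D ADE with upwind advective and central dispersive fluxes; the network size, coefficients, and grid are illustrative assumptions, and the initial/boundary loss terms are omitted:

```python
import torch

net = torch.nn.Sequential(torch.nn.Linear(2, 64), torch.nn.Tanh(),
                          torch.nn.Linear(64, 64), torch.nn.Tanh(),
                          torch.nn.Linear(64, 1))
v, D, dx = 1.0, 0.01, 0.02
xc = torch.arange(dx / 2, 1.0, dx)                 # cell centres on [0, 1]

def fv_residual(t):
    """Cell-wise flux imbalance of the 1-D ADE at time t (upwind advection)."""
    tt = torch.full_like(xc, t).requires_grad_(True)
    c = net(torch.stack([xc, tt], dim=1)).squeeze(-1)
    dc_dt = torch.autograd.grad(c.sum(), tt, create_graph=True)[0]
    # Numerical fluxes at interior faces: upwind advection + central dispersion
    flux = v * c[:-1] - D * (c[1:] - c[:-1]) / dx
    imbalance = dc_dt[1:-1] + (flux[1:] - flux[:-1]) / dx
    return (imbalance ** 2).mean()

loss = fv_residual(0.5)   # add initial/boundary-condition terms before optimising
loss.backward()
```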

To represent heterogeneity and uncertainty in dispersion, we parameterize the dispersion coefficient as a strictly positive random field using a low-dimensional basis. Uncertainty is propagated through the learned surrogate using Monte Carlo sampling to obtain prediction intervals and monitoring-relevant risk metrics such as threshold exceedance probabilities at selected locations. We outline two uncertainty workflows: (i) an ensemble strategy that trains FV-PINN models across sampled dispersion realizations, and (ii) a prospective conditional FV-PINN that takes random-field coefficients as additional inputs, enabling efficient Monte Carlo evaluation after a single training stage. The application of the methodology is demonstrated on simple benchmark examples designed to represent sparse monitoring data, showing how conservative learning and random-field uncertainty propagation can support reliable transport predictions when observations are limited.

How to cite: Jain, S., Dey, S., and Chahar, B. R.: A Conservative FV-Residual PINN Framework for Solute Transport through Subsurface Media with Dispersion Uncertainty for Data-Scarce Environments, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-6256, https://doi.org/10.5194/egusphere-egu26-6256, 2026.

Pharmaceuticals, ubiquitous in human, veterinary and agricultural use, are prevalent emerging contaminants in Chinese surface waters. Although not highly persistent, their low removal in conventional wastewater treatment leads to continuous discharge, creating "pseudo-persistence." This chronic exposure poses significant ecological and human health risks, including hormonal disruption of female reproduction and antibiotic-induced gut microbiota alterations and antimicrobial resistance in aquatic biota.

Numerous pharmaceuticals (>100) have been detected in China's surface waters. However, clear regulatory priorities are lacking, and nationwide monitoring is insufficient, leaving many regions without concentration or risk data. This study aims to: (1) identify pharmaceuticals posing the highest human and environmental hazards; (2) develop nationwide predictive concentration models using machine learning; and (3) generate a health risk map for pharmaceuticals in China's surface waters.

Through systematic keyword searches in Web of Science and CNKI, we compiled data from 227 peer-reviewed articles (2010-2023), covering approximately 13,000 sampling sites across China's nine major river basins. Pharmaceutical concentrations, detection frequencies, and sampling metadata were extracted. To assess environmental behavior and risks, four key indicators were selected: octanol-water distribution coefficient (LogDow) for bioaccumulation potential, degradation half-life (T1/2) for persistence, predicted no-effect concentration for aquatic ecosystems (PNECeco) for ecotoxicity, and predicted no-effect concentration for human exposure (PNEChum) through drinking water and fish consumption.

Principal component analysis (PCA) integrated the four indicators into a composite hazard score (HP) and combined concentration and detection frequency into an exposure potential score (EP). Pharmaceuticals were preliminarily screened based on reference thresholds for HP and EP values, and then ranked by the product of HP and EP to establish priority control lists for each river basin. Roxithromycin and erythromycin, exhibiting high toxicity and extensive data, ranked highest across all basins. Antibiotics were consistently high-priority in all nine basins. In densely populated basins (Haihe, Yangtze, Pearl), bezafibrate, indomethacin, and ibuprofen require additional attention. Hormones (estrone, estriol, ethinylestradiol) showed elevated concentrations and risks in the Songhua and Liao basins. Increased monitoring is strongly recommended for data-scarce inland basins.

Four representative pharmaceuticals (erythromycin, ciprofloxacin, norfloxacin, carbamazepine), selected based on high toxicity or exposure potential, were modeled nationally. Predictors included 27 variables across five categories: Socioeconomic, Healthcare, Agricultural and aquacultural, Natural environmental, and Water quality indicators. Seven machine learning algorithms were evaluated (DT, ExtraTrees, GB, KNN, RF, SVM, XGBoost). RF demonstrated superior performance and was selected for feature selection (via weighted backward stepwise regression) and hyperparameter tuning (grid search with 10-fold CV). The optimal model was chosen based on R² and RMSE.
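As a schematic of the tuning step, a scikit-learn sketch of a Random Forest grid search with 10-fold cross-validation; the synthetic data stand in for the 27-predictor concentration dataset, and the parameter grid is illustrative:

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import GridSearchCV

# Synthetic stand-in for the 27-predictor pharmaceutical concentration dataset
X, y = make_regression(n_samples=400, n_features=27, noise=0.5, random_state=0)

search = GridSearchCV(
    RandomForestRegressor(random_state=0),
    param_grid={"n_estimators": [200, 500], "max_depth": [None, 10, 20]},
    scoring="r2", cv=10)                 # 10-fold CV, as in the study
search.fit(X, y)
print(search.best_params_, round(search.best_score_, 2))
```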

Predicted concentrations were then input into the USEPA-recommended human health risk assessment model. Carbamazepine, ciprofloxacin, and norfloxacin exhibited low risks nationwide (HQ < 1). Erythromycin exceeded safe levels (HQ > 1) in eastern regions (Yangtze River Delta, Bohai Rim, Pearl River Delta). Spatially, erythromycin and norfloxacin risks displayed a distinct east-west gradient (higher east), while carbamazepine and ciprofloxacin showed minimal spatial variation.

How to cite: Li, J.: Nationwide Prioritization and Machine Learning-Based Risk Prediction of Pharmaceuticals in China's Surface Waters, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-6360, https://doi.org/10.5194/egusphere-egu26-6360, 2026.

EGU26-10106 | ECS | Orals | HS2.1.3

A Multiscale Interpretation of Memory-Driven Anomalous Sediment Transport 

Hsuan Hung Wu and Christina W Tsai

Anomalous sediment transport is often observed in turbulent flows. Under these conditions, particle motion frequently deviates from the classical Fickian diffusion assumption due to long-term correlations and complex interactions between flow and sediment. Although many models have been developed to describe this behavior, it remains challenging to link particle-scale dynamics, field-scale transport processes, and statistical descriptions of concentration distributions within a single physical framework. As a result, parameters used in statistical or fractional-order models are often obtained through empirical fitting, and their physical interpretations remain unclear.

This study presents a multiscale framework for interpreting memory-driven anomalous sediment transport by linking particle dynamics, continuum transport behavior, and statistical descriptions. At the particle scale, a Fractional Sediment Diffusion Particle Tracking Model (FSDPTM) is employed to simulate sediment motion with temporal memory. Under this setting, anomalous diffusion emerges from non-Markovian particle dynamics. The mean-square displacement (MSD) is then analyzed to quantify anomalous transport behavior at the particle scale and to describe the strength of temporal correlations.
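For readers unfamiliar with the diagnostic, the MSD and its anomalous scaling exponent are conventionally defined as follows (the standard definition, consistent with its use in this abstract):

```latex
\mathrm{MSD}(t) = \big\langle \, |x(t) - x(0)|^{2} \, \big\rangle \propto t^{\alpha},
\qquad
\begin{cases}
\alpha < 1 & \text{subdiffusion (long-memory trapping)}\\
\alpha = 1 & \text{classical Fickian diffusion}\\
\alpha > 1 & \text{superdiffusion}
\end{cases}
```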

At the macroscopic scale, transient concentration fields obtained from particle trajectories are used to guide the fractional advection–diffusion equation (FADE). This step connects the particle-scale memory effect with the field-scale Eulerian description. Since experimental observations of transient concentration evolution are often difficult to obtain, the proposed method focuses on cross-scale internal consistency rather than direct data fitting. The steady-state concentration profiles produced by the particle model are then compared with laboratory measurements to assess whether the long-term transport behavior is physically reasonable.

Building on the validated steady-state profiles, a fractional entropy formulation is used to describe the statistical structure of sediment concentration distributions. The entropy parameter is not an empirical fitting coefficient; rather, it is interpreted as a potential indicator reflecting the cumulative effects of memory-driven transport processes. By comparing the mean-square displacement (MSD) at the particle scale, the FADE parameters at the field scale, and the entropy-based description, this study demonstrates that the entropy parameter may be related to anomalous transport characteristics associated with long-term particle memory.

Overall, this study presents a multiscale interpretation of anomalous sediment transport in which particle dynamics, continuum transport equations, and statistical descriptions are treated in a mutually consistent manner. The results suggest that entropy-based parameters may have the potential to serve as compact and physically interpretable indicators of anomalous transport intensity. This framework provides a structured approach for connecting transport dynamics across scales and for extracting physical insights from limited observable information.

Keywords: Anomalous diffusion; Memory-driven transport; Multiscale processes; Fractional dynamics; Particle-based modeling; Statistical characterization

How to cite: Wu, H. H. and Tsai, C. W.: A Multiscale Interpretation of Memory-Driven Anomalous Sediment Transport, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-10106, https://doi.org/10.5194/egusphere-egu26-10106, 2026.

Over the past two decades, microplastics (MPs) pollution has been recognized as a significant risk to public health and to a wide range of environments, particularly riverine, estuarine, and oceanic systems. However, much of the existing research on MPs has focused primarily on large-scale transport behavior in ocean zones using deterministic approaches. Consequently, many of the underlying fundamental principles governing the transport mechanisms of MPs and their fate in open channel flows remain poorly understood. Unlike sediments, which generally settle downward, MPs exhibit far greater variability in physical properties, including material composition, shape, size, drag, and density. Some MPs are even lighter than water, leading to upward or buoyant motion during transport and introducing additional complexity to the governing hydrodynamics.

To account for the geometric irregularity of particles, this study employs a stochastic diffusion particle tracking model (SD-PTM) that incorporates a modified vertical velocity formula to better represent the effects of inertial and viscous drag forces on MPs. In this framework, the movement of suspended MPs in open channel flow is represented as a stochastic process composed of a drift term and a random term. In addition, a genetic algorithm (GA) is applied to optimize the drag coefficients, thereby enhancing model robustness under data-limited conditions.
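A minimal Euler-Maruyama sketch of the drift-plus-noise particle update described above, with constant coefficients; the authors' modified vertical velocity formula and GA-calibrated drag are not reproduced here, and the boundary handling is deliberately crude:

```python
import numpy as np

rng = np.random.default_rng(0)
n_particles, n_steps, dt = 1000, 500, 0.01
ws, D = -0.005, 1e-3   # vertical drift (negative = settling, positive = buoyant rise); diffusivity

z = np.full(n_particles, 0.5)                 # initial normalized particle depths
for _ in range(n_steps):
    drift = ws * dt                           # deterministic drift term
    noise = np.sqrt(2 * D * dt) * rng.standard_normal(n_particles)
    z = np.clip(z + drift + noise, 0.0, 1.0)  # crude containment; real models use reflecting/absorbing rules
```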

Compared with traditional models without consideration of MPs’ physical properties, the proposed modified stochastic model investigates not only the settling motion of MPs, but also extends, for the first time, stochastic modeling approaches to buoyant particles. The model results are compared with the experimental data provided by Born et al. (2023) across a range of flow conditions to calibrate the model coefficients. This study offers a new perspective on both rising and settling MP motion, thereby advancing the understanding of microplastic fate and transport in open channel flows.

How to cite: Chen, M. T. and Tsai, C. W.: Modified Stochastic Model for Settling and Rising Microplastic Transport in Open Channel Flows, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-10114, https://doi.org/10.5194/egusphere-egu26-10114, 2026.

EGU26-10246 | Orals | HS2.1.3

Three-Layer Ornstein–Uhlenbeck Model for Turbulent Flow Simulation 

Cheng Yu Chen and Christina W Tsai

This study develops a three-layer embedded Lagrangian stochastic (LS) model for simulating suspended sediment transport in open-channel flows. The model describes particle motion at three levels: position, velocity, and acceleration, using multiple Ornstein–Uhlenbeck (OU) processes within a coupled stochastic system. This construction preserves intrinsic stochasticity while allowing the velocity process to be differentiated in time to obtain particle acceleration, enabling a consistent description of particle motion at small time scales.

In conventional LS models, random forcing is typically represented by a Wiener process. Since this process is nowhere differentiable, it limits the interpretation of higher-order kinematic quantities. In this study, an embedded Ornstein–Uhlenbeck formulation is employed, where the random forcing is described by a finite-order system of coupled stochastic ordinary differential equations. Compared with conventional two-layer LS models, the three-layer formulation produces smoother Lagrangian velocity trajectories by improving the differentiability of the velocity process. This formulation reduces abrupt fluctuations in the simulated velocity signal and allows acceleration to remain finite and well-behaved.
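A minimal sketch of the layered idea, in which only the innermost (acceleration) layer is driven by white noise so that the simulated velocity is differentiable; the time scales and noise amplitude are illustrative assumptions, not the calibrated values:

```python
import numpy as np

rng = np.random.default_rng(1)
dt, n = 1e-3, 20000
tau_v, tau_a, sigma = 0.5, 0.05, 1.0   # velocity/acceleration decorrelation times; noise scale

x = v = a = 0.0
xs, vs = [], []
for _ in range(n):
    # Only the acceleration layer sees white noise, so v(t) is differentiable
    a += (-a / tau_a) * dt + sigma * np.sqrt(dt) * rng.standard_normal()
    v += (-v / tau_v) * dt + a * dt    # velocity relaxes and integrates the OU acceleration
    x += v * dt                        # position integrates velocity
    xs.append(x); vs.append(v)
```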

As a result, the model provides a clearer basis for describing short-time-scale particle motion and for exploring rapid turbulent effects near the bed. Model parameters are determined based on laboratory experimental data and commonly used turbulence scaling relations reported in the literature.

Overall, the proposed framework provides a stochastic description of particle motion that allows velocity and acceleration to be consistently represented at small time scales and offers a basis for further investigation of near-bed particle behavior and suspended sediment transport processes.

How to cite: Chen, C. Y. and Tsai, C. W.: Three-Layer Ornstein–Uhlenbeck Model for Turbulent Flow Simulation, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-10246, https://doi.org/10.5194/egusphere-egu26-10246, 2026.

The overexploitation of groundwater has emerged as a critical environmental issue due to the increasing pressure placed on this vital freshwater resource by rapid urbanization and population growth. Understanding future groundwater availability near urban expansion is essential for sustainable urban planning and water-resource management. This study investigates the influence of land-cover change on groundwater depletion while also examining the spatial patterns of urban growth and their effects on surface thermal conditions using Land Surface Temperature (LST) and the Normalized Difference Vegetation Index (NDVI). Groundwater storage variations were monitored using data from the Gravity Recovery and Climate Experiment (GRACE), while Landsat imagery was used to derive land-cover maps, NDVI, and LST. To assess the relationship between climate variability and groundwater recharge, GRACE-derived groundwater storage anomalies were correlated with precipitation data obtained from the Global Precipitation Measurement (GPM) mission. Time-series analyses of groundwater storage and land-cover changes were conducted at five-year intervals from 1990 to 2025 to quantify the impacts of urbanization on groundwater dynamics. The results reveal a significant acceleration in groundwater depletion and urban expansion over the past decade. Concurrently, LST exhibits an increasing spatial trend that closely corresponds with declining vegetation cover and expanding built-up areas, indicating that urbanization has contributed substantially to rising surface temperatures. These findings underscore the urgent need for effective groundwater management policies and integrated urban planning strategies to ensure the long-term sustainability of freshwater resources.
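As a schematic of the climate-recharge step, a short pandas sketch correlating GRACE-derived storage anomalies with GPM precipitation; the monthly values below are placeholders, not observations:

```python
import pandas as pd

# Hypothetical monthly series: GRACE groundwater storage anomaly (cm) and GPM precipitation (mm)
df = pd.DataFrame({
    "gws_anomaly_cm": [1.2, 0.8, -0.3, -1.1, -0.6, 0.4],
    "precip_mm":      [140, 95, 40, 22, 55, 120]})

# Pearson correlation between recharge-relevant precipitation and storage anomalies
r = df["gws_anomaly_cm"].corr(df["precip_mm"])
```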

How to cite: Ali, M. Z. and Benaafi, M.: Impact of Urbanization on Groundwater Storage and Surface Temperature Changes: A Case Study of Riyadh, Saudi Arabia, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-10496, https://doi.org/10.5194/egusphere-egu26-10496, 2026.

EGU26-11882 | ECS | Orals | HS2.1.3

Regional Annual Flow Estimation by Machine Learning Tool in QGIS for Data-Scarce Catchments 

Cristiano Guidi, Alena Seidenfaden, Philip Marzahn, and Jens Tränckner

Within the APRIORA project, an open-source, geospatial QGIS plugin was developed to support the implementation of the EU Urban Wastewater Treatment Directive in 2025 by assessing environmental risks from human pharmaceuticals. This multidisciplinary deterministic model estimates annual loads from wastewater treatment plants, distributes them spatially through river networks and calculates the Predicted Environmental Concentration (PEC) for each reach.

The practical application of the tool encountered a key limitation in data-scarce regions, where missing catchment-scale flow data and active pharmaceutical ingredient (API) consumption data prevented the calculation of PECs. Existing hydrological models often present barriers due to high computational demands, intensive calibration needs and strict data requirements. To bridge this gap, a new, integrated hydrological module for the QGIS plugin was developed, offering a flexible, efficient solution that operates with minimal and easily accessible geospatial inputs. In this way, the tool became applicable in data-scarce catchments of the project with limited monitoring networks, such as Poland and Latvia.

The module consists of four tools designed to operate sequentially. The first, “Fix river network”, establishes topological contributing relationships between river sections. The second, “Contributing area of gauging station”, delineates subcatchments contributing to any available stream gauges, defining the areas for model calibration and validation. This step can be omitted in fully ungauged catchments. The third, “Calculate geofactors”, computes physiographic and climatic predictors (e.g., mean elevation, slope, share of forest and settlement area, mean annual precipitation) for each subcatchment. It is important to note that the model makes use of freely available continental-scale datasets (e.g., Copernicus DEM (30m resolution), Corine Land Use Land Cover (100m resolution) and ERA5 monthly total precipitation) thereby ensuring its applicability in regions where data is scarce. The fourth tool, “Flow estimation”, employs a machine learning approach (specifically a Random Forest Regressor) where the previously calculated geofactors act as independent variables to predict the flow measured in gauged subcatchments.
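A scikit-learn sketch of the fourth tool's core idea, fitting a Random Forest Regressor on geofactors of gauged subcatchments and predicting flow elsewhere; the column names and values are illustrative placeholders, not the plugin's schema:

```python
import pandas as pd
from sklearn.ensemble import RandomForestRegressor

# Hypothetical geofactor table for gauged subcatchments (names and values illustrative)
gauged = pd.DataFrame({
    "mean_elev_m": [120, 340, 80, 210], "mean_slope_pct": [3.1, 7.4, 1.2, 5.0],
    "forest_share": [0.42, 0.65, 0.12, 0.30], "settlement_share": [0.05, 0.02, 0.20, 0.08],
    "annual_precip_mm": [620, 850, 540, 700], "annual_mean_flow_m3s": [1.8, 4.2, 0.9, 2.6]})

X = gauged.drop(columns="annual_mean_flow_m3s")
rf = RandomForestRegressor(n_estimators=500, random_state=0)
rf.fit(X, gauged["annual_mean_flow_m3s"])

new_catchment = X.iloc[:1]   # stand-in row; in practice, geofactors of an ungauged subcatchment
print(rf.predict(new_catchment))
```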

In order to guarantee its applicability in regions without local gauges, the tool allows the use of pre-calibrated, averaged model parameters derived from the project’s partner countries. This provides a transferable solution despite underlying regional hydrological uncertainties. The model estimates annual mean flow and annual mean low flow for regional river sections. This temporal resolution aligns with annual API consumption statistics and also represents the worst-case condition for pollution dilution and environmental risks.

In this presentation, we will present the tool itself as well as results from three different Baltic Sea catchments.

 

Acknowledgement - The authors thank the Interreg Baltic Sea Region funding programme – co-funded by the European Union (ERDF) – and all the APRIORA project partners contributing to this work.

How to cite: Guidi, C., Seidenfaden, A., Marzahn, P., and Tränckner, J.: Regional Annual Flow Estimation by Machine Learning Tool in QGIS for Data-Scarce Catchments, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-11882, https://doi.org/10.5194/egusphere-egu26-11882, 2026.

EGU26-11919 | Orals | HS2.1.3

The value of data in reducing uncertainty in mountain groundwater modeling 

Alberto Bellin, Andrea Betterle, and Mariaines Di Dato

Mountain aquifers are receiving increasing attention as a key component of the so-called water towers. They sustain important freshwater ecosystems and river flow during droughts, and are a key water resource for populations living in mountain valleys and the nearby floodplains. These aquifers are exposed to emerging pollutants, such as pharmaceuticals, PFAS, and microplastics, whose adverse effects on ecosystems and human health are exacerbated by overexploitation. The interaction between surface and subsurface waters increases the risk of groundwater contamination by untreated sewage waters, and in several cases also by treated waters, because in most countries sewage treatment systems are not yet designed to remove pharmaceuticals and emerging contaminants. A significant challenge that modelers face when dealing with these systems is the endemic lack of data to constrain the models, which limits their reliability in risk analysis and in the comparison of the effectiveness of alternative remediation actions. An example of application in a mountain valley aquifer of northeastern Italy is used to discuss how to make convenient use of available data to reduce the uncertainty affecting groundwater modeling in such environments, where lateral fluxes stemming from hillslopes and surface/subsurface water exchange fluxes are difficult to constrain and are a source of large uncertainties in modeling both groundwater availability and contaminant transport. In particular, we explored the gain in model consistency that can be obtained by supplementing groundwater head data with geochemical data and groundwater concentration data of a target contaminant at a few controlling groundwater wells. The geochemical data refer to river water and to springs emerging from the lateral hillslopes. Electrical conductivity and other geochemical data typically collected as part of the standard water quality monitoring performed by Environmental Protection Agencies may help in reducing the uncertainty in the lateral and surface/subsurface exchange fluxes and in improving the reliability of the transport model, when used in combination with contaminant concentration data at the available groundwater monitoring wells. The analysis suggests that considering the valley aquifer as part of a more complex system, including the contribution of the lateral mountain aquifers and the exchange with surface water, is an opportunity to produce realistic models rather than an unnecessary complication.

How to cite: Bellin, A., Betterle, A., and Di Dato, M.: The value of data in reducing uncertainty in mountain groundwater modeling, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-11919, https://doi.org/10.5194/egusphere-egu26-11919, 2026.

EGU26-16044 | Orals | HS2.1.3

Deep Learning-Driven Hyperspectral Data Fusion for Real-Time Water Quality Monitoring 

Daeun Yun, Na-Hyeon Gwon, Jinyoung Jung, and Sang-Soo Baek

Water quality monitoring is essential for addressing water contamination and ensuring public safety. In particular, managing nitrate levels has become a major concern due to their direct impact on eutrophication. Despite the high accuracy of conventional analysis methods, their practical application is often limited by high costs, labor-intensive processes, and a lack of real-time monitoring capabilities. This study presents a novel framework for real-time water quality monitoring by integrating hyperspectral and multi-sensor data through deep learning-based data fusion. The multi-sensor data include pH, electrical conductivity (EC), dissolved oxygen (DO), and oxidation-reduction potential (ORP). A transformer-based deep learning model was applied to predict water quality concentrations by capturing correlations within time-series hyperspectral absorbance and multi-sensor data. Furthermore, transfer learning was employed to improve performance in target domains by transferring the information contained in a pre-trained model. The data-fusion transformer model predicted water quality concentrations with high accuracy, achieving a coefficient of determination (R2) exceeding 0.99 in both deionized and tap water conditions. Specifically, the integration of multi-sensor data improved model robustness and performance compared to using spectral data alone. This research also demonstrated that transfer learning effectively supported the model in adapting to varying flow conditions. The proposed deep learning-based data-fusion framework provides a reliable solution for real-time water quality monitoring, with the aim of extending the model to predict multiple water quality parameters simultaneously.
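As a schematic of the transfer-learning step only (the study itself uses a transformer backbone), a TensorFlow sketch that pretrains a feature extractor, then freezes it and re-fits the prediction head on a target domain; shapes, layer names, and the simple dense backbone are illustrative assumptions:

```python
import tensorflow as tf

# Hypothetical fused input: 200 spectral bands plus 4 sensor channels (pH, EC, DO, ORP)
inputs = tf.keras.Input(shape=(200 + 4,))
h = tf.keras.layers.Dense(128, activation="relu", name="backbone_1")(inputs)
h = tf.keras.layers.Dense(64, activation="relu", name="backbone_2")(h)
out = tf.keras.layers.Dense(1, name="nitrate_head")(h)
model = tf.keras.Model(inputs, out)
model.compile(optimizer="adam", loss="mse")
# ... pretrain on the source domain (e.g., deionized-water measurements) ...

# Transfer: freeze the backbone, re-fit only the head on the target domain
for layer in model.layers[:-1]:
    layer.trainable = False
model.compile(optimizer="adam", loss="mse")   # recompile after changing trainability
```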

How to cite: Yun, D., Gwon, N.-H., Jung, J., and Baek, S.-S.: Deep Learning-Driven Hyperspectral Data Fusion for Real-Time Water Quality Monitoring, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-16044, https://doi.org/10.5194/egusphere-egu26-16044, 2026.

Accurate representation of transport processes is essential for understanding water quality dynamics in surface flow systems, particularly under turbulent conditions where observations are limited in space and time. In such environments, sediment and sediment-associated constituent transport is strongly influenced by multiscale turbulence, intermittency, and correlated particle dynamics, processes that are not adequately captured by conventional deterministic modeling approaches.

This study presents a Lagrangian stochastic framework for modeling particle transport in turbulent flows, with particular emphasis on addressing unresolved variability and the limited availability of Eulerian observations. Particle motion, entrainment, and dispersion are formulated using multivariate and multi-layer stochastic differential equations that explicitly incorporate turbulence-induced intermittency, particle memory, and scale-dependent correlations. Near-threshold sediment entrainment is represented through physically based probabilistic criteria, enabling the modeling of intermittent transport events that dominate sediment flux in regimes close to the threshold of sediment motion.

To capture relative dispersion and correlated motion driven by multiscale turbulent structures, the framework extends beyond single-particle formulations to include two-particle stochastic dynamics. Model development and validation are informed by Direct Numerical Simulation (DNS) data, which provide flow statistics for quantifying particle position, velocity, and correlation structures. This integration allows critical transport characteristics to be inferred even when field-scale monitoring data are limited in space or time.

The proposed stochastic approach provides a physically based framework for modeling the transport of particle-associated constituents in surface flows. By emphasizing process-based stochastic representations rather than data-intensive deterministic closures, it offers a robust pathway for advancing transport modeling in turbulent flows under data-limited conditions.

How to cite: Tsai, C.: Physically Based Lagrangian Stochastic Modeling of Particle Transport in Data-Limited Turbulent Flows , EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-17157, https://doi.org/10.5194/egusphere-egu26-17157, 2026.

High-precision and accurate runoff simulation is crucial for the management and allocation of water resources, the operation of hydraulic engineering, and the prevention of flood and drought disasters. However, there is currently no consensus on how to effectively filter and reshape the impact of the numerous external factors influencing runoff, and sufficient theoretical support is lacking. To maximize the accuracy of runoff simulation and better capture the internal hydrological characteristics of runoff, this study drew on the concept of granular computing from the field of artificial intelligence: terrain factors were extracted and their attribute features optimally selected based on granulation rules, and a Long Short-Term Memory (LSTM) model incorporating a climate characteristic index (LSTM-new) was developed based on delineated sub-regions. Finally, a unidirectional feedback framework was proposed, combining a process-driven method based on the Variable Infiltration Capacity (VIC) model with a data-driven method using the established LSTM (CouplingVIC-new), to enhance the hydrological process characteristics of the simulated runoff and improve simulation accuracy. The results showed that the average NSE, R2, KGE, and RMSE of CouplingVIC-new during the training, validation, and testing periods reached 0.93, 0.92, 0.91, and 334.86 m3/s, respectively, improvements of 7.29%, 2.97%, 9.73%, and -19.41% over the uncoupled LSTM, and of 13.41%, 12.19%, 19.73%, and -46.95% over VIC. Additionally, the proposed framework effectively captured the interannual variation trend of runoff in all seasons except late spring and summer, though it overestimated the risk of occurrence of the annual maximum daily peak flow (AMDPF), the total flood volume of the annual continuous maximum 5 days (TFAM5D), and their joint variables. The overall results indicated that the scheme of introducing a climate characteristic index, based on sub-region division, can more accurately capture extreme runoff in the study area, as well as the variation of seasonal runoff on both intra-annual and interannual scales. Although CouplingVIC-new still had limited ability to capture extreme flows, the structure of the extreme values of the output runoff became more robust after unidirectional coupling. This research can help expand the application of machine learning in hydrological modelling and provide a useful reference for related studies.
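One plausible reading of the unidirectional coupling, sketched in Keras, treats the VIC-simulated runoff as one additional input channel to the LSTM; the shapes, window length, and feature layout are illustrative assumptions, not the study's configuration:

```python
import numpy as np
import tensorflow as tf

# Toy shapes: 30-day windows of 5 meteorological forcings plus 1 VIC-runoff feature
T, F = 30, 6
X = np.random.rand(256, T, F).astype("float32")   # placeholder training windows
y = np.random.rand(256, 1).astype("float32")      # placeholder observed runoff

model = tf.keras.Sequential([
    tf.keras.Input(shape=(T, F)),
    tf.keras.layers.LSTM(64),
    tf.keras.layers.Dense(1)])
model.compile("adam", "mse")
model.fit(X, y, epochs=2, verbose=0)  # one-way coupling: VIC feeds the LSTM, not vice versa
```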

How to cite: Zhao, Y.: Runoff simulation based on granular computing by introducing terrain factors to construct climate characteristic index, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-18485, https://doi.org/10.5194/egusphere-egu26-18485, 2026.

EGU26-19227 | ECS | Orals | HS2.1.3

GIS-Based Assessment of Karstification Potential in Siargao Island, Philippines 

Riva Karyl Varela, Ed Dwight Barrios, Friendylle Bondad, and Yleiah Ann Cortejos

Karstic terrains are formed by the dissolution of carbonate rocks and are essential zones for groundwater reservoirs, but they are sensitive to geological and climatic conditions. Delineation and characterization of potential karst development sites are therefore necessary, particularly where limited data on karst development hinder accurate groundwater assessment, hazard mitigation, and sustainable land-use planning, as in remote areas such as Siargao Island, Philippines. By applying a data-driven geospatial framework that combines statistical analysis with Geographic Information System (GIS) techniques, it is possible to evaluate the island’s karstification potential in support of future water resource management strategies.

Principal Component Analysis (PCA) was applied to eight initially selected variables, which were reduced to four key components: geology, slope, precipitation, and vegetation. These components were used in a GIS-based multi-criteria evaluation to generate a karst potential map of Siargao Island. Results show strong spatial variability in karst development: high to very high potential zones occur in the southern and southeastern regions, characterized by mature cockpit karst, caves, and sinkholes. The eastern and western parts of the island, where transitional stages of karst development are present, exhibit moderate karstification potential. Non-carbonate areas with minimal karst expression in the central and northern regions showed low to very low potential. Field observations, existing geomorphological maps, and sinkhole inventory data were utilized for model validation, resulting in an overall accuracy of 80.6% and a Kappa coefficient of 0.44, indicating moderate agreement between the predicted and observed karst features.
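A NumPy sketch of the multi-criteria overlay step, combining normalized criterion rasters with PCA-informed weights; the weights and class breaks below are illustrative, not the study's values:

```python
import numpy as np

# Hypothetical normalized criterion rasters (values 0-1), e.g. 100x100 cells
rng = np.random.default_rng(0)
geology, slope, precip, veg = (rng.random((100, 100)) for _ in range(4))

# PCA-informed weights (illustrative values only)
w = {"geology": 0.40, "slope": 0.20, "precip": 0.25, "veg": 0.15}
karst_potential = (w["geology"] * geology + w["slope"] * slope
                   + w["precip"] * precip + w["veg"] * veg)

# Five ordinal classes: very low, low, moderate, high, very high
classes = np.digitize(karst_potential, bins=[0.2, 0.4, 0.6, 0.8])
```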

Through this approach, a cost-effective monitoring strategy for assessing groundwater resources and geohazards in data-scarce, remote areas with karstic terrains, such as Siargao Island, can be developed. The generated karst potential map provides a baseline for sustainable water resource management, groundwater protection, and land-use planning. Furthermore, this study demonstrates the use of geospatial and decision-support methods to strengthen hydrological management in remote environments.

How to cite: Varela, R. K., Barrios, E. D., Bondad, F., and Cortejos, Y. A.: GIS-Based Assessment of Karstification Potential in Siargao Island, Philippines, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-19227, https://doi.org/10.5194/egusphere-egu26-19227, 2026.

EGU26-19960 | ECS | Posters on site | HS2.1.3

Assessing urban water access in African cities: a GIS clustering approach in Malabo, Equatorial Guinea 

Manuel Rodríguez del Rosario, Severo Meñe Nsue-Mikue, Víctor Gómez-Escalonilla, Esperanza Montero-González, Silvia Díaz-Alcaide, and Pedro Martínez-Santos

Access to safe drinking water remains a daily challenge for millions of urban residents around the world, particularly in sub-Saharan Africa. This study provides a detailed assessment of inequalities in the realization of the human right to water in urban neighborhoods of Malabo, Equatorial Guinea. Clustering techniques combined with GIS analysis were used to map and assess access to water throughout the study area. The clustering results were compiled into a matrix assessing six key indicators: the physical availability of improved water sources; transport time; water quality; water quantity; reliability; and affordability. More than 500 household surveys were conducted and over 200 water points were sampled for this work. The results indicate that access to water is severely limited by poor quality, insufficient quantity and an unreliable supply. Fewer than 3% of households meet the standard for safely managed drinking water, and less than 22% have at least basic access, which contrasts sharply with official statistics. Considering these results in the context of the current literature highlights the importance of taking all relevant factors into account when making reliable estimates of water access. Current rates of access to this resource tend to be significantly lower than reported, and despite global progress, humanity is still far from fulfilling the fundamental human right to water. These findings emphasise the urgent need for targeted interventions to address inequalities and enhance the water supply in urban areas.
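A scikit-learn sketch of the clustering step over the six access indicators; the clustering algorithm, data, and cluster count are illustrative placeholders, since the abstract does not specify them:

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# Hypothetical household-level scores for the six indicators (availability,
# transport time, quality, quantity, reliability, affordability), values 0-1
rng = np.random.default_rng(0)
X = rng.random((500, 6))

labels = KMeans(n_clusters=4, n_init=10, random_state=0) \
    .fit_predict(StandardScaler().fit_transform(X))
# Each cluster can then be mapped in GIS and assessed against the indicator matrix
```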

How to cite: Rodríguez del Rosario, M., Nsue-Mikue, S. M., Gómez-Escalonilla, V., Montero-González, E., Díaz-Alcaide, S., and Martínez-Santos, P.: Assessing urban water access in African cities: a GIS clustering approach in Malabo, Equatorial Guinea, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-19960, https://doi.org/10.5194/egusphere-egu26-19960, 2026.

EGU26-21064 | Posters on site | HS2.1.3

Ice-Regulated Water Quality Dynamics in Finnish Shallow Lakes: A Machine-Learning Reconstruction 

Shahin Nourinezhad, Nasim Fazel, Heini Postila, and Ali Torabi Haghighi

Water quality in ice-covered lakes is strongly affected by winter physical conditions, particularly in shallow systems where ice cover influences mixing, oxygen availability, light conditions, and biogeochemical processes. Changes in ice thickness and duration can therefore have substantial impacts on key water quality parameters, including dissolved oxygen and nutrient dynamics. However, long-term observations of both water quality and ice conditions are sparse and unevenly distributed across Finnish lakes, limiting comprehensive assessments. In this study, we apply a machine-learning approach based on the gradient boosting algorithm to model water quality and ice conditions on shallow lakes in Finland over the period 1965–2024. The model demonstrates strong predictive performance, evaluated using the root mean square error (RMSE), enabling the reconstruction of water quality dynamics under data-scarce conditions.
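As a schematic of the modeling step, a scikit-learn sketch of a gradient-boosting regressor evaluated with RMSE; the synthetic predictors stand in for variables such as ice thickness, ice duration, and season:

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error

# Synthetic stand-in for the lake/ice predictor table
X, y = make_regression(n_samples=600, n_features=6, noise=0.3, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

gbr = GradientBoostingRegressor(random_state=0).fit(X_tr, y_tr)
rmse = np.sqrt(mean_squared_error(y_te, gbr.predict(X_te)))  # evaluation metric used in the study
```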

How to cite: Nourinezhad, S., Fazel, N., Postila, H., and Torabi Haghighi, A.: Ice-Regulated Water Quality Dynamics in Finnish Shallow Lakes: A Machine-Learning Reconstruction, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-21064, https://doi.org/10.5194/egusphere-egu26-21064, 2026.

EGU26-21229 | ECS | Orals | HS2.1.3

Scenario-based 1D hydrodynamic modelling of glacial lake outburst floods in the Western Indian Himalaya 

Nikhil Mishra, Ashok K. Keshari, and Bhagu Ram Chahar

Glacial Lake Outburst Floods (GLOFs) are emerging as a significant hazard in high-mountain regions due to accelerated glacier retreat and lake expansion resulting from climate warming. The present study employs a one-dimensional hydrodynamic modelling framework to simulate GLOF propagation and downstream flood characteristics for the Gepan Gath Lake–Chandra Basin in the Western Indian Himalayas. The selected study area represents one of the most rapidly evolving and hazard-prone glacial lake settings in the region. Unsteady flow simulations are performed using the HEC-RAS hydraulic model to route scenario-based GLOF hydrographs along the downstream river corridor. Breach outflow hydrographs have been generated using plausible combinations of lake volume and dam failure mechanisms, and are applied as upstream boundary conditions. River geometry is represented through cross-sections extracted from the ALOS PALSAR digital elevation model and supporting geospatial datasets. The simulations capture the temporal and spatial evolution of discharge and water surface elevation along the river network under multiple GLOF scenarios. Results indicate rapid flood wave propagation in steep upstream reaches, followed by attenuation and lateral spreading in wider downstream valleys. Peak discharge, inundation depth, and flood arrival time exhibit strong spatial variability, primarily governed by valley morphology and hydraulic connectivity. The modelling outputs enable identification of critical downstream impact zones and provide first-order estimates of exposure to GLOF hazards. This study demonstrates that 1D hydrodynamic modelling using HEC-RAS, combined with remotely sensed terrain data, provides an efficient and robust approach for regional-scale GLOF hazard assessment, supporting the design of early warning systems and disaster risk reduction planning in data-scarce Himalayan environments.

How to cite: Mishra, N., Keshari, A. K., and Chahar, B. R.: Scenario-based 1D hydrodynamic modelling of glacial lake outburst floods in the Western Indian Himalaya, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-21229, https://doi.org/10.5194/egusphere-egu26-21229, 2026.

Ecosystem services (ESs) represent the essential ecological contributions that support human well-being and socioeconomic subsistence. This study employs multi-temporal remote sensing (RS) datasets from 1995 to 2022 to quantify the Ecosystem Service Value (ESV) of key ecosystem functions within a representative Tier-2 Indian city. Land Use/Land Cover (LULC) classification is performed using a Random Forest (RF) supervised machine learning algorithm to delineate ecosystem units, producing high-precision classification results with strong overall accuracy and optimized Kappa coefficients. Valuation is conducted using benefit transfer methods, with values expressed in million US dollars per year. The results indicate that, after vegetative cover, built-up areas, croplands, waterbodies, and barren land are the next major contributors to the total ESV. The key finding of the study is that Visakhapatnam, a Tier-2 city in India, is highly sensitive to LULC transitions, where rapid urbanization significantly alters the trajectory of provisioning, supporting, regulatory, and cultural ecosystem services. In addition, the study examines spatio-temporal relationships between ecosystem service trade-offs and synergies, demonstrating that high-resolution ESV mapping serves as a reliable diagnostic tool for assessing the impacts of human overexploitation and poor resource management. Overall, the study provides a robust quantitative framework for ecological valuation, offering a critical foundation for evidence-based policy interventions and sustainable urban planning in rapidly transforming urban environments.
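A minimal sketch of the benefit-transfer arithmetic, multiplying class areas by per-hectare value coefficients; the areas and coefficients below are placeholders, not the study's values:

```python
# Benefit transfer: total ESV = sum over LULC classes of (area * value coefficient)
# Areas (ha) and coefficients (USD/ha/yr) are illustrative placeholders only
areas_ha = {"vegetation": 52000, "built_up": 18000, "cropland": 26000,
            "waterbody": 7000, "barren": 4000}
value_usd_per_ha = {"vegetation": 310, "built_up": 0, "cropland": 92,
                    "waterbody": 8500, "barren": 0}

esv_million_usd = sum(areas_ha[k] * value_usd_per_ha[k] for k in areas_ha) / 1e6
print(f"Total ESV: {esv_million_usd:.1f} million USD/yr")
```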

How to cite: Agrahari, S., Swetha, D., and Pal, M.: Spatiotemporal Assessment of Ecosystem Services in a Tier-II Indian City: A Case Study of Visakhapatnam (1995–2022), EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-21599, https://doi.org/10.5194/egusphere-egu26-21599, 2026.

Spectral sensors have become an integral part of modern precision agriculture. They enable fast, non-destructive, and map-based monitoring of key crop physiological parameters. The spectral resolution of a sensor plays an important role in determining its ability to detect subtle changes in crop condition, nutrient status, and canopy development. Comparisons of sensors based on spectral resolution remain limited, particularly in the context of field-level agronomic monitoring. This study addresses this gap using three sensors: a UAV-based multispectral sensor (MS), a UAV-based hyperspectral sensor (HS), and handheld Greenseeker (GS) NDVI measurements. Hyperspectral sensors provide continuous high-resolution data from the visible to the near infrared; MS sensors use fewer, broader bands; and the GS is limited to two bands for quick NDVI field checks. The experimental study was conducted in the arid region of Uttar Pradesh, India. The experimental setup consisted of plots with the same irrigation (100% ETc) and varying nitrogen dosages, i.e., 150, 120, and 90 kg/ha (Plot 1, Plot 2, and Plot 3, respectively), with three replications. Plots 4 and 5, representing farmer-field conditions with 120 kg ha⁻¹ nitrogen and no nitrogen, respectively, followed regional irrigation practices, whereas Plot 6 (rainfed) was irrigated only once, initially. A series of UAV flights was conducted across critical phenological stages, and the reflectance was used to generate the Normalized Difference Vegetation Index (NDVI) representing canopy density.
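For reference, the index compared across the three sensors is computed per pixel as follows (a minimal NumPy sketch; the epsilon guard is an implementation convenience, not part of the definition):

```python
import numpy as np

def ndvi(nir, red):
    """NDVI = (NIR - Red) / (NIR + Red), computed per pixel."""
    nir, red = np.asarray(nir, float), np.asarray(red, float)
    return (nir - red) / (nir + red + 1e-12)   # small eps avoids division by zero
```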

The results showed that NDVI increased rapidly during the early vegetative stage (61–75 DAS), saturated around 75–85 DAS, and declined during 101–117 DAS. NDVI peaked around the flowering stage for all sensors. GS-NDVI varied between 0.46 and 0.78, MS-NDVI between 0.52 and 0.86, and HS-NDVI between 0.55 and 0.90. The mean NDVI values were 0.570 ± 0.085 for GS, 0.608 ± 0.075 for MS, and 0.664 ± 0.087 for HS, with HS exceeding the others by 16.5% (vs. GS) and 9.3% (vs. MS). Pearson correlation coefficients confirm strong inter-sensor agreement: Greenseeker-Hyperspectral r = 0.96, Multispectral-Hyperspectral r = 0.91, Greenseeker-Multispectral r = 0.87 (all p < 0.001), indicating consistent vegetation health trends despite differences in spectral resolution. Across days 61–117, the fully irrigated plots with varying nitrogen dosages (Plots 1–3) maintained higher vegetation indices (0.60–0.90) than the stressed plots: Plot 4 (0.57–0.84), Plot 5 (0.49–0.79), and Plot 6 (0.46–0.76), with the decline accelerating under water and nitrogen deficits. Water-stressed and nitrogen-deficient plots showed greater NDVI drops, indicating higher stress levels leading to early senescence and thus affecting grain yield.

Overall, the three sensors show strong agreement in NDVI trends. For precision agriculture, HS best resolved subtle changes, followed by MS; the statistical trends aligned with established NDVI comparison protocols based on correlation and regression. The hyperspectral sensor offered the highest diagnostic capability, the multispectral sensor provided spatial characterization, and the Greenseeker served as an efficient tool for rapid field monitoring. These combined observations emphasize the importance of selecting sensors based on the required level of detail, operational constraints, and monitoring objectives in precision agriculture. Integrating data from multiple sensor types can further enhance crop assessment accuracy and support more informed decision-making.

How to cite: Adwait, A., Upreti, H., and Singhal, G. D.: Evaluating Spectral Resolution Effects on Crop Monitoring: A Comparison of UAV-based Multispectral, Hyperspectral and handheld Greenseeker sensor, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-743, https://doi.org/10.5194/egusphere-egu26-743, 2026.

Canopy cover (CC) reflects canopy density, leaf area development, and early stress conditions, and thus acts as a significant indicator of crop health. Accurate CC estimation helps in mapping spatial variability in crops and facilitates early detection of disease or stress due to nutrient or water deficiency. For estimation of canopy cover, UAV multispectral data were acquired at different crop growth stages. This study estimated wheat canopy cover percentage from the tillering to the dough stage using a Random Forest classifier and MSAVI index thresholding for a more accurate and robust assessment of canopy dynamics. A supervised classification approach was used, based on training samples for three classes (soil, canopy, and shadow), with classification performed by the Random Forest (RF) algorithm. The extracted canopy pixels were then used to compute the canopy cover percentage. Additionally, a simplified approach was used based on MSAVI index thresholds to identify crop pixels, enabling reliable CC estimation through vegetation index segmentation. The experiment was conducted on a wheat crop using three ETc (crop evapotranspiration)-based irrigation treatments, i.e., 100%, 80%, and 60% ETc, each with three replications. In addition to the ETc-based treatments, farmer's and rainfed treatments were also considered. The rainfed treatments, with two replications, received a single life-saving irrigation, and the farmer treatments, with three replications, were irrigated based on local farmers' practice.
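A minimal NumPy sketch of the vegetation-index segmentation route, using the common MSAVI2 form of the index; the crop/soil threshold value is an illustrative assumption, not the study's calibrated cutoff:

```python
import numpy as np

def msavi(nir, red):
    """MSAVI2 = (2*NIR + 1 - sqrt((2*NIR + 1)^2 - 8*(NIR - Red))) / 2."""
    nir, red = np.asarray(nir, float), np.asarray(red, float)
    return (2 * nir + 1 - np.sqrt((2 * nir + 1) ** 2 - 8 * (nir - red))) / 2

def canopy_cover_pct(nir, red, threshold=0.4):
    """Share of pixels whose MSAVI exceeds a crop/soil threshold (value illustrative)."""
    crop = msavi(nir, red) > threshold
    return 100.0 * crop.mean()
```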

Canopy cover percentage observed across different growth stages (40 to 114 DAS) showed distinct variation in crop development among the irrigation treatments. In the 100%, 80%, and 60% ETc treatments, RF-based CC ranged from 35.3–98.5%, 36.1–97.9%, and 29.2–95.2%, while MSAVI-based CC ranged from 33.8–96.5%, 34.2–95.9%, and 28.1–94.5%, respectively. Compared with the ETc treatments, the farmer treatment exhibited lower canopy cover, with ranges of 28.6–95.6% (RF) and 28.9–92.8% (MSAVI). The rainfed treatment recorded the lowest CC values across the growing season, varying between 23.1–72.4% (RF) and 25.3–69.7% (MSAVI). Canopy cover estimates from the Random Forest algorithm and the MSAVI index showed consistent seasonal patterns, with RF generally producing slightly higher CC values. NDVI patterns were also examined at all stages to validate these findings; values ranged from 0.29–0.89, 0.26–0.88, and 0.24–0.85 in the 100%, 80%, and 60% ETc treatments, respectively. The rainfed (0.22–0.74) and farmer's (0.26–0.81) treatments had lower NDVI values, supporting the CC trends observed with the RF and MSAVI methods. The highest CC and NDVI values were obtained around the flowering stage (85–95 DAS) and the lowest at the tillering stage for all treatments, followed by a gradual decline after flowering as the crop progressed toward maturity. Canopy cover trends were comparable in the 100% and 80% ETc treatments, whereas CC in the 60% ETc treatment remained lower at all stages, indicating the impact of water deficit on canopy growth.

The study highlights that MSAVI-based vegetation-index methods can provide a reliable and highly efficient pathway for estimating canopy cover, reducing the need for extensive training datasets and complex classification models.

How to cite: Yadav, A., Upreti, H., and Singhal, G. D.: Assessment of UAV based Canopy Cover for Varying Irrigation Treatments using Random Forest Classifier and MSAVI Index Thresholding, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-773, https://doi.org/10.5194/egusphere-egu26-773, 2026.

EGU26-1114 | ECS | Orals | HS6.9

CropLizer: An Agro-Socio-Edapho-Climatological Tool for Rice Nutrient Management and Profitability Assessment 

Mukund Narayanan, Ankit Sharma, and Idhayachandhiran Ilampooranan

Smallholder farmers frequently rely on rules of thumb assuming that higher fertilizer inputs guarantee higher yields, owing to the absence of site-specific edapho-climatological data. This dependence on generalized rules creates a disconnect between site-specific requirements and field management practices, necessitating models of field dynamics that provide actionable advisories to farmers. To address this disconnect, this study developed 'CropLizer', a machine learning- and remote sensing-based tool (https://mukundn1997-croplizer.hf.space/) that functions as an integrated decision support system for rice cultivation. To develop CropLizer, this study synthesized a comprehensive dataset comprising over 45,000 rice field points (60% reserved for training and the rest for validation) integrating yields, irrigation, nutrient practices, social status (education and ethnic group), climatic variables (precipitation), soil quality variables (carbon, nitrogen, and bulk density), and market accessibility. Subsequently, seven models (linear, support vector, decision tree, random forest, neural network, LSTM, transformer) were trained and hyperparameter-tuned to predict yield and fertilizer requirements from 43 agro-edapho-socio-climatological variables using the 'sklearn', 'tensorflow', and 'optuna' libraries in Python on IIT Roorkee's supercomputer PARAMGANGA. After optimization, a web application was developed to allow users to simulate different scenarios by adjusting specific farming inputs to identify optimal management practices. The system then generates prescriptions for nitrogen, phosphorus, and potassium application rates based on the predicted yields. Moreover, users can find the potential yield for their field and the adjustments in field practices required to obtain that potential yield sustainably (without loss of soil carbon). Considering the practical difficulty of gathering meteorological records and soil data, an application programming interface was set up to retrieve these variables automatically from the field coordinates using the open-meteo and soilgrids datasets. Upon validation, the best-performing model (random forest) demonstrated satisfactory accuracy (65%). Beyond agronomic parameters, the tool assesses economic viability by integrating local market prices to estimate potential net profit margins and benefit-cost ratios under current and potential yields. This framework bridges the gap between scientific research and field application by providing reliable predictions for pre-season planning to mitigate financial risks.
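
A minimal sketch of the kind of 'sklearn' + 'optuna' tuning loop the abstract describes (the synthetic features, target, and search ranges are illustrative assumptions, not the study's settings):

    import numpy as np
    import optuna
    from sklearn.ensemble import RandomForestRegressor
    from sklearn.model_selection import cross_val_score

    rng = np.random.default_rng(42)
    X = rng.normal(size=(1000, 43))          # stand-in for the 43 predictor variables
    y = 2 * X[:, 0] + rng.normal(size=1000)  # synthetic yield target

    def objective(trial):
        model = RandomForestRegressor(
            n_estimators=trial.suggest_int("n_estimators", 100, 500),
            max_depth=trial.suggest_int("max_depth", 3, 20),
            random_state=0,
        )
        return cross_val_score(model, X, y, cv=3, scoring="r2").mean()

    study = optuna.create_study(direction="maximize")
    study.optimize(objective, n_trials=20)
    print(study.best_params)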

How to cite: Narayanan, M., Sharma, A., and Ilampooranan, I.: CropLizer: An Agro-Socio-Edapho-Climatological Tool for Rice Nutrient Management and Profitability Assessment, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-1114, https://doi.org/10.5194/egusphere-egu26-1114, 2026.

EGU26-2054 | ECS | Posters on site | HS6.9

Real-Time UAV-Deep Learning System for Citrus Orchard Structure and Yield Assessment 

Khaoula Bakas, Amine Saddik, Azzedine Dliou, Mohammed Hssaisoune, Said El Hachemy, Hamza Ait Ichou, Fatima Hmache, Mohammed El Hafyani, Adnane Labbaci, and Lhoussaine Bouchaou

Arid and semi-arid regions are facing more frequent and severe droughts, with annual rainfall often below 200 mm. Large-scale, intensive irrigation further strains these limited water resources. Under these conditions, growers need practical tools to estimate yield and monitor tree health at high spatial detail so they can better manage irrigation and inputs. This work develops and tests an automated, data-driven pipeline for estimating citrus yield at the individual-tree level using UAV imagery and Deep Learning. The pipeline comprises three main components. First, individual trees and orchard rows are segmented using a lightweight Tiny U‑Net model. Second, a CNN-based model predicts tree-level yield from vegetation indices and field measurements. Third, these predictions are validated through detailed fruit sampling.
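
The abstract does not specify the Tiny U-Net layout; a minimal two-stage U-Net sketch in Keras, with assumed layer widths and a five-band multispectral input, might look like:

    from tensorflow.keras import layers, Model

    def tiny_unet(input_shape=(128, 128, 5), n_classes=2):
        inputs = layers.Input(input_shape)
        # Encoder: two downsampling stages
        c1 = layers.Conv2D(16, 3, padding="same", activation="relu")(inputs)
        p1 = layers.MaxPooling2D()(c1)
        c2 = layers.Conv2D(32, 3, padding="same", activation="relu")(p1)
        p2 = layers.MaxPooling2D()(c2)
        # Bottleneck
        b = layers.Conv2D(64, 3, padding="same", activation="relu")(p2)
        # Decoder with skip connections
        u1 = layers.Concatenate()([layers.UpSampling2D()(b), c2])
        c3 = layers.Conv2D(32, 3, padding="same", activation="relu")(u1)
        u2 = layers.Concatenate()([layers.UpSampling2D()(c3), c1])
        c4 = layers.Conv2D(16, 3, padding="same", activation="relu")(u2)
        outputs = layers.Conv2D(n_classes, 1, activation="softmax")(c4)
        return Model(inputs, outputs)

    model = tiny_unet()
    model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")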

The study was conducted in a commercial citrus orchard in a semi-arid region under climate and water stress. High‑resolution UAV imagery was processed into orthomosaics and vegetation index maps, and the Tiny U‑Net was optimized for fast, near real‑time semantic segmentation, enabling precise tree crown delineation and accurate tree and row counts. For yield prediction, the CNN model exploited spatial features from vegetation indices combined with in‑situ data. The validation relied on direct comparison between UAV‑based yield estimates and yields obtained from field sampling and laboratory weighing. Both mean and median yields per tree were computed to capture tree‑level variability. The final dataset, consisting of 34 trees and approximately 340 fruit samples, provided a robust basis for assessing model performance. The Tiny U‑Net segmentation model reached high accuracy, with precision and recall of 94.74% and 94.88%, and an inference time of 12.55 ms per image tile. This shows the model is suitable for real‑time or on‑board use and can reliably map orchard structure at large scale. Tree and row counts derived from the segmentation achieved an R² greater than 0.99, confirming the robustness of the approach. For yield estimation, the CNN model outperformed other machine learning methods, achieving an R² of 0.88 at tree level. Field validation confirmed the practical usefulness of the pipeline: UAV‑predicted yields closely matched ground‑truth values, with both indicating an average yield of roughly 50 kg per tree. Most trees fell between 40 and 70 kg, and the model's output distribution (mean 50.9 kg, median 51.4 kg) aligned well with these field observations.

This robust agreement between model outputs and independent field validation data underscores the system's reliability and operational readiness for accurate, tree-level yield mapping. By integrating precise tree segmentation, high-resolution vegetation indices, and rigorously collected ground truth measurements, this study demonstrates that automated yield maps can be produced with sufficient accuracy to support operational decisions in orchards. This offers a cost-effective and scalable tool for precision agriculture, enabling optimized resource allocation, improved harvest planning, and adaptive management under increasing climate stress.

How to cite: Bakas, K., Saddik, A., Dliou, A., Hssaisoune, M., El Hachemy, S., Ait Ichou, H., Hmache, F., El Hafyani, M., Labbaci, A., and Bouchaou, L.: Real-Time UAV-Deep Learning System for Citrus Orchard Structure and Yield Assessment, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-2054, https://doi.org/10.5194/egusphere-egu26-2054, 2026.

Global climate change has led to more frequent and severe droughts in the middle and lower reaches of the Yangtze River, intensifying the spatiotemporal variability of crop yields in this region. Winter rapeseed, a major oilseed crop in China, is particularly vulnerable to these drought conditions, which now pose greater risks to local food security. Accurate and timely regional yield predictions are increasingly important for effective agricultural management and disaster response. However, predicting rapeseed yield at the city level is challenging due to complex climate patterns and the strengthened impact of drought. Addressing these challenges requires the integration of multi-source data, including both remote sensing and weather data, to capture the full range of environmental influences on crop growth. Traditional statistical and machine learning methods have often proven inadequate for robust, transferable yield prediction across different regions and years.

This study presents a deep learning–based yield prediction framework that integrates multi-temporal remote sensing indicators and meteorological variables to estimate winter rapeseed yield under both normal and drought conditions. Using data from 2014 to 2023 for the middle and lower reaches of the Yangtze River, an Attention–Long Short-Term Memory (Attention-LSTM) model was developed by jointly incorporating time-series remote sensing indices, meteorological factors, and statistical yield records. Key phenological periods for yield estimation were identified through multi-temporal and multi-variable combinations, and input configurations were systematically optimized. The proposed framework outperformed LSTM, Random Forest, and Support Vector Regression models, achieving an R² of 0.81 and RMSE of 306.73 kg/ha on the validation dataset. Spatiotemporal yield dynamics and regional applicability were further analyzed, and the model's robustness and adaptability were assessed under drought conditions. Under drought scenarios, the model maintained high accuracy, with an R² of 0.76 and RMSE of 358.32 kg/ha. These results indicate the framework's potential for drought-resilient yield prediction and its value for agricultural management and drought assessment under future climate change.
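
A minimal Keras sketch of an attention-augmented LSTM regressor of the kind described (sequence length, feature count, and layer sizes are assumptions, not the study's configuration):

    from tensorflow.keras import layers, Model

    n_steps, n_feats = 20, 8                        # time steps x input features
    inputs = layers.Input((n_steps, n_feats))
    h = layers.LSTM(64, return_sequences=True)(inputs)
    scores = layers.Dense(1, activation="tanh")(h)  # per-step attention scores
    weights = layers.Softmax(axis=1)(scores)        # normalize over time steps
    context = layers.Dot(axes=1)([weights, h])      # attention-weighted sum of states
    outputs = layers.Dense(1)(layers.Flatten()(context))  # predicted yield (kg/ha)
    model = Model(inputs, outputs)
    model.compile(optimizer="adam", loss="mse")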

How to cite: Liu, S., Dong, S., and Guan, Q.: Integrating deep learning and multi-source datasets for drought-resilient winter rapeseed yield prediction in the Yangtze River Basin, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-2352, https://doi.org/10.5194/egusphere-egu26-2352, 2026.

EGU26-2540 | ECS | Posters on site | HS6.9

Towards an Optimal Method for Assessing the Spatial and Temporal Hydrological Dynamics of a Keyline System 

Maurus Nathanael Villiger, Anna Leuteriz, Andrea Carminati, and Manfred Stähli

Climate change will heavily impact agriculture through alterations of precipitation dynamics, leading to more frequent agroecological droughts and more intense precipitation events. Strategies to adapt to these changes are necessary to maintain food security and sustain livelihoods in the agricultural sector. One method for farmers to mitigate the impacts of climate change is the Keyline design, which can be described as open ditches running parallel to elevation contours. These are designed to retain runoff, reduce erosion, and increase infiltration, which should lead to a higher amount of water available to plants during dry periods (e.g., Ponce-Rodríguez et al. 2021). However, scientific research and corresponding data regarding Keyline systems and their influence on field hydrological dynamics are sparse.

To quantify the hydrological impact of Keyline systems, a comprehensive field experiment has been set up combining Keyline systems with agroforestry on two agricultural fields, one located in the eastern Jura range and one outside Zurich. The goal of this study is to assess the optimal integration of tools to investigate how soil moisture patterns are altered by Keyline systems and to quantify the timing and amount of water retained. The work presented here shows the first results of a comparison between different soil moisture analysis methods applied to agricultural fields, including (a) soil hydrological modelling, (b) electrical resistivity tomography, (c) UAV-based L-band radiometry, (d) in-situ soil matric potential and volumetric water content measurements, and (e) destructive gravimetric water content measurements. Several of these methods are currently undergoing rapid development owing to the technological advances of recent years, leading to increased accessibility for a broader range of users (e.g., Du 2020; Zhou et al. 2025). This highlights the need to assess the tools regularly to showcase possible applications and directions for further development. The results presented here demonstrate the capabilities as well as the limitations of the individual methods and show how the different systems can be used in a complementary manner to obtain a complete assessment of soil hydrological dynamics. This will help researchers investigating soil moisture dynamics to make informed choices regarding their research tools for the assessment of nature-based solutions to adapt to climate change impacts within, but also beyond, agriculture.

Literature:

Du, C. (2020). Comparison of the performance of 22 models describing soil water retention curves from saturation to oven dryness. Vadose Zone Journal, 19(1), e20072. https://doi.org/10.1002/vzj2.20072.

Ponce-Rodríguez, M. D. C., Carrete-Carreón, F. O., Núñez-Fernández, G. A., Muñoz-Ramos, J. de J., & Pérez-López, M. E. (2021). Keyline in bean crop (Phaseolus vulgaris L.) for soil and water conservation. Sustainability, 13(17), 9982. https://doi.org/10.3390/su13179982.

Zhou, Y., Schwank, M., Boutin, J., Richaume, P., Mialon, A., Holmberg, M., Kaleschke, L., Zeiger, P., Leduc-Leballeur, M., ..., Kerr, Y. (2025, in review). Satellite Microwave Radiometry at L-band for Monitoring Earth's Essential Climate Variables. IEEE Geoscience and Remote Sensing Magazine.

How to cite: Villiger, M. N., Leuteriz, A., Carminati, A., and Stähli, M.: Towards an Optimal Method for Assessing the Spatial and Temporal Hydrological Dynamics of a Keyline System, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-2540, https://doi.org/10.5194/egusphere-egu26-2540, 2026.

Quantification of crop evapotranspiration (ET) and yield is essential for precision agricultural water management and food security, particularly over long temporal and large regional scales. In this study, we combined a water-carbon coupled model with a GPP-driven crop growth simulation method utilizing remote sensing datasets to simultaneously estimate ET and yield over the past two decades in the North China Plain (NCP). The developed model was tested for two major crops (winter wheat and summer maize) using approximately 20 site-years of observations. For wheat, the root mean square error (RMSE) values of ET and gross primary production (GPP) were 0.57 mm d⁻¹ and 1.65 gC m⁻² d⁻¹, and for maize 0.80 mm d⁻¹ and 2.92 gC m⁻² d⁻¹, respectively. In addition, the crop growth simulation agreed well with measurements: R² values were mostly larger than 0.66, and the RMSE of yield was 554.7 kg hm⁻² for wheat and 1346.6 kg hm⁻² for maize. The results revealed an increasing trend in the crop water productivity (WP = yield/ET) of wheat, while maize maintained an overall higher WP than wheat during 2001-2018. In addition, the impacts of climate change and human management on the spatiotemporal dynamics of ET-GPP fluxes over the agroecosystems were evaluated. A significant increase in GPP, rather than in ET, dominated the significant increase in water use efficiency (WUE = GPP/ET) in the NCP, accounting for 38.6% of its cropland area. The temporal dynamics of regional mean WUE indicated a significant increasing rate of 0.026 gC kg⁻¹ H2O per year during 2001-2018. The experimental simulations demonstrated that agricultural management dominated the interannual trend of WUE, with a relative contribution of 79.5%, considerably larger than that of atmospheric CO2 concentration (40.2%) and changes in climate variables (-19.7%). The effects of agricultural management on WUE were further disaggregated across the six classified cropping systems, with 82.4% attributable to the management of the winter wheat-summer maize rotation system. The remote sensing-based model developed in this study effectively quantifies regional ET and yield for two typical crops, providing critical information for smart agricultural water management. The analysis of agroecosystem WUE under changing environments underscores the dominant role of agricultural management and offers insights for climate adaptation in agriculture.
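
For readers unfamiliar with the two ratios, a small worked example of the bookkeeping (all numbers are hypothetical): WP = yield/ET and WUE = GPP/ET, with 1 mm of ET over 1 ha equal to 10 m³ of water.

    yield_kg_ha = 6000.0   # seasonal yield (kg/ha)
    et_mm = 420.0          # seasonal evapotranspiration (mm = kg H2O per m2)
    gpp_gc_m2 = 1200.0     # seasonal gross primary production (gC/m2)

    wp = yield_kg_ha / (et_mm * 10)  # kg of grain per m3 of water
    wue = gpp_gc_m2 / et_mm          # gC per kg of H2O
    print(f"WP = {wp:.2f} kg/m3, WUE = {wue:.2f} gC/kg H2O")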

How to cite: Wang, X., Lei, H., and Huo, Z.: Coupled estimation of crop evapotranspiration-yield and assessment of water use efficiency in the North China Plain through a remote sensing-based model, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-3713, https://doi.org/10.5194/egusphere-egu26-3713, 2026.

EGU26-4845 | Orals | HS6.9

High-resolution long-term mapping of major crops using satellite data 

Qiongyan Peng, Ruoque Shen, Yangyang Fu, Jie Dong, Baihong Pan, Yi Zheng, Xuebing Chen, Shaoping Li, Xiangqian Li, and Wenping Yuan

Winter wheat, maize, rice, and sugarcane are among the most important crops for global food security and bioenergy production. However, consistent high-resolution crop distribution maps across large regions and long time periods remain limited. In this study, we developed crop-specific identification algorithms that integrate spectral and phenological characteristics derived from satellite observations. Using these methods, we generated high-resolution (≤30 m) distribution maps for winter wheat, maize, rice, and sugarcane in China from 2001 to 2024. In addition, we produced sugarcane maps for Brazil (2016–2019), global winter cereal maps (2017–2022), and rice maps across Asia (1990–2023). Validation against independent samples shows that producer’s and user’s accuracies for winter wheat, maize, and rice in China reached 89.3% and 90.6%, 76.2% and 81.6%, and 88.4% and 89.1%, respectively. The global winter cereal maps achieved producer’s and user’s accuracies of 81.1% and 87.9%, while overall accuracies for sugarcane exceeded 91% in both China and Brazil. Estimated crop planting areas exhibit strong agreement with official statistics across regions. The resulting datasets provide consistent, long-term, and high-resolution crop distribution information, offering valuable support for crop monitoring, food security assessment, and climate and land-use change studies.
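
As a reminder of how producer's and user's accuracies follow from a confusion matrix (the counts below are hypothetical, not the study's validation samples):

    import numpy as np

    # Rows = reference class, columns = mapped class: [crop, non-crop]
    cm = np.array([[893, 107],
                   [ 93, 907]])
    producers = cm[0, 0] / cm[0, :].sum()  # 1 - omission error
    users = cm[0, 0] / cm[:, 0].sum()      # 1 - commission error
    print(f"producer's = {producers:.1%}, user's = {users:.1%}")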

How to cite: Peng, Q., Shen, R., Fu, Y., Dong, J., Pan, B., Zheng, Y., Chen, X., Li, S., Li, X., and Yuan, W.: High-resolution long-term mapping of major crops using satellite data, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-4845, https://doi.org/10.5194/egusphere-egu26-4845, 2026.

EGU26-6918 | ECS | Orals | HS6.9

Drought-induced early alterations in photosynthetic efficiency revealed by convergence of spectral and molecular evidence 

Kaihao Cheng, Congjia Chen, Kejing Fan, Hon-Ming Lam, and Jin Wu

Mounting climate volatility, characterized by increasingly frequent and severe extreme events, poses a critical threat to global food security. Traditional irrigation methods, which react only to visible drought symptoms, often fail to prevent irreversible physiological damage to crops. This underscores the need for precise, early detection of sub-lethal plant stress, a core challenge for precision agriculture. Effective early warning would enable proactive, smart irrigation, optimizing water use while protecting crop yields in a changing climate. Current drought assessment methods face significant trade-offs. Direct physiological measurements, though accurate, are destructive and impractical for field-scale use. Hyperspectral imaging (HSI) offers a non-destructive alternative by capturing detailed reflectance spectra. While it has identified signatures of advanced drought stress, a critical gap remains: reliably predicting the initial metabolic perturbations that precede visible decline, particularly the early drop in net photosynthetic assimilation (An), a sensitive indicator of plant metabolic function and stress tolerance.

Our research directly addresses this need. Through controlled drought experiments on the model plant Arabidopsis thaliana, we simultaneously collected high-resolution HSI data, transcriptome profiles, and ground-truth An measurements. A partial least squares regression model trained on spectral features accurately predicted An values two days in advance. Feature analysis identified wavelengths near 700 nm, within the red-edge to near-infrared transition, as optimal early predictors. Strikingly, transcriptome data revealed a concurrent increase in gene activity linked to red and far-red light response pathways in drought-stressed plants. This convergence of spectral and molecular evidence indicates that early drought-induced photosynthetic alterations, predictive of An decline, manifest in canopy reflectance at ~700 nm and are underpinned by specific light-responsive molecular changes. By integrating hyperspectral phenotyping with mechanistic transcriptomics, we bridge prediction and biological causality, transforming HSI from a correlative tool into a mechanistically grounded early-warning system. This approach enables proactive, physiologically informed water management, paving the way for more climate-resilient agriculture.
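
A minimal sketch of the partial least squares regression step (the spectra are synthetic and the component count is an assumption, not the study's setting):

    import numpy as np
    from sklearn.cross_decomposition import PLSRegression
    from sklearn.model_selection import train_test_split

    rng = np.random.default_rng(1)
    spectra = rng.normal(size=(120, 300))            # 120 plants x 300 bands
    an = 5 * spectra[:, 150] + rng.normal(size=120)  # stand-in for measured An

    X_tr, X_te, y_tr, y_te = train_test_split(spectra, an, random_state=0)
    pls = PLSRegression(n_components=10).fit(X_tr, y_tr)
    print("held-out R2:", round(pls.score(X_te, y_te), 3))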

How to cite: Cheng, K., Chen, C., Fan, K., Lam, H.-M., and Wu, J.: Drought-induced early alterations in photosynthetic efficiency revealed by convergence of spectral and molecular evidence, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-6918, https://doi.org/10.5194/egusphere-egu26-6918, 2026.

EGU26-12062 | Orals | HS6.9

A GIS-Based Framework for the Spatial Design and Discrete Optimization of Drip Irrigation Subunits 

Miguel Ángel Campo-Bescós, Iñigo Barberena, and Javier Casalí

The sustainability of global agricultural systems is increasingly dependent on the precision and efficiency of water distribution networks. In regions facing water scarcity, the design of irrigation subunits is a critical factor; however, the complexity of irregular field geometries often leads to designs based on manual approximations that ignore the full potential of hydraulic and economic optimization. This research introduces a sophisticated computational approach that integrates spatial network generation with advanced diameter optimization within a unified geographic information environment.

The core of this methodology lies in its ability to simultaneously address two fundamental aspects of irrigation engineering: the automated spatial layout of the pipe network and the discrete optimization of pipe diameters. By leveraging a high-precision hydraulic simulation engine, a genetic algorithm evaluates multiple potential configurations to identify the most cost-effective solution that satisfies pressure uniformity and flow requirements. This dual-integrated approach replaces traditional fragmented workflows, where layout design and hydraulic dimensioning are often performed in separate, disconnected steps.
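
To make the discrete-optimization idea concrete, a toy genetic algorithm over commercial pipe diameters is sketched below; the cost table, flow, and Hazen-Williams head-loss constraint are illustrative assumptions, not the framework's actual hydraulic simulation engine.

    import random

    diameters = [0.05, 0.063, 0.075, 0.09, 0.11]  # commercial sizes (m)
    cost_per_m = {0.05: 2.1, 0.063: 3.0, 0.075: 4.2, 0.09: 6.0, 0.11: 8.5}
    n_pipes, seg_len, q = 10, 50.0, 0.004         # segments, length (m), flow (m3/s)
    max_loss = 5.0                                # allowed total head loss (m)

    def head_loss(d):
        # Hazen-Williams friction loss per segment, roughness C = 150
        return 10.67 * seg_len * (q / 150) ** 1.852 / d ** 4.87

    def fitness(genome):  # total cost, heavily penalized if infeasible
        cost = sum(cost_per_m[d] * seg_len for d in genome)
        return cost + (1e4 if sum(map(head_loss, genome)) > max_loss else 0)

    random.seed(0)
    pop = [[random.choice(diameters) for _ in range(n_pipes)] for _ in range(40)]
    for _ in range(150):  # selection + mutation generations
        pop.sort(key=fitness)
        parents = pop[:10]
        pop = parents + [[random.choice(diameters) if random.random() < 0.1 else g
                          for g in random.choice(parents)] for _ in range(30)]
    print("best cost:", round(fitness(min(pop, key=fitness)), 1))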

The framework’s performance was validated through a practical application. This case study demonstrates how the system processes complex topographical data and irregular field boundaries to generate a complete infrastructure plan. The results indicate that the automated selection of commercial diameters, combined with an optimized spatial distribution of laterals and manifolds, leads to a significant reduction in total investment costs compared to conventional engineering methods.

By streamlining the transition from raw geospatial data to a fully optimized hydraulic network, this work provides a robust decision-support tool for precision agriculture. It offers a scalable and adaptable solution that enhances the efficiency of irrigation projects, supporting long-term water conservation goals and improving the economic viability of modern farming practices in the face of a changing climate.

How to cite: Campo-Bescós, M. Á., Barberena, I., and Casalí, J.: A GIS-Based Framework for the Spatial Design and Discrete Optimization of Drip Irrigation Subunits, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-12062, https://doi.org/10.5194/egusphere-egu26-12062, 2026.

EGU26-13372 | ECS | Orals | HS6.9

A hybrid physics-artificial intelligence approach for accurate prediction of reference evapotranspiration 

Jawad Zlaiga, Amine Rghioui, Said Elhachemy, Mustapha Elyaqouti, and Salwa Belaqziz

In a context of water scarcity, as is the case in Morocco, and particularly in semi-arid regions most exposed to water challenges such as the Souss Massa region, cultivation under cover exerts enormous pressure on water resources, making increasingly precise irrigation management essential. This work proposes a hybrid approach to improve the accuracy, generalization, and stability of reference evapotranspiration predictions by integrating physical laws into the neural network architecture, creating a model that respects both the observed data and the physical knowledge governing reference evapotranspiration. The proposed methodology rests, first, on the evaluation of three deep learning architectures with advanced attention mechanisms (Attention-based LSTM, Attention-based bidirectional LSTM, Attention-based CNN-LSTM) and, second, on the evaluation of the best architecture before and after integration of the physical component (Physics-Informed Neural Networks) using a convex combination incorporating the Priestley-Taylor physical model. The results show the superiority of the hybrid architectures over the others, with the Attention-based CNN-LSTM architecture already achieving strong performance (R² = 0.934).

However, the PINN architecture with a balance coefficient set at λ = 0.1 outperforms all other architectures, with lower error and better explanation of the data (R² = 0.945). This combination reduces the mean absolute error by 7.5% compared to the Attention-based CNN-LSTM model, while also ensuring more stable predictions against extreme values. Validation is carried out in a prototype connected greenhouse equipped with IoT sensors and a monitoring dashboard.
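
The abstract does not spell out where the convex combination enters; one plausible reading, sketched here with the Priestley-Taylor estimate blended into the prediction as ET0 = λ·ET0_PT + (1 − λ)·ET0_NN with λ = 0.1 (all inputs and constants below are illustrative):

    def priestley_taylor(rn, g, delta, gamma, alpha=1.26):
        """Priestley-Taylor ET (mm/day) from net radiation and soil heat flux
        (MJ m-2 d-1), slope of the saturation vapour curve and the
        psychrometric constant (kPa/degC)."""
        lam_v = 2.45  # latent heat of vaporization (MJ/kg)
        return alpha * (delta / (delta + gamma)) * (rn - g) / lam_v

    lam = 0.1                      # balance coefficient from the abstract
    et0_pt = priestley_taylor(rn=18.0, g=1.0, delta=0.16, gamma=0.066)
    et0_nn = 4.1                   # hypothetical network output (mm/day)
    et0 = lam * et0_pt + (1 - lam) * et0_nn
    print(f"hybrid ET0 = {et0:.2f} mm/day")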

This hybrid physics-learning approach offers a scalable and interpretable solution for intelligent irrigation management under semi-arid conditions.

How to cite: Zlaiga, J., Rghioui, A., Elhachemy, S., Elyaqouti, M., and Belaqziz, S.: A hybrid physics-artificial intelligence approach for accurate prediction of reference evapotranspiration, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-13372, https://doi.org/10.5194/egusphere-egu26-13372, 2026.

EGU26-16806 | ECS | Orals | HS6.9

Addressing deployability concerns for AI-supported UAV-based High Throuput Phenotyping 

Jules Salzinger, Lorenzo Beltrame, Lukas-Till Schawerda, and Phillipp Fanta-Jende

Reliable, field-scale indicators of crop water status and plant condition are needed to support plant breeding, precision irrigation, and climate adaptation, yet UAV-based monitoring must balance predictive accuracy with deployability. We present an explainable Deep Learning workflow (TriNet) for scalable UAV phenotyping from multispectral time series, aligned with agronomic and breeding practice through high-granularity in situ scoring in accordance with established standards (such as those of the AGES, the Österreichische Agentur für Gesundheit und Ernährungssicherheit). TriNet disentangles spatial, temporal, and spectral information and incorporates attention-based interpretability to identify influential inputs and guide efficient acquisition strategies. The framework supports handling of multispectral data acquired from comparatively high altitudes relative to the state of the art (e.g., 60 m with 2.5 cm Ground Sampling Distance (GSD)) and allows exploration of the trade-off between model performance and GSD. This supports a reduction of flight times and data volumes (e.g., 1.74 GB at 60 m vs. 5.96 GB at 20 m in our reference setup) while maintaining controlled predictive accuracy.
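
Because GSD scales linearly with flight altitude, the quoted figures can be reproduced with a simple relation; the camera constants below are assumptions chosen to match the 2.5 cm GSD at 60 m mentioned above:

    pixel_pitch_um = 3.75  # sensor pixel size in micrometres (assumed)
    focal_mm = 9.0         # lens focal length in millimetres (assumed)

    def gsd_cm(altitude_m):
        # Ground sampling distance grows linearly with altitude
        return pixel_pitch_um * 1e-6 * altitude_m / (focal_mm * 1e-3) * 100

    for h in (20, 60, 120):
        print(f"{h:4d} m -> GSD {gsd_cm(h):.2f} cm/pixel")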

We study the case of winter wheat breeding and extend this approach with new results for the traits of drought stress and plant health, together with a comprehensive analysis of flight height as an operational design variable, systematically simulating and evaluating acquisitions from 20 to 120 m. Results indicate that predictive accuracy is largely insensitive to flight height across this range, supporting higher-altitude, high-coverage monitoring until larger datasets provide a clear justification for lower-altitude, higher-resolution acquisitions. Finally, we translate these findings into practitioner-oriented operational insights for drone-based High-Throughput Phenotyping.

How to cite: Salzinger, J., Beltrame, L., Schawerda, L.-T., and Fanta-Jende, P.: Addressing deployability concerns for AI-supported UAV-based High Throughput Phenotyping, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-16806, https://doi.org/10.5194/egusphere-egu26-16806, 2026.

EGU26-20878 | Orals | HS6.9 | Highlight

More crop per drop: precision irrigation and water productivity from field-scale to global scale 

Florian Werner, Matteo Ziliani, Rick Chartrand, Laurel Hopkins, Tereza Pohankova, Vivien Stefan, Rim Sleimi, Joao Vinholi, Albert Abello, and Wim Bastiaanssen

Hydrosat leverages land surface temperature measured by thermal infrared (TIR) satellite technology to help growers save water and increase yields. Key agronomical parameters, e.g., soil moisture, crop development, and crop water demand, are monitored daily over arbitrarily large areas by solving the surface energy balance. Coupled to a soil water balance model based on meteorological data, remote sensing algorithms also estimate the amount of irrigation water applied by farmers and generate irrigation recommendations optimizing water productivity, i.e., maximizing crop yield while minimizing irrigation water consumption.

IrriWatch is Hydrosat's irrigation management decision support system, which allows growers to track the water demand and growth progress of their crops down to individual 10 m × 10 m pixels, daily, in near-real-time. With governments becoming more conscious of conserving their water reserves, applying high-resolution remote sensing algorithms over large irrigation districts, and potentially even whole nations, is becoming increasingly relevant. Compared to small proof-of-concept models, this requires careful balancing of complex steps, including automated field delineation and crop identification at scale early in the growing season, data fusion and sharpening to obtain high-fidelity daily TIR data at a spatial resolution compatible with detecting in-field variations, as well as energy and water balance modelling capable of handling diverse environmental conditions and soil or crop types without any local data or farm management information available. To effectively help governments preserve water while increasing farmers' crop yields, the immense amount of data generated by our models must be condensed into clear, actionable indicators that are intuitive to an audience not necessarily familiar with remote sensing concepts.

We will present an overview of an operational processing pipeline to support both field-level precision agriculture applications and large-scale water productivity monitoring and optimization. Leveraging daily high-resolution land surface temperature, both from Hydrosat's own satellite constellation and from a novel thermal sharpening algorithm, allows water productivity to be tracked over tens of thousands of square kilometers. We find that high spatio-temporal resolution is critical to accurately monitor crop development even at regional or seasonal scale, as insufficient resolution introduces substantial errors in actual evapotranspiration estimates. In addition, correcting for geomorphological factors, e.g., microclimate or the effect of elevation and slope on surface temperature, becomes increasingly important over large spatial scales.

Statistical analysis of field-scale results over large areas reveals spatial patterns of conditions responsible for yield losses or excessive water consumption. We will demonstrate how such insights support automatic identification of root causes for low water productivity, forming the basis for efficiently implementing data-driven mitigation actions.

How to cite: Werner, F., Ziliani, M., Chartrand, R., Hopkins, L., Pohankova, T., Stefan, V., Sleimi, R., Vinholi, J., Abello, A., and Bastiaanssen, W.: More crop per drop: precision irrigation and water productivity from field-scale to global scale, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-20878, https://doi.org/10.5194/egusphere-egu26-20878, 2026.

EGU26-21620 | ECS | Posters on site | HS6.9

Estimating root-zone soil moisture in Mediterranean vineyards using machine learning 

Judith Cid-Giménez, Maria José Escorihuela, Anaïs Barella-Ortiz, and Pere Quintana-Seguí

Root-zone soil moisture (RZSM) reflects the water accessible to plants and is therefore central to precision irrigation support and agricultural drought monitoring, yet direct RZSM observations are limited. Satellite missions provide surface soil moisture (SSM), but they do not directly observe deeper layers, and in-situ measurements remain too sparse for broad coverage. We present a machine learning approach to estimate daily RZSM in vineyards in the Terra Alta region of Catalonia in northeastern Spain, using daily 2020 to 2024 in-situ observations from eight stations as reference data. This model provides a baseline for later experiments using satellite SSM to extend applicability beyond the instrumented network.
We train a multilayer perceptron (MLP) to predict soil moisture at 25 cm, taken as RZSM, using in-situ SSM at 5 cm, daily precipitation, mean, minimum and maximum temperature, a cyclic encoding of day of year, and static soil descriptors from SoilGrids. Robustness is assessed with year-block cross-validation to evaluate temporal generalisation and leave-station-out experiments to evaluate transferability across vineyards. Performance is quantified using non-parametric Kling–Gupta efficiency (KGE) and RMSE.
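
For reference, the standard (parametric) Kling-Gupta efficiency is easy to compute; the study itself uses a non-parametric variant, and the series below are placeholders:

    import numpy as np

    def kge(sim, obs):
        r = np.corrcoef(sim, obs)[0, 1]  # linear correlation
        alpha = sim.std() / obs.std()    # variability ratio
        beta = sim.mean() / obs.mean()   # bias ratio
        return 1 - np.sqrt((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2)

    obs = np.array([0.21, 0.25, 0.30, 0.28, 0.22])  # observed RZSM (m3/m3)
    sim = np.array([0.20, 0.26, 0.29, 0.27, 0.24])  # predicted RZSM
    print(f"KGE = {kge(sim, obs):.2f}")
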
The model achieves strong skill when evaluated on independent years at training stations, with median KGE around 0.9. Transfer to unseen vineyards is more heterogeneous, with some stations retaining good performance (KGE around 0.85) and others showing biases and reduced efficiency, suggesting that additional information may be needed for consistent transfer across vineyards. Ongoing work aims to improve generalisation by incorporating antecedent moisture and precipitation information and by testing additional predictors such as vegetation, supported by feature importance analysis across the full set of inputs. To enable use beyond the instrumented network, we will transition the model towards configurations driven by or trained with satellite-derived SSM. Taken together, these steps are intended to move towards a transferable tool to support drought monitoring and irrigation-related decisions in agricultural regions.

How to cite: Cid-Giménez, J., Escorihuela, M. J., Barella-Ortiz, A., and Quintana-Seguí, P.: Estimating root-zone soil moisture in Mediterranean vineyards using machine learning, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-21620, https://doi.org/10.5194/egusphere-egu26-21620, 2026.

EGU26-22159 | ECS | Orals | HS6.9

Leveraging Digital and Innovative Technologies for Agricultural Water Management  

Flannery Johnson, Ambe Emmanuel Cheo, Federico Alberto Santillano Sanchez, Erick Tambo, Rachid Saidou Djibo, Amadou Rabani, and Ibrahim Boubacar

Increasing climate variability and water scarcity are intensifying the need for adaptive, data-driven water management solutions in African agriculture. While digital and precision technologies are increasingly available, their effectiveness depends on how diverse data streams are integrated into usable decision support tools that respond to local conditions. This research focuses on the development of an open-source digital agriculture platform designed to support sustainable water management. 

The developed platform brings together multiple sources of real-time and near-real-time data, including IoT-based soil moisture and climate sensors, weather forecasts, crop information, and remote sensing products, and systematically compares these observations with user-supplied crop water and irrigation models. By continuously comparing field-level data with these models, the decision support system aims to enable more accurate irrigation scheduling, early detection of water stress, and adaptive responses to climate variability, supporting more informed, timely, and context-specific decision-making for farmers. Artificial intelligence and machine learning components can be integrated to further enhance the platform by identifying patterns, improving forecasts, and refining model performance over time.

The presentation highlights the design and functionality of farmer-oriented decision support systems, outlining how open-source digital platforms can be tested, adapted, and refined in climate-vulnerable settings. By emphasizing interoperability, transparency, and community-driven innovation, the approach demonstrates how digital agriculture platforms can move beyond standalone technologies toward integrated decision support ecosystems for sustainable water management. 

How to cite: Johnson, F., Cheo, A. E., Santillano Sanchez, F. A., Tambo, E., Djibo, R. S., Rabani, A., and Boubacar, I.: Leveraging Digital and Innovative Technologies for Agricultural Water Management , EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-22159, https://doi.org/10.5194/egusphere-egu26-22159, 2026.

EGU26-778 | ECS | Posters on site | NH6.5

Python-based Automated Tool for Flood Susceptibility Modelling in Kerala, a part of Ecologically Sensitive Western Ghats, India 

Subhankar Naskar, Lokesh Tripathi, Pulakesh Das, and Sovana Mukherjee

Understanding spatial patterns of flood susceptibility is essential for targeted mitigation and resilient land-use planning, especially in ecologically sensitive zones. We present a comparative flood-susceptibility modelling framework that integrates a multi-criteria analytic hierarchy process (AHP) weighted overlay and a data-driven neural-network (NN) classifier. The classification models are trained on a binary flood inventory map (0 = no flood, 1 = flood) of Kerala, a coastal state in western India and part of the ecologically sensitive zone of the Western Ghats. The flood inventory was developed from microwave remote sensing data (Sentinel-1 SAR for 2018 and 2020) in Google Earth Engine (GEE) and validated against ground-based records of actual flood occurrence. The study compiles an extensive set of 18 conditioning factors spanning climate and hydrology (annual precipitation, drainage density, flow accumulation, stream power), topography and morphometry (elevation, slope, profile curvature, TPI, TRI), soil wetness and permeability (soil type, soil moisture, TWI, erodibility), vegetation dynamics (NDVI, SAVI), and anthropogenic influence (population density, built-up/impervious indices, distance to road, distance to river). Feature preprocessing included resampling, scaling, and inversion (where needed), followed by stratified random sampling of 10 million labeled pixels (train:test = 8:2). AHP pairwise comparisons produced λmax ≈ 5.2, CI ≈ 0.05, and CR ≈ 0.05, indicating acceptable consistency. Model outputs comprised hydrological, morphometric, permeability, LULC, and anthropogenic susceptibility maps, as well as composite flood-susceptibility zonation maps from both the AHP and NN workflows. Validation was performed using ROC-AUC and confusion-matrix analyses to assess predictive skill and class-level accuracy. Comparative analysis reveals that the NN approach improves predictive discrimination and spatial detail compared to the expert-driven AHP map, while AHP offers more interpretable insights into the factor weights. A Python-based application has been developed to automate flood-susceptibility mapping using dynamic precipitation and vegetation data, supporting long-term prediction and the development of mitigation measures. We discuss implications for operational flood risk mapping, targeted adaptation measures, and how combining knowledge-driven and data-driven methods can provide robust, actionable susceptibility maps for decision-makers.
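
The reported consistency figures follow directly from Saaty's definitions (assuming the λmax ≈ 5.2 comparison matrix was of order n = 5, which is consistent with the quoted CI and CR):

    n, lambda_max = 5, 5.2
    ri = 1.12                        # Saaty's random index for n = 5
    ci = (lambda_max - n) / (n - 1)  # consistency index
    cr = ci / ri                     # consistency ratio; < 0.10 is acceptable
    print(f"CI = {ci:.3f}, CR = {cr:.3f}")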

How to cite: Naskar, S., Tripathi, L., Das, P., and Mukherjee, S.: Python-based Automated Tool for Flood Susceptibility Modelling in Kerala, a part of Ecologically Sensitive Western Ghats, India, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-778, https://doi.org/10.5194/egusphere-egu26-778, 2026.

EGU26-1037 | Orals | NH6.5

Storyline attribution of flash drought-heatwave compound extreme to global warming 

Devvrat Yadav, Antonio Sanchez Benitez, Helge Goessling, Marylou Athanase, Ray Kettaren, Rohini Kumar, and Oldrich Rakovec

Flash droughts (FD) are characterised by rapid depletion of soil moisture. A heatwave (HW) is a period of abnormally hot weather, typically defined as lasting for three or more consecutive days. While HWs intensify through ongoing atmospheric heating, FDs result from a sudden drop in soil moisture brought on by increased evaporative demand and precipitation deficits. When combined, FD–HW compound occurrences can cause ecosystem disruption, hydrological stress, and significant agricultural losses. In Europe, FDs and HWs are becoming more dangerous due to changes in land-atmosphere coupling and amplified warming. However, because conventional free-running climate model simulations cannot replicate the observed dynamic conditions that drove actual events, assessing their evolution under future warming requires a different approach.

Here, we employ a storyline-based method that imposes counterfactual warming levels (Pre-Industrial (PI), Present-Day (PD), +2 K, and +3 K worlds) while reconstructing the synoptic conditions of recent European extremes (2018-2024) using spectrally nudged simulations of AWI-CM-1-1-MR, constrained toward the ERA5 circulation. This approach avoids the sampling constraints of historical analogues, maintains the physical structure of the observed FD–HW sequences, and produces dynamically consistent representations of warmer worlds. These climate forcings drive the mesoscale Hydrologic Model (mHM), from which soil moisture anomalies, spatial drought extent, and compound FD–HW features are quantified throughout Europe.

Our findings demonstrate intensification of FDs and HWs separately as well as when they occur simultaneously. FD events are expected to approximately double under the same warming, while heatwaves are expected to occur 5 times more frequently and to have an average magnitude more than 12 times greater in a 4 K world compared to pre-industrial levels. When FDs and HWs occur within three pentads of each other, such compound events are expected to become more than 7 times more common. This work offers a solid foundation for climate-risk assessment and drought preparedness throughout Europe.

How to cite: Yadav, D., Sanchez Benitez, A., Goessling, H., Athanase, M., Kettaren, R., Kumar, R., and Rakovec, O.: Storyline attribution of flash drought-heatwave compound extreme to global warming, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-1037, https://doi.org/10.5194/egusphere-egu26-1037, 2026.

EGU26-1185 | ECS | Orals | NH6.5

Geospatial Assessment of Dam-Induced Hydrological Prosperity and Eco-Hydrological Health in Hastinapur Wildlife Sanctuary, India 

Sonali Kundu, Narendra Kumar Rana, and Vishwambhar Nath Sharma

The impact of dams on the hydrological conditions and ecological functions of wetlands has not been extensively researched. Rivers and wetlands are crucial environmental components connected to both natural and human ecosystems, making it essential to study eco-hydrological planning and its implications for human well-being. This study examines the impact of the Bijnor barrage on hydrological prosperity and eco-hydrological alterations in the Hastinapur Wildlife Sanctuary (HWS) from 1983 to 2023. The research utilizes the Indicators of Hydrological Alteration (IHA) to assess eco-hydrological thresholds, failure rates, impact magnitudes, and eco-hydrological deficits and surpluses in the river section and adjacent wetlands. The findings reveal that the share of very high hydrological prosperity increased from 31.431% in 1983 to 43.703% in 2023, owing to the disappearance of major portions of the very low and low hydrological prosperity zones. However, the total area of wetlands decreased by 62.55% and 38.12% during the pre- and post-monsoon periods, respectively. This decline corresponds with a rising failure rate of ecological optima, leading to increased eco-hydrological deficits and indicating heightened ecological distress, which could adversely affect natural and human well-being. Hydrological prosperity maps demonstrate a significant reduction in water-rich areas, with zones of "very high" and "high" prosperity in 1983 being replaced by "moderate" to "very low" zones by 2023. This trend aligns with global observations of declining wetland hydrology due to anthropogenic influences. These changes underscore the critical need for hydrological prosperity-driven, ecosystem-based adaptation strategies to enhance wetland resilience and reverse negative trends. Future research should focus on quantifying the impacts of these strategies and developing tailored solutions to sustain hydrological prosperity in the HWS.

How to cite: Kundu, S., Rana, N. K., and Sharma, V. N.: Geospatial Assessment of Dam-Induced Hydrological Prosperity and Eco-Hydrological Health in Hastinapur Wildlife Sanctuary, India, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-1185, https://doi.org/10.5194/egusphere-egu26-1185, 2026.

EGU26-1186 | ECS | Orals | NH6.5

Agricultural Drought Hotspot Assessment in the Middle Ganga Plain,India, Using Multi-Parameter Approaches 

Barnali Kundu, Narendra Kumar Rana, and Vishwambhar Nath Sharma

Agricultural drought threatens food security and livelihoods in the Middle Ganga Plain (MGP), India. This study identifies agricultural drought hotspots using a multi-parameter approach, integrating a Drought Vulnerability Index (DVI) built from 16 parameters and a Drought Preparedness Index (DPI) built from 22 indicators. These indices were combined within a novel Vulnerability–Preparedness Framework to systematically delineate high-risk areas, and an Artificial Neural Network (ANN) model was employed to identify the hotspot zones. The results show that 17.46% of the region is a drought 'Hotspot', with a critical 6.57% classified as an 'Intense Hotspot' concentrated in the districts of Gazipur, Jaunpur, Mirzapur, and Varanasi in the south-western part of the study region. Analysis of the Standardized Precipitation Index (SPI) for these districts confirmed a history of recurring meteorological dry spells. Correlation analysis linked hotspot formation to high population density, a large agricultural labor force, and significant groundwater extraction. The model's robustness was validated, demonstrating high accuracy with an Area Under the Curve (AUC) of 0.889 and strong agreement between predicted and observed data on the Taylor diagram. This study advances SDG 2 (Zero Hunger), SDG 6 (Clean Water), and SDG 13 (Climate Action) by mapping agricultural drought risks to guide sustainable water use and build climate resilience. These findings provide crucial spatial intelligence for policymakers to develop targeted interventions and site-specific water management strategies that enhance agricultural resilience in the MGP.

How to cite: Kundu, B., Rana, N. K., and Sharma, V. N.: Agricultural Drought Hotspot Assessment in the Middle Ganga Plain, India, Using Multi-Parameter Approaches, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-1186, https://doi.org/10.5194/egusphere-egu26-1186, 2026.

EGU26-1293 | Orals | NH6.5 | Highlight

Observing hydrological extremes from Space 

Venkataraman Lakshmi

Land surface hydrology is a collection of complex processes. The spatial variability of both land surface properties (soil and vegetation) and meteorological inputs (precipitation and radiation) plays an important role in hydrology. Satellite remote sensing provides a broad spatial view and repeated temporal coverage of the land surface, supplying observations for use in hydrology such as soil moisture, surface temperature, and vegetation density. The variability of the water cycle causes extremes such as droughts and floods, and these have an impact on society. In addition, landslides, permafrost thaw, and wildfires are three other hydrological extremes that impact society. In the past two decades, with the advent of improved satellite sensors, modeling, and in-situ observations, quantification of the water cycle and its extremes has become possible. These satellite sensors include microwave observations for soil moisture and precipitation; visible/near-infrared for vegetation and evapotranspiration; gravity for groundwater/total water storage; and thermal observations for surface temperature.

How to cite: Lakshmi, V.: Observing hydrological extremes from Space, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-1293, https://doi.org/10.5194/egusphere-egu26-1293, 2026.

EGU26-2262 | ECS | Posters on site | NH6.5

Machine learning based prediction of long-term drought persistence over the Arabian Peninsula 

Fayma Mushtaq and Luai Muhammad Alhems

The Arabian Peninsula is among the most water-stressed regions globally, where limited precipitation, high evapotranspiration, and rapid socio-economic development exacerbate vulnerability to drought. Emerging evidence indicates a significant intensification of drought conditions in recent decades, driven by climate variability and long-term warming trends, posing serious challenges to water security, ecosystem stability, and socio-economic resilience. Therefore, understanding historical drought dynamics, together with reliable drought prediction, is essential for strengthening drought monitoring and mitigation strategies in arid environments and for reducing drought-related risks. However, accurate drought prediction at fine spatial resolution remains challenging due to the sparse distribution of meteorological stations. This study investigates the performance of the Standardized Precipitation Index (SPI) and the Standardized Precipitation Evapotranspiration Index (SPEI) at 3-, 6-, and 12-month timescales, using precipitation data from the Climate Hazards Group InfraRed Precipitation with Station data (CHIRPS) and potential evapotranspiration derived from the TerraClimate dataset, for pixel-level drought assessment over the period 1992-2024. Historical dynamics were studied using Mann-Kendall trend tests, Sen's slope, and hotspot analysis. Random Forest (RF) was employed to assess its applicability for drought prediction in arid environments using satellite data, owing to its widespread adoption in global drought-prediction studies. The analysis demonstrates that the RF model exhibits high predictive performance under the studied conditions, with robust performance for SPEI-6 (R² = 0.92, RMSE = 0.12, NSE = 0.92) and satisfactory results for SPEI-12 (R² = 0.77, RMSE = 0.22, NSE = 0.77). These findings confirm enhanced predictability of seasonal to long-term drought variability across the Arabian Peninsula using a satellite-driven RF framework. The results showed the dominance of antecedent SPEI variables (>90%), indicating that cumulative moisture deficits and rising atmospheric evaporative demand primarily govern seasonal to long-term drought evolution over the Arabian Peninsula. In contrast, the consistently low contribution of SPI-based indices (<3%) underscores the limited standalone role of precipitation variability in sustaining drought conditions in this arid region. Consistent with these predictive results, spatial trend analysis reveals pronounced heterogeneity in drought evolution across the Arabian Peninsula, with SPI exhibiting mixed and weak precipitation-driven signals, whereas SPEI shows widespread and statistically significant drying, particularly at 6- and 12-month timescales. This divergence further confirms that increasing evaporative demand and regional warming are the primary drivers of long-term drought intensification, reinforcing the dominant role of evapotranspiration processes identified by the machine-learning models. Therefore, the integration of satellite-derived pixel-level datasets with the RF model provides an effective framework for drought prediction across the Arabian Peninsula, offering valuable insights for water resource managers and policymakers to support the development of robust early warning systems and targeted mitigation strategies.
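
A minimal sketch of a gamma-based SPI computation of the kind underlying these indices (a real implementation fits each calendar month and accumulation period separately and handles zero-precipitation cases; the series here is synthetic):

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(3)
    precip = rng.gamma(2.0, 30.0, size=396)  # 33 years of monthly totals (mm)

    # Fit a gamma distribution, then map cumulative probabilities to z-scores
    shape, loc, scale = stats.gamma.fit(precip, floc=0)
    spi = stats.norm.ppf(stats.gamma.cdf(precip, shape, loc=loc, scale=scale))
    print("driest month SPI:", spi.min().round(2))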

How to cite: Mushtaq, F. and Alhems, L. M.: Machine learning based prediction of long-term drought persistence over the Arabian Peninsula, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-2262, https://doi.org/10.5194/egusphere-egu26-2262, 2026.

EGU26-2805 | Posters on site | NH6.5

A Comprehensive Assessment Framework for Drought Risk in Taiwan Using a Combined ANP-ANN Approach 

Yuei-An Liou, Trong-Hoang Vo, Duy-Phien Tran, Hai-An Bui, and Kim-Anh Nguyen

Drought is a natural hazard with serious impacts on the environment and human society, including the agricultural, industrial, and domestic sectors, especially in the era of climate change. For Taiwan, drought poses a particular challenge to the water-intensive semiconductor manufacturing industry. Comprehensive assessment is therefore necessary to identify key regions and sectors at high risk. This study combined the Analytic Network Process (ANP) and an Artificial Neural Network (ANN) in an ensemble learning method to evaluate and map drought risk in Taiwan. ANP constructs a network and assigns weights to indicators, while the ANN model uses these indicators to predict drought risk classes. Twenty indicators were selected, representing socio-economic and environmental factors categorized into hazard, exposure, and vulnerability components for risk assessment. The environmental conditions during the 2021 spring drought were selected to represent the drought hazard in Taiwan. The trained ANN model predicted drought risk effectively, with accuracy, precision, recall, F1 score, and Kappa index values of 0.940, 0.946, 0.938, 0.942, and 0.923, respectively. The final drought risk map was validated through fieldwork and independent statistical data, with overall accuracy values ranging from 0.717 to 0.851 when comparing drought risk classes with indicators related to damaged crops, converted damage areas, and estimated product losses. The prediction and validation results highlight the reliability of the framework for rapid and accurate risk assessment. The framework can be applied to different natural and socio-economic settings for effective drought management and to inform long-term adaptation strategies.
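
The reported metric suite maps directly onto standard scikit-learn calls (the class labels below are hypothetical, not the study's risk classes):

    from sklearn.metrics import (accuracy_score, cohen_kappa_score, f1_score,
                                 precision_score, recall_score)

    y_true = [2, 0, 1, 2, 1, 0, 2, 1, 0, 2]  # reference risk classes
    y_pred = [2, 0, 1, 2, 0, 0, 2, 1, 0, 1]  # predicted risk classes
    print("accuracy :", accuracy_score(y_true, y_pred))
    print("precision:", precision_score(y_true, y_pred, average="macro"))
    print("recall   :", recall_score(y_true, y_pred, average="macro"))
    print("F1       :", f1_score(y_true, y_pred, average="macro"))
    print("kappa    :", cohen_kappa_score(y_true, y_pred))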

How to cite: Liou, Y.-A., Vo, T.-H., Tran, D.-P., Bui, H.-A., and Nguyen, K.-A.: A Comprehensive Assessment Framework for Drought Risk in Taiwan Using a Combined ANP-ANN Approach, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-2805, https://doi.org/10.5194/egusphere-egu26-2805, 2026.

Environmental degradation driven by changing climatic patterns and land deterioration poses significant challenges to semi-arid regions by impacting water cycle dynamics, edaphic systems, and landscape resilience. In the Chambal basin, which is environmentally fragile and climatically unstable, studies integrating climatic variability, soil erosion, and land surface assessment are limited. To address this gap, this study assesses how climatic variability influences soil erosion dynamics and land surface stability in the Chambal basin. Modified Mann-Kendall trend tests were used to assess climate variability, RUSLE-based modelling was used to estimate soil erosion, and the Bare Soil Index was used to map bare soil exposure for 2001, 2012, and 2024. The findings revealed that Modified MK Z-values for rainfall, ranging from −0.83 to 3.94, illustrate heterogeneous rainfall variability, with pockets of both declining and increasing rainfall and erratic rainfall zones. While minimum temperature shows substantial variability (Z = 2.70–4.08), particularly in the southwest and northeast, maximum temperature indicates a considerably increasing but spatially consistent trend with low variability (Z = 0.33–0.75). Soil erosion estimates vary from 0 to 11.93 t ha⁻¹ yr⁻¹, with over 98% of the basin experiencing very low erosion (<5 t ha⁻¹ yr⁻¹) and only a few steep, riparian, and dissected areas showing slight to moderate erosion. The percentage of bare soil exposure decreased dramatically from 11.56% in 2001 to 9.53% in 2012 and then to 4.89% in 2024, showing improved land cover conditions. The results indicate that, despite the Chambal basin's increasing climatic stress, the terrain is still mostly stable, with localized erosion vulnerability. These insights are important for erosion-reduction planning, climate-responsive watershed management, and enhancing the basin's environmental resilience.
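
For reference, the Bare Soil Index in its common formulation, with placeholder reflectances:

    # BSI = ((SWIR + Red) - (NIR + Blue)) / ((SWIR + Red) + (NIR + Blue));
    # higher values indicate more exposed soil. Reflectances are placeholders.
    swir, red, nir, blue = 0.32, 0.21, 0.30, 0.08
    bsi = ((swir + red) - (nir + blue)) / ((swir + red) + (nir + blue))
    print(f"BSI = {bsi:.3f}")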

How to cite: Kumar, A.: Assessing Climate Variability and Landscape Vulnerability in the Chambal Basin, Central India, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-5359, https://doi.org/10.5194/egusphere-egu26-5359, 2026.

EGU26-6633 | Orals | NH6.5

Hydroclimatic rebound drives extreme fire in California's non-forested ecosystems   

Joe McNorton, Jessica Keune, Francesca Di Giuseppe, Marco Turco, and Alberto Moreno

The catastrophic Los Angeles Fires of January 2025 underscore the urgent need to understand the complex interplay between hydroclimatic variability and wildfire behaviour. This study investigates how sequential wet and dry periods, termed hydroclimatic rebound events, create compounding environmental conditions that culminate in extreme fire events. Our results show that a cascade of moisture anomalies, from the atmosphere to vegetation health, precedes these fires by around 6–27 months. This is followed by a drying cascade 6 months before ignition that results in anomalously high and dry fuel loads conducive to fire. These patterns are confirmed when analysing recent (2012–2025) extreme fire events in Mediterranean and Desert Californian biomes. We identify hydroclimatic rebound as a key mechanism driving extreme wildfire risk, whereby moisture accumulation fuels vegetation growth that later dries into highly flammable fuel. In contrast, extreme fires in the fuel-rich Forested Mountain regions are less influenced by the moistening cascade and more impacted by prolonged drought conditions, which typically persist up to 11 months prior to fire occurrence. These insights improve fuel-informed operational fire forecasts, as demonstrated for the January 2025 Los Angeles fires, particularly when year-specific fuel conditions are included. This underscores the value of incorporating long-memory variables to better anticipate extreme events in fuel-limited regions.

How to cite: McNorton, J., Keune, J., Di Giuseppe, F., Turco, M., and Moreno, A.: Hydroclimatic rebound drives extreme fire in California's non-forested ecosystems  , EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-6633, https://doi.org/10.5194/egusphere-egu26-6633, 2026.

EGU26-9442 | ECS | Posters on site | NH6.5

Compound Dry and Hot extremes over the Indian subcontinent 

Anjali Ashokan and Subhasis Mitra

Severe droughts that occur alongside high temperatures and depleted soil moisture lead to compound dry–hot extremes (CDHEs), with profound consequences for food security, water availability, human health, and economic stability. This study uses the Blended Dry and Hot Events Index (BDHI) to identify CDHEs and to evaluate their characteristics over historical and future periods across the different climatic regions of the Indian subcontinent. The BDHI is constructed from combinations of multiple standardized indices derived from precipitation, soil moisture, and air temperature data. A novel framework is employed to identify compound events and to examine their evolution and propagation concurrently across spatial and temporal scales. The framework identified events of varying severity over the Indian subcontinent, including the mega-events of 2002 and 2009, and revealed considerable increases in CDHEs during recent decades. Climate change analysis using CMIP6 model projections reveals that CDHEs are projected to increase considerably under a 3°C warmer world. The study improves understanding of how CDHE stresses may differentially affect regions across the Indian subcontinent, thereby supporting climate adaptation planning and risk management in climate-vulnerable areas.
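The BDHI definition is not reproduced in the abstract; as a simple stand-in, the sketch below flags compound dry-hot months with a joint threshold on two standardized indices. The thresholds and synthetic data are assumptions for illustration, not the authors' index.

```python
# Illustrative sketch only: flagging compound dry-hot months from two
# standardized anomaly series (a joint-threshold stand-in for the BDHI).
import numpy as np

rng = np.random.default_rng(0)
n = 480                        # 40 years of monthly values (synthetic)
spi = rng.standard_normal(n)   # standardized precipitation/soil-moisture index
sti = rng.standard_normal(n)   # standardized temperature index

# A month is compound dry-hot if it is both unusually dry (SPI < -1)
# and unusually hot (STI > +1).
cdhe = (spi < -1.0) & (sti > 1.0)
print("CDHE months:", cdhe.sum(), "of", n)
print("empirical frequency:", round(cdhe.mean(), 3))
```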

How to cite: Ashokan, A. and Mitra, S.: Compound Dry and Hot extremes over the Indian subcontinent, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-9442, https://doi.org/10.5194/egusphere-egu26-9442, 2026.

Under accelerating global climate change, the increasing frequency and intensity of extreme precipitation events (EPEs) pose severe threats to socioeconomic and ecological security, highlighting the critical importance of satellite precipitation products (SPPs) for EPE monitoring. However, comprehensive multi-scale, multi-characteristic evaluations of different SPP types during EPEs remain limited. This study systematically evaluated five SPPs from three categories—satellite-derived products (IMERG-Early, IMERG-Late, IMERG-Final), reanalysis products (ERA5-Land), and merged products (MSWEP-NRT)—during an EPE in Guangdong Province, China (August 16–21, 2024), across three temporal scales (3-hour, 12-hour, 24-hour) and four precipitation characteristics (amount, frequency, intensity, duration). All SPPs exhibit significant scale dependence and systematic biases in reproducing EPEs. The IMERG near-real-time products (Early/Late) provide the best overall multi-scale performance, demonstrating superior spatial fidelity and preservation of dynamic features such as intensity gradients and duration. In contrast, ERA5-Land and MSWEP-NRT suffer from excessive smoothing, while the bias-corrected IMERG-Final overly suppresses heavy rainfall intensity. A key limitation across all products is a severe underestimation of precipitation peaks. This study provides critical guidance for SPP selection in EPE monitoring and shows that future algorithmic improvements must focus on enhancing the identification and quantitative retrieval of convective precipitation to improve reliability.
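The multi-scale evaluation logic can be illustrated with a short sketch: a sub-daily product series is aggregated to the three temporal scales and compared to a reference in terms of amount bias and wet-interval detection. The data, thresholds, and diagnostics below are assumptions for demonstration, not the study's workflow.

```python
# Minimal sketch (assumed diagnostics, synthetic data): aggregating a
# 3-hourly satellite/reference comparison to 12-h and 24-h scales.
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)
idx = pd.date_range("2024-08-16", "2024-08-21 21:00", freq="3h")
ref = pd.Series(rng.gamma(0.5, 4.0, len(idx)), index=idx)       # reference (mm/3h)
sat = ref * np.clip(rng.normal(0.9, 0.2, len(idx)), 0, None)    # biased product

for scale in ["3h", "12h", "24h"]:
    r, s = ref.resample(scale).sum(), sat.resample(scale).sum()
    bias = (s.sum() - r.sum()) / r.sum()          # relative amount bias
    wet = r > 0.1                                 # wet intervals (0.1 mm cutoff)
    hit = ((s > 0.1) & wet).sum() / wet.sum()     # wet-interval detection rate
    print(scale, "rel. bias:", round(bias, 3), "wet-hit rate:", round(hit, 3))
```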

How to cite: Zhou, Z., Huang, W., Wu, H., Shen, Z., and Yu, L.: Capturing Precipitation Characteristics Across Multiple Temporal Scales: Evaluation of Satellite Precipitation Products During an Extreme Precipitation Event in Guangdong, China, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-11949, https://doi.org/10.5194/egusphere-egu26-11949, 2026.

EGU26-12735 | Posters on site | NH6.5

Central European droughts, heatwaves, and wildfires in the 21st century: compound events through the lens of media 

Lukas Dolak, Jan Rehor, Barbora Plackova, and Ladislava Reznickova

Droughts, heatwaves and wildfires represent an increasing risk for both human society and the environment. Although Southern Europe is considered one of the most vulnerable regions, ongoing climate change has also exacerbated the intensity, duration, and impacts of these extreme events in Central European countries. Therefore, we present a newly compiled database of droughts, heatwaves, and wildfires in the Central European region spanning the period 2000–2025. The database, primarily based on newspaper and online media reports, provides information about the occurrence and duration of more than 600 extreme events, the affected areas, their impacts, and societal responses. Based on these newly available data, a severity index was calculated, and the severity of individual events was assessed according to several key characteristics. Moreover, several cross-border events negatively affecting Central European countries were detected, and their joint impacts were described. Lastly, the database was used to identify compound drought-wildfire and drought-heatwave events. Despite the differences among individual countries (in terms of climate conditions, landscape, population, or GDP), similar impacts and societal responses to extreme events can be observed. Analysis of these compound events revealed several joint patterns (e.g., increased mortality rates, household water supply issues, rising food prices) as well as weaknesses at the international level (e.g., a lack of available firefighting equipment during intensive wildfire periods). The results support the urgent need to develop a monitoring and forecasting tool for drought, heatwave, and wildfire occurrence in the Central European region and to implement it in national forecasting services to mitigate the negative impacts of these extreme events.

This research is supported by the OP JAK funding under Grant No. CZ.02.01.01/00/22_008/0004635 “Advanced methods of greenhouse gases emission reduction and sequestration in agriculture and forest landscape for climate change mitigation (AdAgriF)”.

How to cite: Dolak, L., Rehor, J., Plackova, B., and Reznickova, L.: Central European droughts, heatwaves, and wildfires in the 21st century: compound events through the lens of media, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-12735, https://doi.org/10.5194/egusphere-egu26-12735, 2026.

Waterlogging risk in low-relief floodplain areas arises not only from extreme rainfall but also from large linear infrastructure, such as elevated railway lines and road embankments, which disrupts natural drainage pathways. Conventional flood mapping approaches often fail to capture these anthropogenic controls. This study presents an integrated framework combining machine learning techniques and Sentinel-1 synthetic aperture radar (SAR) data to map flood extent and identify infrastructure-induced waterlogging along a railway corridor in Keonjhar district, Odisha, eastern India. Time-series Sentinel-1 SAR data were analysed to extract inundation and surface moisture signatures using several flood indices. Infrastructure-induced topographic modification was quantified using two Digital Elevation Models (DEMs) representing different periods: the pre-infrastructure SRTM DEM and a recent high-resolution DEM generated from drone-based orthophotos. Flow accumulation and watershed boundaries were independently derived from both DEMs to evaluate changes in drainage pathways caused by the railway embankment. After watershed delineation from the two DEMs, runoff coefficients were estimated, allowing a comparative assessment of pre- and post-infrastructure hydrological response. These terrain- and watershed-based variables, together with station-based rainfall data and SAR backscatter features, were used as inputs to a Random Forest model to classify flooded, waterlogged, and non-inundated areas, with particular emphasis on zones adjacent to the railway alignment and cross-drainage structures. The results reveal that persistent inundation patterns are largely a consequence of natural flow obstruction by the railway embankment and inadequate cross-drainage connectivity. The proposed methodology helps identify infrastructure-driven flood augmentation and supports informed planning of drainage-railway crossings, flood mitigation strategies, and climate-resilient transport infrastructure in vulnerable regions.
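The classification step can be sketched as follows: SAR and terrain features are stacked and fed to a three-class Random Forest. All feature values and the rule used to fabricate training labels below are assumptions for demonstration, not the study's data or implementation.

```python
# Illustrative sketch only: Random Forest classifying pixels as
# non-inundated / waterlogged / flooded from SAR and terrain features.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report

rng = np.random.default_rng(7)
n = 3000
vv = rng.normal(-14, 3, n)        # VV backscatter (dB); open water darkens VV
flowacc = rng.gamma(2.0, 50, n)   # flow accumulation (upstream cells)
runoff_c = rng.uniform(0, 1, n)   # runoff coefficient
X = np.column_stack([vv, flowacc, runoff_c])

# Synthetic labels: very low backscatter -> flooded; high accumulation ->
# waterlogged; otherwise non-inundated (a fabricated rule for the demo).
y = np.where(vv < -17, 2, np.where(flowacc > 150, 1, 0))

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)
clf = RandomForestClassifier(n_estimators=300, random_state=0).fit(X_tr, y_tr)
print(classification_report(
    y_te, clf.predict(X_te),
    target_names=["non-inundated", "waterlogged", "flooded"]))
```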

How to cite: Mondal, D.: Integrating machine learning and SAR-derived flood indices to assess the railway-induced waterlogging extent , EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-13317, https://doi.org/10.5194/egusphere-egu26-13317, 2026.

EGU26-13539 | ECS | Posters on site | NH6.5

Mapping Future Fire Danger Against Brazil's Landscape Resilience 

Andre Simões Ballarin, Caio Simões Ballarin, José Gescilam S. M. Uchôa, Abderraman Brandão, Eduardo M. Mendiondo, Jamil A. A. Anache, Masoud Zaerpour, Shadi Hatami, Mijael R. Vargas Godoy, Edson Wendland, Paulo Tarso S. Oliveira, and Fabio de Oliveira Roque

Fire plays a central role in shaping ecosystem dynamics, biodiversity conservation, and the provision of ecosystem services; however, its role varies markedly among ecosystems. This is particularly critical in Brazil, a country that hosts globally important biomes and underpins vital functions such as climate regulation and the water–energy–food nexus. Recent observational studies indicate that Brazil is already undergoing shifts in the occurrence of extreme heat and drought events, and climate model simulations suggest that these trends will intensify in the future. However, the implications of these shifts for future fire risk patterns remain insufficiently explored, especially within an integrated risk framework that assesses how climate-driven hazard interacts with the heterogeneous resilience of ecosystems across the country.

Here, we ask how likely Brazilian ecosystems are to experience extreme fire danger conditions under future climates, and map how this hazard relates to both historical and projected patterns of landscape resilience. To this end, we perform a nationwide assessment of future fire danger using the Canadian Fire Weather Index (FWI) derived from daily CMIP6-based climate projections retrieved from the CLIMBra dataset, which was developed specifically for Brazil's climate conditions using an observational-based dataset. Employing a novel heatwave-based framework, we identify extreme fire danger events and characterize future changes in their intensity, duration, frequency, and spatial extent. Beyond this climate-based assessment, we contrast these changes from an ecosystem resilience perspective by integrating future fire danger projections with projections of landscape resilience. A Random Forest model, trained on the relationship between land cover and a map of landscape resilience classes, is applied to multiple future land-cover scenarios to estimate concurrent changes in both climate-driven fire danger and landscape resilience. This integrated approach allows us to pinpoint areas where high future fire danger overlaps with low landscape resilience.
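A heatwave-style event framework of the kind described here can be sketched in a few lines: days above a percentile threshold of FWI are grouped into runs, and run statistics give event intensity, duration, and frequency. The threshold, minimum duration, and data below are assumptions, not the authors' settings.

```python
# Minimal sketch (synthetic data): identifying extreme fire danger "events"
# as runs of consecutive days above a high FWI percentile.
import numpy as np

rng = np.random.default_rng(3)
fwi = rng.gamma(4, 5, 365 * 10)          # synthetic daily FWI series
thresh = np.percentile(fwi, 95)          # extreme-danger threshold (assumed)
hot = fwi > thresh

# Find runs of consecutive extreme days and keep events of >= 3 days.
edges = np.diff(np.concatenate(([0], hot.astype(int), [0])))
starts, ends = np.where(edges == 1)[0], np.where(edges == -1)[0]
durations = ends - starts
events = durations[durations >= 3]
print("events:", events.size,
      "mean duration:", round(events.mean(), 2) if events.size else 0.0,
      "max intensity:", round(fwi[hot].max(), 1))
```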

Our results project up to approximately 30 additional compound hot-dry days per year by the end of the century across the country. These changes are expected to create a more challenging scenario for fire management, with a widespread increase in extreme fire danger across Brazil. For instance, the spatial extent and number of extreme fire danger days are projected to rise by approximately 69% and 42% on average, respectively, under intermediate-emission scenarios in the first half of the century. This integrated mapping enables us to reveal where projections of intensifying fire weather converge with those of future low landscape resilience, thereby highlighting priority regions and protected areas for targeted action. We believe that our framework will enable the integrated assessment of future fire danger and ecosystem vulnerability. These findings can guide national landscape and territorial policies by helping to prioritize actions in regions facing significant novel fire threats (transformative risk) or intensifying fire regimes (adaptive risk). They underscore the need for proactive fire management and conservation/restoration strategies that explicitly account for both climatic intensification and landscape resilience. Despite inherent uncertainties in climate and land-cover projections, our study provides a critical foundation for supporting more effective environmental planning and decision-making under a changing climate.

How to cite: Simões Ballarin, A., Simões Ballarin, C., S. M. Uchôa, J. G., Brandão, A., M. Mendiondo, E., A. A. Anache, J., Zaerpour, M., Hatami, S., R. Vargas Godoy, M., Wendland, E., S. Oliveira, P. T., and de Oliveira Roque, F.: Mapping Future Fire Danger Against Brazil's Landscape Resilience, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-13539, https://doi.org/10.5194/egusphere-egu26-13539, 2026.

EGU26-15235 | ECS | Posters on site | NH6.5

Escalating dry-hot compound fires threaten Eurasian drylands 

Huiqian Yu

Compound climate extreme events have inflicted enormous damage because they amplify impacts on societies and ecosystems. However, quantifying their interactions and influence remains challenging, particularly given the vulnerability of drylands. We quantified spatial and temporal pattern changes and the climate drivers of fire during 2001–2020 and investigated the interaction between dry-hot conditions and fire events. The results show that fires mostly occurred in spring and autumn in three typical hotspots: the south of Eastern Europe and Central Asia, northeastern East Asia, and the Indian Peninsula. Fires in croplands accounted for 70.5% of all fire events in Eurasian drylands, with a limited average size of 2.01±0.22 km². The most extensive fires were observed in grasslands, forests, shrublands, and woody savannas. The average burned area per fire decreased by 0.30 km²/yr across Eurasian drylands during 2001–2020, whereas the burned area of dry-hot compound fires increased by 0.78 km²/yr. Early-stage dry-hot conditions increase the frequency and intensity of fire, mainly by affecting fuel flammability and abundance. Our findings highlight the importance of understanding interrelated co-occurring climate extremes and the need for further monitoring and action to reduce their threat.

How to cite: Yu, H.: Escalating dry-hot compound fires threaten Eurasian drylands, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-15235, https://doi.org/10.5194/egusphere-egu26-15235, 2026.

High-frequency climatic extremes in rapidly urbanizing areas are becoming prominent, often manifested through enhanced thermal stress, changed moisture conditions, and strong diurnal asymmetries, yet their spatio-temporal changes remain insufficiently quantified. This study addresses that gap in the National Capital Region of India by combining long-term satellite-derived land surface temperature (MODIS, 2003–2021) with high-resolution in-situ measurements of air temperature, humidity, and wind (IMD AWS). A spatio-temporal analytics framework, based on physical diagnostics, time-series mining, and interpretable pattern learning, is used to characterize surface and atmospheric urban heat islands (UHI), urban dry islands (UDI), and emergent thermal hotspots along urban-peri-urban-rural gradients.

Results indicate an increase in surface thermal extremes, with daytime SUHI warming rates of approximately 0.19 °C/yr in urban cores and as high as 0.23 °C/yr in inner-urban regions. The increase in night-time surface temperature was more pronounced, especially in inner-city areas (~0.15 °C/yr), suggesting rising nocturnal heat stress. Atmospheric UHI peaks reached 2.0–2.3 °C, particularly during winter mornings and pre-monsoon nights. Space-time cube hotspot analysis reveals that the persistent hotspots of 2003–2011 have become more intense and extensive after 2011, with evident outward movement into peri-urban areas. At the same time, thermal and moisture extremes were strongly coupled during dry seasons, with urban dry island humidity deficits of −13 to −15 g/m³. Overall, the results show a systematic intensification and spatial expansion of coupled heat and dry island extremes, implying that rapidly urbanizing areas are shifting toward a more volatile and persistent urban thermal stress regime.
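Warming rates of the kind quoted above are typically obtained as linear trends of an annual urban-minus-rural LST difference; the sketch below shows that calculation on synthetic data (an assumed analysis, not the authors' code).

```python
# Minimal sketch (synthetic data): SUHI warming rate as the linear trend
# of annual urban-minus-rural land surface temperature.
import numpy as np

rng = np.random.default_rng(5)
years = np.arange(2003, 2022)
lst_urban = 40 + 0.19 * (years - 2003) + rng.normal(0, 0.3, years.size)  # °C
lst_rural = 38 + 0.02 * (years - 2003) + rng.normal(0, 0.3, years.size)  # °C

suhi = lst_urban - lst_rural                   # annual SUHI intensity
slope, intercept = np.polyfit(years, suhi, 1)  # trend in °C per year
print(f"SUHI trend: {slope:.3f} °C/yr")
```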

How to cite: Pramanik, S.: Urban Expansion Reshapes Surface and Atmospheric Heat Islands and Moisture Regimes in NCR-Delhi, India: Evidence from In-Situ and Satellite Observations, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-17125, https://doi.org/10.5194/egusphere-egu26-17125, 2026.

EGU26-18835 | Orals | NH6.5

Global Characteristics of Heavy Rainfall from Harmonized Geostationary Satellite Observations 

Yeji Choi, Hyun Gon Ryu, Seongryeong Choi, Jiu Park, Mahima Rao, and Kwang-min Myung

Heavy rainfall is one of the most impactful hydrometeorological extremes, frequently causing floods, landslides, and severe socioeconomic damage worldwide. Continuous, high-temporal-resolution monitoring of heavy rainfall is essential for disaster risk reduction and early warning. Recent advances in satellite remote sensing and artificial intelligence (AI) have opened new possibilities for global-scale observation and analysis of extreme precipitation by integrating multi-platform satellite data within a unified framework. In this study, we develop a harmonized global geostationary satellite dataset by integrating observations from multiple operational platforms, including the GEO-KOMPSAT-2A (GK2A), Meteosat Second Generation (MSG), and the Geostationary Operational Environmental Satellite (GOES). To address differences in temporal sampling and radiometric characteristics among these satellites, we apply a deep learning–based video frame interpolation (VFI) technique. This approach enables temporally consistent interpolation across overlapping satellite domains and facilitates the construction of seamless global cloud maps with high temporal continuity. Heavy rainfall characteristics are analyzed by linking the harmonized geostationary cloud-top observations with satellite-derived precipitation estimates produced using AI-based retrieval algorithms. These AI-driven precipitation products are designed to capture nonlinear relationships between cloud properties and surface rainfall, providing enhanced sensitivity to intense precipitation events. To assess their robustness and physical consistency, the AI-based precipitation estimates are systematically compared with conventional satellite precipitation products derived from traditional physically based or empirically calibrated retrieval methods. This comparison allows us to evaluate the added value of AI-based precipitation retrievals in representing heavy rainfall intensity and occurrence at the global scale. The analysis focuses on identifying global and regional characteristics of heavy rainfall in relation to cloud-top temperature, emphasizing climatic contrasts across tropical, subtropical, and midlatitude regimes, as well as land–ocean differences. This study demonstrates that the synergy between harmonized multi-geostationary satellite observations and AI-based precipitation retrievals provides a powerful framework for global heavy rainfall analysis. The physically interpretable relationships identified between cloud-top signals and heavy rainfall establish a solid observational basis for future AI-driven or hybrid early warning systems. By combining continuous geostationary monitoring with advanced AI methodologies, this work contributes to improved global assessment of heavy rainfall risk and supports the development of more reliable hydrometeorological early warning capabilities.

How to cite: Choi, Y., Ryu, H. G., Choi, S., Park, J., Rao, M., and Myung, K.: Global Characteristics of Heavy Rainfall from Harmonized Geostationary Satellite Observations, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-18835, https://doi.org/10.5194/egusphere-egu26-18835, 2026.

EGU26-21663 | Posters on site | NH6.5

Assessing Stationarity of Drought Records in the Amazon Basin using SPI and Record Theory 

Isabel Vale and Wilson Fernandes

Unlike many natural hazards whose impacts are largely localized (e.g., volcanic eruptions), droughts can generate far-reaching spillover effects that extend well beyond their region of occurrence, producing  socio-environmental consequences at continental and even global scales. Moreover, severe seasonal droughts may occur even in regions typically characterized by high levels of humidity, challenging conventional perceptions of hydroclimatic vulnerability. In particular, droughts affecting the Amazon Basin – the world’s largest watershed, characterized by high water availability and exceptional biodiversity – pose significant risks to the global climate system. Given the basin’s central role in regulating the global hydrological cycle, drought events may propagate beyond local riverine livelihoods, disrupting large-scale hydroclimatic processes and ecosystem functioning.

This study assesses whether drought records in the Amazon exhibit stationary behavior by combining the Standardized Precipitation Index (SPI), a widely used multi-timescale indicator of meteorological drought, with record-based stationarity tests designed to detect non-stationarity specifically in distribution tails. Monthly precipitation series from 272 rain gauge stations, each with at least 30 years of data, were transformed into SPI at a 6-month timescale. The analysis focuses on October SPI values, which integrate precipitation anomalies accumulated over the preceding dry season, allowing a consistent seasonal basis for comparison across the basin.

Stationarity is tested under the i.i.d. record hypothesis (record probability pₜ = 1/t for the t-th observation) using non-parametric statistics proposed by Cebrián, Castillo-Mateo, and Asín (2022) from the RecordTest package, including the record-count N-test and a weighted variant with linear weights, the likelihood-ratio (LR) test, and the Foster–Stuart test, all applied to lower records representing drought extremes. Statistical significance is assessed using Monte Carlo resampling with 10,000 simulations.

The application of record-based stationarity tests indicates that drought records are predominantly stationary across the Amazon Basin. Out of the 272 analyzed stations, approximately 82% show no statistically significant departures from the i.i.d. record hypothesis in any of the applied tests. Strong and consistent evidence of non-stationarity is rare, with fewer than 3% of the stations showing simultaneous rejection across all tests. Spatially, the stations identified as non-stationary are broadly dispersed across the domain, indicating the absence of coherent regional clustering or directional gradients. These results support the hypothesis that, for the SPI-6 October series representing dry-season accumulation, the statistical behavior of drought extremes remains largely stationary at the basin scale, despite recent severe drought events reported in the literature. Overall, the proposed framework is distribution-free, tail-oriented, and computationally scalable, offering a robust methodological basis for monitoring changes in drought extremes and supporting early-warning systems and long-term water resources management in a changing Amazonian climate.
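The logic of a record-count test with Monte Carlo significance can be sketched as follows; this mirrors the general idea (lower records under the i.i.d. hypothesis), not the RecordTest (R) implementation, and the series is synthetic.

```python
# Illustrative sketch: counting lower records in an SPI-6 series and
# assessing significance against a permutation-based i.i.d. null.
import numpy as np

def count_lower_records(x):
    """Number of lower records (the first value counts as a record)."""
    running_min = np.minimum.accumulate(x)
    is_record = np.concatenate(([True], x[1:] < running_min[:-1]))
    return is_record.sum()

rng = np.random.default_rng(11)
spi_october = rng.standard_normal(40)          # synthetic 40-year series
observed = count_lower_records(spi_october)

# Under i.i.d., record counts depend only on rank order, so permutations
# generate the null distribution (10,000 resamples, as in the abstract).
null = np.array([count_lower_records(rng.permutation(spi_october))
                 for _ in range(10_000)])
p_value = (null >= observed).mean()            # one-sided: too many dry records
print("lower records:", observed, "p-value:", round(p_value, 3))
```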

How to cite: Vale, I. and Fernandes, W.: Assessing Stationarity of Drought Records in the Amazon Basin using SPI and Record Theory, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-21663, https://doi.org/10.5194/egusphere-egu26-21663, 2026.

EGU26-22368 | Orals | NH6.5

Understanding Compound Climate Hazards and Exposure from a Spatial Perspective: A Case Study in the Dosso Region, Niger 

Tatiana Gonzalez Grandon, Sari Rombach, Emmanuel Cheo, and Rainer Bell

Compound climate hazards, where extreme events co-occur, pose increasing risks to socio-ecological systems, yet their spatial dynamics remain poorly understood. We introduce a novel metric to quantify simultaneous drought and heatwave exposure, applying it to Niger's Dosso region over a 24-year period (2000–2023) using remote sensing and GIS-based techniques. Our analysis reveals distinct spatiotemporal patterns: southern and northern municipalities emerge as heatwave hotspots, while drought frequency shifts from southern dominance during peak rainy seasons to central and northern prevalence throughout the rainy season, with most droughts classified as mild. The metric identifies critical years of pronounced compound hazard occurrence—2000, 2002, 2009, 2011, 2015, and 2021—in northern and central-eastern municipalities. By integrating multi-hazard dynamics, this approach enhances understanding of localised compound climate hazard exposure and lays the groundwork for targeted adaptation strategies in climate-vulnerable regions.
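The abstract does not give the metric's formula; one plausible form, shown below purely for illustration, is the fraction of rainy-season days on which drought and heatwave flags co-occur. All thresholds and data are assumptions.

```python
# Illustrative sketch (assumed metric form, synthetic data): fraction of
# rainy-season days with simultaneous drought and heatwave conditions.
import numpy as np

rng = np.random.default_rng(8)
days = 150                                  # rainy-season length (days)
spi_daily = rng.standard_normal(days)       # proxy drought indicator
tmax = rng.normal(38, 3, days)              # daily max temperature (°C)

drought = spi_daily < -1.0                  # mild-or-worse drought flag
heatwave = tmax > np.percentile(tmax, 90)   # hot-day flag (90th percentile)

compound_exposure = (drought & heatwave).mean()
print(f"compound drought-heatwave exposure: {compound_exposure:.3f}")
```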

How to cite: Gonzalez Grandon, T., Rombach, S., Cheo, E., and Bell, R.: Understanding Compound Climate Hazards and Exposure form a Spatial Perspective: A Case Study in the Dosso Region, Niger, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-22368, https://doi.org/10.5194/egusphere-egu26-22368, 2026.

Global warming has increased the amount of deadwood in forests due to wildfires, insect outbreaks, and droughts. Deadwood and fresh wood are mobilised by erosion into river systems as driftwood, forming a substantial organic carbon sink due to its slow decomposition rate. Event-based driftwood transport is therefore crucial for disaster management and for assessing carbon storage. Here, we applied YOLOv8 to detect driftwood in images from surveillance cameras and a drone. Three types of driftwood (instream, riverbank, and nearshore) from a Swiss database, the Arctic Data Center, and our own drone surveys were used to train object detection and instance segmentation models. To estimate driftwood volume, we compared the detected image areas with radio-frequency identification (RFID) tagged logs of known dimensions, establishing an area-to-volume conversion. Our models achieved an mAP50 of 0.96 for in-stream object detection. Applying this model to Typhoon Kong-rey in the Liwu River, we estimated an in-stream driftwood volume of 3.5×10⁵ m³, with a carbon stock of 8.24×10¹⁰ g C, representing 0.11% of Taiwan's annual carbon export. Furthermore, we observed that driftwood flux increases nonlinearly with river discharge. Our analysis suggests that driftwood accumulation along the outer bends of the riverbank may lead to pulsed driftwood flux. These findings highlight the significance of event-scale driftwood transport as a quantifiable component of green carbon and demonstrate the feasibility of integrating deep learning-based detection with hydrological monitoring for carbon budget assessments.
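The detection-to-volume step can be sketched as follows: a trained YOLOv8 segmentation model returns per-object masks, whose summed pixel area is converted to volume via a calibration factor of the kind established against the RFID-tagged logs. The weights file, image path, and both conversion factors below are hypothetical.

```python
# Sketch under stated assumptions (hypothetical weights, frame, and factors):
# driftwood masks from ultralytics YOLOv8, converted to an area then volume.
from ultralytics import YOLO

model = YOLO("driftwood_yolov8_seg.pt")   # hypothetical trained weights
results = model("liwu_river_frame.jpg")   # hypothetical camera frame

M2_PER_PIXEL = 0.0004      # ground area per pixel (assumed camera geometry)
VOLUME_PER_M2 = 0.35       # m³ of wood per m² of detected area (assumed)

total_area_m2 = 0.0
for r in results:
    if r.masks is not None:
        # masks.data holds one binary mask per detected driftwood piece
        total_area_m2 += r.masks.data.sum().item() * M2_PER_PIXEL

print(f"estimated driftwood volume: {total_area_m2 * VOLUME_PER_M2:.2f} m³")
```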

Keywords: Driftwood flux, YOLOv8, RFID, drone, green carbon

How to cite: Kong, Q.-Y., Yang, C.-J., Tsai, C.-H., and Lee, M.-Y.: Integrating deep learning detection and hydrological monitoring for driftwood flux and carbon stock estimation in a steep tropical basin, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-4783, https://doi.org/10.5194/egusphere-egu26-4783, 2026.

EGU26-5265 | ECS | PICO | HS1.1.4

Autonomous Robotised Repeatable Soil Moisture Sampling 

Ilektra Tsimpidi, Fernando Labra Caso, Vidya Sumathy, Konstantinos Soulis, and George Nikolakopoulos

In this study, we present the continuation of the novel robotic mechanism introduced at the EGU General Assembly Conference 2025 (I. S. Tsimpidi, 2025) for autonomous soil moisture data collection. Soil moisture is vital for irrigation, flood and drought forecasting, and hydrological studies, yet shows strong spatial and temporal variability; therefore, accurate measurements are required. We conduct field experiments to improve the fully autonomous robotised procedure with AgriOne, reducing sampling time and enhancing repeatability.
As the AgriOne robot (Figure 1) enables in situ, high-precision, spatially dense data collection across the field, we conducted additional field experiments to collect soil moisture data both autonomously and manually. The AgriOne robot autonomously executes soil moisture data-collection missions, with sampling positions defined by georeferenced waypoints. The waypoints were generated in ArcGIS using a grid creation tool, with the centre of each grid square as the selected position; the size of each grid cell was 8 m × 8 m. These waypoints then feed the robot's autonomous navigation system, which combines satellite positioning and motion sensors to continuously estimate the robot's position and to plan its trajectory and sampling points, so as to meet the initially planned sampling protocol. For navigation, a hierarchical control architecture generates velocity commands to guide the robot to each target location with centimetre-level positioning accuracy. Upon reaching each waypoint, the system autonomously triggers a probing mechanism to collect and log soil moisture measurements before continuing to the next mission point. Manual data collection was performed by a person carrying a handheld TEROS 12 sensor connected to a Bluetooth sensor interface for instant readings in a mobile application. The positions for the manual measurements were selected using an empirical sampling method.
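The grid-centre waypoint generation described above can be sketched in a few lines (this stands in for the ArcGIS tool; the field extent is hypothetical and coordinates are assumed to be in a projected, metric system).

```python
# Minimal sketch: waypoints as the centres of an 8 m x 8 m grid over a field.
import numpy as np

def grid_waypoints(x_min, y_min, x_max, y_max, cell=8.0):
    """Return an (N, 2) array of grid-cell centre coordinates (metres)."""
    xs = np.arange(x_min + cell / 2, x_max, cell)
    ys = np.arange(y_min + cell / 2, y_max, cell)
    return np.array([(x, y) for y in ys for x in xs])

# Hypothetical 80 m x 60 m field in projected coordinates
waypoints = grid_waypoints(0, 0, 80, 60)
print(len(waypoints), "waypoints; first:", waypoints[0])
```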

Figure 1: AgriOne robot with description of its components.

The first experiment was executed successfully in mid-July in a flat field with no vegetation cover, no precipitation, relative air humidity of 52%, air temperature of 20 °C, and wind speed of 2 Bft. The autonomous data collection yielded data from 69 of the 73 waypoints where the robot stopped, and the manual data collection yielded data from 50 waypoints, both covering an area of 4,800 m². The second experiment was successfully conducted in mid-October in an area with low elevations and dense grass cover. On the experimental day, precipitation was absent, air temperature was 12 °C, relative air humidity was 66%, and wind speed was 1 Bft. In this experiment, AgriOne autonomously collected soil moisture data from 63 of the 72 waypoints where it stopped, and soil moisture data were collected manually from 41 waypoints, covering an area of 4,700 m². The results of the experiments are presented on a satellite map of the test areas, with points scaled according to the soil moisture values (Figure 2).

Figure 2: Soil moisture data collected autonomously and manually in both test areas.

Tsimpidi, I. S. (2025). Large-scale Soil Moisture Monitoring: A New Approach. EGU General Assembly Conference Abstracts, pp. EGU25-1910.

How to cite: Tsimpidi, I., Labra Caso, F., Sumathy, V., Soulis, K., and Nikolakopoulos, G.: Autonomous Robotised Repeatable Soil Moisture Sampling, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-5265, https://doi.org/10.5194/egusphere-egu26-5265, 2026.

Planetary boundaries for many legacy and emerging contaminants are exceeded. Moving beyond the “safe operating space” for handling these pollutants means increased risks of tipping points which may irreversibly change the functioning of ecosystems and the services they provide, resulting in severe environmental and public health impacts.

In particular, monitoring and predicting the strongly nonlinear behaviour of many contaminants remains a significant challenge for state-of-the-art water quality monitoring. This includes pollution hotspots (locations) and hot moments (events) that disproportionately affect catchment water quality when a significant proportion of the contaminant load is mobilised within river catchments, transported to the river network, and then carried further downstream.

Here we present the SMARTWATER environmental sensing platform, which integrates sensor technology, network and data science innovations, and mathematical modelling with stakeholder catchment knowledge to diagnose, understand, predict, and manage the emergence and evolution of water pollution hotspots and hot moments. We highlight how innovations in fluorescence and UV absorbance optical sensing can be used, for instance, to track the drivers of extreme hypoxia events through urban and rural observatories, and how easy-to-sense water quality proxies dispersed widely across the catchment can help optimise high-utility observational networks with regard to the placement of multi-sensor platforms as well as guide their operation. Deploying data-science approaches, including hysteresis and flushing indices, across a range of monitoring locations revealed not only divergences in the sources and mobilisation of different pollutant types (nutrients, DOM, metals) but also differences in their downstream evolution and spatial footprints through complex (and managed) river networks. Integrating information on the different behaviours of pollutants and functional markers, such as tryptophan-like fluorescence and chlorophyll-a, helped to identify pollutant-specific activated source areas and mobilisation mechanisms, and supported the development of automated event-triggered in-situ sampling solutions for the analysis of emerging pollutants (including microplastics) and for microbial analyses that currently cannot be sensed in-situ. Together, this information highlights drastic differences in the contaminant-specific emergence of pollution hotspots and hot moments, including their large-scale footprint and longer-term relevance for catchment water pollution.
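A hysteresis index of the kind mentioned above compares normalized concentration on the rising versus falling limb of a storm hydrograph at matched normalized discharge levels. The sketch below uses one common formulation (an assumption; the abstract does not specify which index variant is used) on a synthetic event.

```python
# Minimal sketch (one common formulation, synthetic event): storm-event
# hysteresis index from normalized concentration-discharge loops.
import numpy as np

def hysteresis_index(q, c, levels=np.linspace(0.1, 0.9, 9)):
    """Mean rising-limb minus falling-limb normalized concentration."""
    peak = np.argmax(q)
    qn = (q - q.min()) / (q.max() - q.min())       # normalized discharge
    cn = (c - c.min()) / (c.max() - c.min())       # normalized concentration
    c_rise = np.interp(levels, qn[:peak + 1], cn[:peak + 1])
    # falling limb reversed so discharge is increasing for interpolation
    c_fall = np.interp(levels, qn[peak:][::-1], cn[peak:][::-1])
    return np.mean(c_rise - c_fall)

# Synthetic clockwise event: concentration peaks before discharge
t = np.linspace(0, 1, 100)
q = np.exp(-((t - 0.5) / 0.15) ** 2)
c = np.exp(-((t - 0.4) / 0.15) ** 2)
print(f"hysteresis index: {hysteresis_index(q, c):+.2f}  (>0: clockwise)")
```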

How to cite: Krause, S. and the SmartWater Team: Smart sensor networks for tracking the evolution of water pollution hotspots and hot moments through river networks, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-7650, https://doi.org/10.5194/egusphere-egu26-7650, 2026.

EGU26-7732 | ECS | PICO | HS1.1.4

Resolving Sequential Storm-Driven Pollutant Pulses using Novel Hydrochemical Measurements within a Reactivity-Hydrodynamic workflow. 

Chris Pesso, Ponnambalam Rameshwaran, Andrew J Wade, and Nick Everard

Within-river deployments that combine Acoustic Doppler Current Profiler (ADCP) hydrodynamics with high-frequency water quality sensing now offer unprecedented detail regarding instream physical processes, chemical mixing, and nutrient transformations, but a key barrier is translating complex, high-volume datasets into process-based interpretations. Here, we present the Reactivity Index – Hydrodynamic Index (RI-H) workflow, an approach that combines standard hydrodynamic and water-quality sensor data into diagnostic behavioural classes describing hydrochemical behaviour, demonstrated at the Kennet-Thames confluence (Reading, UK).

The workflow was developed and tested using a remote-controlled moving-boat platform (ArcBoat) equipped with a SonTek M9 (ADCP) and a YSI EXO2 multiparameter sonde.  We collected simultaneous near-surface measurements of velocity, nutrients (NH₄⁺-N, NO₃⁻-N), fluorescent dissolved organic matter (fDOM), turbidity, and specific conductivity across a quasi-synoptic transect design (upstream controls, repeated cross-sections, and diagonal transects) on three days. The study reach was segregated into 14 spatial zones to monitor the chemical and physical changes from upstream end-members (of the Rivers Kennet and Thames), how the end-members interact and evolve downstream of the confluence, and the shift in chemical and physical behaviour under different hydrological conditions. 

RI and H were derived from solute concentrations and flow velocities. Plotting the two indices enabled each observation to be classified into one of five process-based behavioural categories (Retentive, Reactive, Low-energy depletion, Attenuating, and Conservative). Across three campaigns (including a rainfall-impacted survey), zonal contrasts in RI were consistently strong (Kruskal–Wallis ε² = 0.28–0.81; p < 0.0001), identifying zones with distinct behavioural signatures (e.g., nitrate reaction or ammonium retention zones). Extending the same logic to turbidity yielded complementary particulate-transport classes (Local Input, Advective Input, Sediment Deposition, Advective Dilution, and Conservative mixing), demonstrating that the workflow is applicable to both solutes and particulates.
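The classification logic can be sketched as a simple decision over the RI-H plane. The thresholds and class boundaries below are invented for illustration only; the abstract does not specify how the five categories partition the plane.

```python
# Illustrative sketch only: assigning RI-H observations to behavioural
# classes. Boundary values (ri_tol, h_split) are assumptions, not the
# published workflow's definitions.
def classify_ri_h(ri, h, ri_tol=0.1, h_split=0.5):
    if abs(ri) <= ri_tol:
        return "Conservative"                 # transported without net change
    if ri > ri_tol:
        return "Retentive" if h < h_split else "Reactive"
    return "Low-energy depletion" if h < h_split else "Attenuating"

for ri, h in [(0.02, 0.8), (0.4, 0.2), (0.4, 0.9), (-0.3, 0.1), (-0.3, 0.7)]:
    print(f"RI={ri:+.2f}, H={h:.1f} -> {classify_ri_h(ri, h)}")
```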

Our high-frequency transect sampling captured the hydrological and biogeochemical response to sequential rainfall events on 26 February 2025. Following morning rainfall, we identified a pollutant pulse characterised by elevated NH₄⁺ and fDOM, indicative of sewage or wastewater influence in the River Kennet, which diluted progressively downstream. A late-afternoon high-intensity rain and hail event triggered a distinct second wave, marked by a sharp spike in NO₃⁻ and turbidity, characteristic of surface run-off. The rapid succession of these pulses reveals differing pollutant sources and pathways activated under varying rainfall intensities. Statistically strong spatial contrasts in reactivity persisted even during this dynamic event (Kruskal–Wallis ε² = 0.28–0.76 for all solutes). This outcome demonstrates that the workflow can resolve within-event functional shifts, translating sensor data into a real-time diagnostic of a river's response to rainfall. The RI-H framework provides a standardised approach, enabling event-scale diagnosis of solute and sediment behaviour that cannot be resolved by fixed or point-based monitoring alone. By classifying how rivers transport and process materials in space and time using deployable sensors, the workflow offers diagnostic, process-informed water quality assessment relevant to better understanding pollutant dispersal, chemical transformations, and biota in fluvial systems.

How to cite: Pesso, C., Rameshwaran, P., Wade, A. J., and Everard, N.: Resolving Sequential Storm-Driven Pollutant Pulses using Novel Hydrochemical Measurements within a Reactivity-Hydrodynamic workflow., EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-7732, https://doi.org/10.5194/egusphere-egu26-7732, 2026.

EGU26-12962 | PICO | HS1.1.4

Calibration of oblique image-based monitoring systems using photogrammetric principles 

Hessel Winsemius, Hubert Samboko, Salvador Peña-Haro, Stephen Mather, and Hamish Biggs

Autonomous camera-based systems combined with image velocimetry analyses enable operational river flow measurements in rapidly responding rivers. The open-source OpenRiverCam (ORC) software stack supports edge and cloud video processing, time-series generation, and rating-curve development, enabling fully operational, scalable non-contact water level and discharge estimation with relatively affordable camera systems.

Despite these advances, image calibration remains a major bottleneck for broad uptake, as it typically requires high-precision surveying of non-collinear ground control points to constrain the camera's pose. This process is often complex and relies on instruments that are not readily available to many users.

We investigate a photogrammetry-based alternative workflow for camera pose estimation, for possible integration in ORC: during camera installation, users collect a set of smartphone photographs from multiple viewpoints near the camera location. A photogrammetric reconstruction using these photographs, together with a sample video from the installed camera, jointly estimates the camera pose and lens parameters. The resulting camera pose is then used to orthorectify videos during operational data collection. Using controlled experiments and field experiments in New Zealand, Zambia, and The Netherlands, we assess (i) the accuracy of reconstructed 3D coordinates compared to traditional calibration, (ii) methods to robustly constrain the horizontal plane, (iii) the number of photographs required, and (iv) the influence of GPS accuracy on the solution.
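For context, the traditional calibration that this workflow aims to simplify recovers camera pose from surveyed ground control points, for example with OpenCV's solvePnP as sketched below. All coordinates and intrinsics here are hypothetical; this is not the ORC implementation.

```python
# Minimal sketch (hypothetical correspondences, not ORC code): camera pose
# from 3D-2D ground control point matches via OpenCV's solvePnP.
import numpy as np
import cv2

# Hypothetical ground control points (metres, local projected system)
object_pts = np.array([[0, 0, 0], [10, 0, 0.2], [10, 8, 0.1],
                       [0, 8, 0.3], [5, 4, 0.0], [2, 7, 0.4]], dtype=np.float64)
# Their pixel coordinates in a sample video frame (hypothetical)
image_pts = np.array([[210, 850], [1500, 820], [1380, 260],
                      [330, 300], [860, 540], [470, 350]], dtype=np.float64)

K = np.array([[1400, 0, 960],   # assumed intrinsics for a 1920x1080 camera
              [0, 1400, 540],
              [0, 0, 1]], dtype=np.float64)

ok, rvec, tvec = cv2.solvePnP(object_pts, image_pts, K, distCoeffs=None)
R, _ = cv2.Rodrigues(rvec)                 # rotation vector -> matrix
camera_position = (-R.T @ tvec).ravel()    # camera centre in world coordinates
print("pose found:", ok, "camera position [m]:", camera_position.round(2))
```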

This approach aims to significantly simplify calibration workflows and lower the barrier to deploying camera-based river monitoring systems.

 

How to cite: Winsemius, H., Samboko, H., Peña-Haro, S., Mather, S., and Biggs, H.: Calibration of oblique image-based monitoring systems using photogrammetric principles, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-12962, https://doi.org/10.5194/egusphere-egu26-12962, 2026.

EGU26-13018 | PICO | HS1.1.4

Potential and limitation of Landsat, Sentinel-2 and Planet datasets in monitoring the intermittency regime of non-perennial rivers 

Maria Nicolina Papa, Carmela Cavallo, Lucio Iantorno, Isabelle Brichetto, Giammarco Manfreda, Giovanni Negro, and Paolo Vezza

One of the major difficulties in studying and protecting non-perennial rivers is the lack of knowledge about the occurrence of dry periods and their duration. Traditional flow measurement systems are not reliable for measuring zero or near-zero flows and provide no information on the presence of isolated ponds during periods of no flow. In this context, satellite observations can make a crucial contribution thanks to their global coverage and high observation frequency, especially if freely available data can be exploited. Among multispectral satellite data, the free datasets provided by Landsat (USGS/NASA) and Sentinel-2 (ESA) are particularly useful. Another dataset with interesting features is PlanetScope; unfortunately, these data are not free, although they are available on request for research purposes. In this study, we present an analysis of the potential and limitations of these three datasets for observing the intermittency regime of non-perennial rivers. The differences in their spatial, temporal, and spectral resolution make them more or less suitable for monitoring specific rivers with given characteristics and observation requirements. Thanks to a long archive of observations (more than 40 years), Landsat is particularly useful for analyzing changes in the intermittent flow regime over time, enabling the detection of climatic trends over the standard 30-year climatological period; however, due to its coarse spatial resolution (30 m), it only allows observation of rivers with sufficiently wide active riverbeds (around 90 m or more). Thanks to its finer spatial resolution (10 or 20 m depending on the band), Sentinel-2 allows observation of water features wider than 6-15 m in rivers larger than approximately 30 m, over an observation period that currently stands at 9 years. With a revisit time of 5 days or less and freely available data, this dataset is particularly useful for continuous observations and for deriving the annual intermittency regime in a larger set of rivers of adequate size. PlanetScope provides data with a spatial resolution of around 3 m and a revisit time of up to 1 day. Although its spatial resolution is significantly higher than that of Sentinel-2, the ability to observe small water surfaces does not improve proportionally: we found that these data allow the observation of water features wider than 4-10 m in rivers larger than approximately 20 m, likely due to the different spectral characteristics of the acquired data. Another factor affecting performance is the acquisition time, which for Sentinel-2 is the same for all acquisitions of the same scene, while for PlanetScope images it is variable. This leads to inconsistency in the dataset, making it more challenging to identify water surfaces.

For all considered datasets, the "flowing," "ponding," and "dry" phases can be distinguished in a supervised manner using false-color images, or automatically by exploiting the reflectance characteristics of water. The performance of both supervised and unsupervised classification is analyzed for the different datasets and in various case studies.
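One simple automatic approach of the kind alluded to above flags water pixels with a normalized difference water index and classifies the reach state from the water fraction. The index threshold and the fraction cutoffs below are assumptions for illustration, not the study's classifier.

```python
# Illustrative sketch (assumed logic, synthetic reflectance): reach state
# from the water-pixel fraction, with water detected by McFeeters' NDWI.
import numpy as np

def reach_state(green, nir, water_thresh=0.0, pond_frac=0.02, flow_frac=0.2):
    """Classify a reach from co-registered green/NIR reflectance arrays."""
    ndwi = (green - nir) / (green + nir + 1e-9)     # NDWI = (G - NIR)/(G + NIR)
    water_fraction = (ndwi > water_thresh).mean()
    if water_fraction >= flow_frac:
        return "flowing"
    return "ponding" if water_fraction >= pond_frac else "dry"

rng = np.random.default_rng(2)
green = rng.uniform(0.02, 0.15, (50, 50))   # synthetic surface reflectance
nir = rng.uniform(0.01, 0.30, (50, 50))
print(reach_state(green, nir))
```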

How to cite: Papa, M. N., Cavallo, C., Iantorno, L., Brichetto, I., Manfreda, G., Negro, G., and Vezza, P.: Potential and limitation of Landsat, Sentinel-2 and Planet datasets in monitoring the intermittency regime of non-perennial rivers, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-13018, https://doi.org/10.5194/egusphere-egu26-13018, 2026.

EGU26-13052 | PICO | HS1.1.4

Velocity Hydrograph Routing to Enable Discharge Estimation at a Channel Section 

Muthiah Perumal and C. Madhusudana Rao

Recent developments in hydrometric practice enable continuous measurement of the maximum surface velocity of flow at a river section, together with the corresponding water level, using combined surface-velocity and water-level sensors installed on river bridges. These measurements enable continuous discharge estimation at such sections by employing well-studied entropy methods. However, the cost of equipping many gauging stations of a river with these combined radars may be prohibitive, whereas deploying standalone water-level radars at many stations is far less costly. With this in mind, the present study proposes a novel method of routing the velocity hydrograph estimated at an upstream station to a desired downstream station equipped with a water-level sensor; using the routed velocity hydrograph and the water levels measured at that station, the corresponding discharge hydrograph can be estimated. The study establishes the equation governing velocity hydrograph propagation in a channel reach, which has the same form as the weak-diffusive wave equations governing the propagation of discharge and flow-depth hydrographs. The derived velocity routing equation has the same form as the Muskingum routing equation, and the parameters of the routing method are estimated from the channel characteristics and the velocity characteristics of the propagating hydrograph. The proposed method is tested by routing hypothetical velocity hydrographs, obtained by routing a hypothetical discharge hydrograph defined by a Pearson Type-III function, at the inlet of 25 uniform trapezoidal channel reaches, each characterised by a unique combination of bed slope and Manning's roughness. The benchmark solutions were obtained using the HEC-RAS model. The routed velocity hydrographs closely reproduce the corresponding benchmark velocity hydrographs, demonstrating the appropriateness of the proposed velocity hydrograph routing method.
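Since the derived velocity routing equation is stated to share the Muskingum form, the sketch below shows standard Muskingum routing applied to a velocity pulse. The parameter values (K, X) and the inflow shape are illustrative; in the study they are estimated from channel and velocity-wave characteristics.

```python
# Minimal sketch: Muskingum-form routing of a hydrograph through one reach.
# O2 = C0*I2 + C1*I1 + C2*O1, with C0 + C1 + C2 = 1.
import numpy as np

def muskingum_route(inflow, K, X, dt):
    denom = 2 * K * (1 - X) + dt
    c0 = (dt - 2 * K * X) / denom
    c1 = (dt + 2 * K * X) / denom
    c2 = (2 * K * (1 - X) - dt) / denom
    out = np.empty_like(inflow)
    out[0] = inflow[0]
    for i in range(1, len(inflow)):
        out[i] = c0 * inflow[i] + c1 * inflow[i - 1] + c2 * out[i - 1]
    return out

t = np.arange(0, 48, 0.5)                                   # hours
u_in = 0.5 + 1.5 * (t / 6) ** 2 * np.exp(2 * (1 - t / 6))   # gamma-shaped pulse
u_out = muskingum_route(u_in, K=1.0, X=0.2, dt=0.5)         # m/s
print(f"peak in: {u_in.max():.2f} m/s at {t[u_in.argmax()]:.1f} h; "
      f"peak out: {u_out.max():.2f} m/s at {t[u_out.argmax()]:.1f} h")
```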

How to cite: Perumal, M. and Rao, C. M.: Velocity Hydrograph Routing to Enable Discharge Estimation at a Channel Section, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-13052, https://doi.org/10.5194/egusphere-egu26-13052, 2026.

EGU26-13155 | ECS | PICO | HS1.1.4

Identifying River Plastic Hotspots from Space 

Ámbar Pérez-García, Graciela Amanda, José Fco. López, Marc Russwurm, and Tim H.M. van Emmerik

Rivers play a key role in the transport and retention of floating debris, including plastics. Reliable and scalable monitoring of riverine plastic accumulation is essential for identifying hotspots, understanding debris movement, and supporting mitigation strategies. However, conventional in situ monitoring approaches are often labor-intensive, spatially limited, and difficult to deploy consistently across large or remote river systems. This study presents a semi-automated, image-based monitoring framework that integrates satellite remote sensing and machine learning to detect and map riverine plastic accumulation hotspots at a global scale.

The methodology integrates high spatial resolution imagery for precise manual annotation of accumulation areas and multispectral Sentinel-2 data for classification of litter hotspots using Random Forests in Google Earth Engine. The workflow combines the most influential spectral bands with targeted spectral indices, including NDVI, PI, FDI, and SI13, to enhance class separability between plastic, water, and vegetation.
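A minimal version of this workflow in the Earth Engine Python API is sketched below: NDVI and a Floating Debris Index (FDI, following the formulation of Biermann et al., 2020) are added to a Sentinel-2 scene and a Random Forest is trained. The site geometry and the labelled-polygon asset are hypothetical placeholders, and the setup is an assumption, not the published implementation.

```python
# Sketch under stated assumptions: spectral indices plus smileRandomForest
# classification of a Sentinel-2 scene in Google Earth Engine.
import ee
ee.Initialize()

img = (ee.ImageCollection("COPERNICUS/S2_SR_HARMONIZED")
       .filterDate("2024-01-01", "2024-12-31")
       .filterBounds(ee.Geometry.Point(110.0, -7.0))   # hypothetical river site
       .sort("CLOUDY_PIXEL_PERCENTAGE")
       .first())

ndvi = img.normalizedDifference(["B8", "B4"]).rename("NDVI")
# FDI: NIR minus a baseline interpolated between red-edge (B6) and SWIR (B11)
fdi = img.expression(
    "NIR - (RE2 + (SWIR - RE2) * ((832.8 - 664.6) / (1613.7 - 664.6)) * 10)",
    {"NIR": img.select("B8"), "RE2": img.select("B6"),
     "SWIR": img.select("B11")}).rename("FDI")

stack = img.select(["B2", "B3", "B4", "B8"]).addBands(ndvi).addBands(fdi)
training = stack.sampleRegions(
    collection=ee.FeatureCollection("users/example/litter_labels"),  # hypothetical
    properties=["class"], scale=10)
classifier = ee.Classifier.smileRandomForest(100).train(training, "class")
hotspot_map = stack.classify(classifier)
```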

The methodology is evaluated across three highly polluted river systems in Indonesia, Guatemala, and Ghana. These sites represent a wide range of hydrological and environmental conditions, including floating vegetation, canopy shading, and narrow urban channels affected by pixel mixing. Results demonstrate high within-river classification performance, with overall accuracies up to 99.5% on independent sections of the same river, and robust cross-river generalization when spectral indices are incorporated, achieving plastic F1-scores up to 79%.

In addition to image classification, the workflow supports multi-temporal analysis to generate hotspot frequency maps, enabling the identification of persistent plastic accumulation zones linked to river morphology and infrastructure. Feature-importance analysis highlights the relevance of specific spectral bands and indices across different environmental conditions and supports the development of reduced, generalizable models.

To facilitate reproducibility and large-scale application, the methodology is operationalized in an open-access Google Earth Engine application that enables users to apply the trained model to rivers worldwide using Sentinel-2 imagery. The proposed framework contributes to the advancement of environmental monitoring and provides a foundation for future developments toward global, long-term assessment of river plastic dynamics.

 

More information: https://doi.org/10.1016/j.isci.2025.114570

How to cite: Pérez-García, Á., Amanda, G., López, J. Fco., Russwurm, M., and van Emmerik, T. H. M.: Identifying River Plastic Hotspots from Space, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-13155, https://doi.org/10.5194/egusphere-egu26-13155, 2026.

EGU26-15052 | ECS | PICO | HS1.1.4

Personal weather station rainfall data for semi-distributed flood modelling: Feasibility and limitations 

Ranka Kovačević, Alessandro Ceppi, Carlo De Michele, Roberto Nebuloni, and Andrijana Todorović

Accurate representation of the spatial and temporal variability of precipitation is a fundamental requirement for reliable flood modelling, especially if semi-distributed/fully-distributed models are used. However, official rain gauge networks often exhibit limited spatial coverage and low density, leading to substantial uncertainty in the representation of rainfall at the sub-basin scale. Recently, opportunistic precipitation observations derived from personal weather stations (PWS) have attracted increasing attention as a potential complementary data source, offering unprecedented spatial coverage. At the same time, PWS networks are characterized by heterogeneous data quality, inconsistent maintenance, frequent outages, incomplete records, and a dynamically changing network structure. Despite the attention that PWS have gained, their applicability in hydrological modelling, especially within semi-distributed modelling frameworks, has been explored in only a limited number of studies.

This study evaluates the feasibility of PWS rainfall data for semi-distributed hydrological flood modelling and outlines the conditions under which their application is appropriate. The Lambro catchment in northern Italy is used as a case study. PWS rainfall observations obtained from the Meteonetwork platform (https://www.meteonetwork.it/, Giazzi et al., 2022) and official rainfall data provided by the Lombardy Regional Environmental Protection Agency (ARPA) are used. Three PWS-based rainfall datasets are created: raw PWS data (PWSraw), quality-controlled PWS data (PWSqc), and data from persistent PWS stations, i.e., those active during all considered storm events (PWSqc_p), together with their combinations with the ARPA observations (denoted ARPA + PWSraw, ARPA + PWSqc, and ARPA + PWSqc_p, respectively).
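A plausible flavour of the quality control behind PWSqc is sketched below: range checks plus a spatial consistency test against neighbouring stations. The specific QC applied in the study is not detailed in the abstract, so the rules and thresholds here are assumptions.

```python
# Illustrative sketch (assumed QC rules): filtering PWS rainfall readings
# by physical range and consistency with neighbouring stations.
import numpy as np

def qc_pws(values, neighbour_medians, max_intensity=50.0, max_ratio=5.0):
    """Return a boolean mask of accepted 10-min rainfall readings (mm)."""
    values = np.asarray(values, dtype=float)
    ok = np.isfinite(values) & (values >= 0) & (values <= max_intensity)
    with np.errstate(divide="ignore", invalid="ignore"):
        ratio = values / neighbour_medians
        # reject readings wildly inconsistent with nearby wet stations
        suspicious = (neighbour_medians > 0.2) & (ratio > max_ratio)
    return ok & ~suspicious

obs = np.array([0.0, 1.2, 48.0, -9999.0, 3.1])   # one sensor-error code
nbr = np.array([0.0, 1.0, 1.5, 1.0, 2.8])        # neighbourhood medians
print(qc_pws(obs, nbr))   # -> [ True  True False False  True]
```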

Each dataset is compared to the ARPA rain gauge measurements, which serve as the reference. The evaluation compares rainfall features at the point and sub-basin scales and, through semi-distributed hydrological flood simulations, analyses the impact of the rainfall input on the simulated peak discharge and its timing, and on the runoff volume at the basin outlet. The hydrological modelling with each rainfall dataset is performed using the semi-distributed model developed by Politecnico di Milano (Cazzaniga et al., 2022).

The results demonstrate that quality-controlled and persistent PWS datasets (PWSqc and PWSqc_p), as well as their combination with ARPA observations, generally enhance hydrological model performance. This indicates that PWS data can provide added value for semi-distributed flood modelling when appropriately controlled and integrated with reference datasets from the official networks.

 

References

Cazzaniga, G., De Michele, C., D’Amico, M., Deidda, C., Antonio Ghezzi, A., and Nebuloni, R.: Hydrological response of a peri-urban catchment exploiting conventional and unconventional rainfall observations: the case study of Lambro Catchment, Hydrology and Earth System Sciences, 26, 2093–2111, https://doi.org/10.5194/hess-26-2093-2022, 2022.

Giazzi, M., Peressutti, G., Cerri, L., Fumi, M., Riva, I. F., Chini, A., Ferrari, G., Cioni, G., Franch, G., Tartari, G., Galbiati, F., Condemi, V., and Ceppi, A.: Meteonetwork: An Open Crowdsourced Weather Data System, Atmosphere, 13, 928, https://doi.org/10.3390/atmos13060928, 2022.

https://www.arpalombardia.it/   

 

Acknowledgments

The authors would like to thank the COST Action “OpenSense” (CA20136) for supporting collaboration opportunities among the co-authors through the STSM program.

How to cite: Kovačević, R., Ceppi, A., De Michele, C., Nebuloni, R., and Todorović, A.: Personal weather station rainfall data for semi-distributed flood modelling: Feasibility and limitations, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-15052, https://doi.org/10.5194/egusphere-egu26-15052, 2026.

For years, the SATURO from METER Group has offered simple, fast, and precise field-saturated hydraulic conductivity measurements. While this instrument excels at surface measurements, taking measurements at depth has been a challenge: it required digging a large hole, causing disturbance and compromising the readings. The new SATURO borehole attachment allows readings at depths of up to 2 meters out of the box (additional depth is possible with custom cable lengths). The measurement head is compact enough to go down a 4-inch (10 cm) borehole, significantly reducing disturbance and allowing for more accurate in situ readings.

The SATURO borehole attachment comes with everything you need to prepare the site and conduct the measurements, including the installation tools and a borehole auger. The attachment also works with the current SATURO control unit if you have already purchased one.

How to cite: Weldon, S.: A novel approach to automated field saturated hydraulic conductivity measurements at depth, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-15282, https://doi.org/10.5194/egusphere-egu26-15282, 2026.

EGU26-16174 | ECS | PICO | HS1.1.4

Rapid Sedimentation Impact Assessment of Ukai Reservoir using Geospatial methodology 

Nilima Ghosh Natoo, Prasun Kumar Gupta, and Bhaskar Ramchandra Nikam

The double whammy of increasing human water consumption and the effects of a warming world is poised to further shrink large lakes, especially in arid and semi-arid regions. Furthermore, reservoir sedimentation remains a critical challenge to global water security, causing a progressive loss in storage capacity and disrupting the ecological balance of downstream river systems. The present study is motivated by the fact that reservoir managers currently depend on expensive hydrographic surveys to locate sedimentation-impacted areas, and such surveys cannot be conducted as and when required due to financial constraints. Moreover, the recent floods in the Indian state of Punjab (August 2025) were linked to intense rainfall and sudden sediment inflow, resulting in a drastic reduction in the storage capacity of the reservoirs.

This study presents a comprehensive geospatial framework for assessing the elevation-area-capacity relationship of the Ukai reservoir. The method uses multi-temporal optical and SAR satellite imagery (Landsat-9, Sentinel-2 and Sentinel-1) and corresponding altimetry water level data to delineate the water spread areas (contours) at varying elevations. Two time periods (historical, 2008-2010, and recent, 2021-2022) were compared to assess the change in contours. The results of the sedimentation assessment clearly show an expansion in water boundary extent in recent years compared to the past decade. Additionally, the findings reveal that over the last decade the Ukai reservoir’s live storage capacity has declined significantly, by ~200 MCM, indicating an annual sedimentation rate of ~20 MCM. The spatial analysis distinctly maps the geographical areas of sediment accumulation or erosion and shows that the sediment change is not uniform.
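
To make the capacity computation concrete, the sketch below applies the standard prismoidal formula between successive satellite-derived water-spread contours; the elevation and area values are hypothetical placeholders, not the Ukai datasets.

    import math

    # Hypothetical elevation (m) vs. water spread area (km^2) pairs, as
    # derived from satellite water masks and altimetry levels.
    contours = [(95.0, 120.0), (98.0, 155.0), (101.0, 190.0), (104.0, 230.0)]

    def capacity_curve(contours):
        """Cumulative capacity in MCM via V = dh/3 * (A1 + A2 + sqrt(A1*A2))."""
        volume, curve = 0.0, [(contours[0][0], 0.0)]
        for (h1, a1), (h2, a2) in zip(contours, contours[1:]):
            # km^2 times m gives 10^6 m^3, i.e. MCM directly
            volume += (h2 - h1) / 3.0 * (a1 + a2 + math.sqrt(a1 * a2))
            curve.append((h2, volume))
        return curve

    for h, v in capacity_curve(contours):
        print(f"{h:6.1f} m  {v:8.1f} MCM")

Differencing two such curves (historical vs. recent) at common elevations yields the capacity loss attributable to sedimentation.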

How to cite: Ghosh Natoo, N., Kumar Gupta, P., and Ramchandra Nikam, B.: Rapid Sedimentation Impact Assessment of Ukai Reservoir using Geospatial methodology, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-16174, https://doi.org/10.5194/egusphere-egu26-16174, 2026.

EGU26-16213 | ECS | PICO | HS1.1.4

Assessment of Optical and Near-Infrared Proximal Remote Sensing for Suspended Sediment Concentration Estimation under Artificial and Ambient Illumination 

Aung Chit Moe, Domenico Miglino, Ruodan Zhuang, Khim Cathleen Saddi, Lucrezia Viscido, Monton Methaprayun, Naw Shareen, Tanabadee Budrach, Punpim Puttaraksa Mapiam, Thom Bogaard, and Salvatore Manfreda

Proximal remote sensing represents an effective approach for water quality monitoring, enabling the estimation of turbidity and suspended sediment concentration (SSC) through spectral indices, such as red–green band ratios. Low-cost RGB cameras are widely adopted for this purpose; however, their measurements are strongly affected by variations in illumination, shadows, surface glint, and ambient environmental conditions, which can compromise data consistency and reliability. Extending the spectral coverage into the near-infrared (NIR) domain has the potential to enhance sensitivity to suspended sediments and reduce the influence of variable lighting conditions. Although hyperspectral sensors remain costly and impractical for routine monitoring, the analysis of hyperspectral data provides valuable insights into the most informative wavelengths and supports the targeted integration of RGB imagery with selected NIR bands for future field applications.
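
As a minimal illustration of the band-ratio approach mentioned above, the sketch below fits a power law between a red/green reflectance ratio and laboratory SSC values; the numbers are invented for illustration and do not come from the experiments.

    import numpy as np

    red = np.array([0.08, 0.12, 0.18, 0.25])      # mean ROI reflectance, red
    green = np.array([0.10, 0.11, 0.12, 0.13])    # mean ROI reflectance, green
    ssc = np.array([50.0, 180.0, 520.0, 1400.0])  # lab SSC (mg/L), hypothetical

    # Fit SSC = c0 * (red/green)^c1 in log-log space:
    ratio = red / green
    c1, log_c0 = np.polyfit(np.log(ratio), np.log(ssc), 1)
    c0 = np.exp(log_c0)
    print(f"SSC ~ {c0:.1f} * (R/G)^{c1:.2f}")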

In this laboratory study, proximal hyperspectral sensing was employed to investigate SSC under both artificial and ambient illumination conditions, using two sediment types with contrasting optical properties (yellowish soil and white China clay). The experiments assess the influence of illumination conditions and sediment characteristics on spectral signatures, and compare the performance of reflectance information derived from the RGB and NIR spectral ranges. The results offer initial insights into sediment–reflectance interactions and contribute to the development of more robust and cost-effective proximal remote sensing strategies for water quality monitoring in real-world environments.

 

Keywords: Proximal remote sensing; hyperspectral data; suspended sediment concentration; laboratory experiments

How to cite: Moe, A. C., Miglino, D., Zhuang, R., Saddi, K. C., Viscido, L., Methaprayun, M., Shareen, N., Budrach, T., Mapiam, P. P., Bogaard, T., and Manfreda, S.: Assessment of Optical and Near-Infrared Proximal Remote Sensing for Suspended Sediment Concentration Estimation under Artificial and Ambient Illumination, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-16213, https://doi.org/10.5194/egusphere-egu26-16213, 2026.

EGU26-19490 | PICO | HS1.1.4

Towards a Low-Cost River Monitoring Setup 

Salvador Peña-Haro, Hubert T. Samboko, and Hessel C. Winsemius

Deploying and upscaling traditional monitoring systems poses several challenges, especially in developing countries, because of high investment costs and difficult operation and maintenance. In recent years, several low-cost, open-source products and initiatives have been developed with the objective of tackling those issues.

Herein we present the setup and first learnings of the project L-DaaS (“Local people for Discharge monitoring as a Service”), in which we proposed a scheme combining open-source software for flow monitoring with low-cost hardware built from standard components, and a business model centred around a local enterprise in charge of operation and maintenance. The project was executed in Zambia with a locally driven environmental monitoring company based in the same country. Key stakeholders were the Water Resource Management Authority of Zambia and hydropower operators.

Open-source and low-cost systems are not by themselves a solution; a sustainable and scalable business model is also needed to foster affordable, efficient, and locally supported water management solutions.

How to cite: Peña-Haro, S., Samboko, H. T., and Winsemius, H. C.: Towards a Low-Cost River Monitoring Setup, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-19490, https://doi.org/10.5194/egusphere-egu26-19490, 2026.

EGU26-19511 | ECS | PICO | HS1.1.4

AI-Driven Photogrammetric Workflow for Low-Cost Wadi Monitoring 

Robert Krüger, Pedro Zamboni, Jens Grundmann, Ghazi Al-Rawas, and Anette Eltner

Arid regions such as Oman are increasingly susceptible to severe flash floods driven by climate change and rapid urbanization. Accurate water level measurements are vital for flood preparedness and the development of early warning systems required to mitigate severe socio-economic impacts, including substantial property damage and the recurring loss of life. Beyond disaster mitigation, recording runoff is essential for sustainable water management and for enhancing the understanding of hydrological processes in small-scale, ephemeral catchments that remain largely ungauged. However, traditional water level monitoring via pressure gauges or radar sensors is often hindered by high infrastructure costs, physical vulnerability to high-flow events, and the changing morphology of wadi channels. To address these limitations, we present a robust photogrammetric workflow integrated into a low-cost, Raspberry Pi-based optical monitoring system for water level measurement, surface velocity estimation, and discharge assessment.

The workflow relies on the synergy between a single fixed low-cost camera and high-resolution Digital Terrain Models (DTMs) generated through UAV-based Structure-from-Motion (SfM-MVS). To convert 2D image measurements into 3D object space, both the camera and the DTM must be referenced in a shared coordinate system. Traditionally, this is established using permanent Ground Control Points (GCPs) measured with RTK GNSS; however, establishing and maintaining such markers in adverse wadi conditions is logistically challenging, and the physical markers are prone to being lost during flood events. We address this by employing the GIRAFFE (Geospatial Image Registration And reFErencing) workflow. This approach replaces physical markers by performing an image-to-geometry registration that aligns the real 2D camera view with a synthetic image rendered from the UAV-based 3D point cloud. Using the AI-based LightGlue matching algorithm, the system automatically identifies homologous points between the views to create 2D–3D correspondences. These correspondences function as pseudo-control points, allowing for the precise determination of the camera’s 3D pose and orientation via spatial resection.
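
The spatial resection step can be illustrated with OpenCV's PnP solver, a plausible stand-in for whatever resection routine GIRAFFE uses internally; all coordinates, matches, and camera intrinsics below are hypothetical.

    import cv2
    import numpy as np

    # Pseudo-control points: 3D points (m) sampled from the UAV point
    # cloud, matched to 2D image pixels (hypothetical values).
    object_pts = np.array([[5.2, 1.1, 102.3], [8.7, 3.4, 101.8],
                           [2.9, 6.0, 100.9], [7.1, 7.7, 103.0],
                           [4.4, 9.2, 101.2], [9.3, 5.5, 102.1]])
    image_pts = np.array([[612.0, 347.0], [850.1, 402.2], [401.3, 590.8],
                          [799.6, 655.4], [530.2, 760.9], [901.7, 512.3]])

    K = np.array([[1500.0, 0.0, 960.0],   # intrinsics from lab calibration
                  [0.0, 1500.0, 540.0],
                  [0.0, 0.0, 1.0]])

    ok, rvec, tvec = cv2.solvePnP(object_pts, image_pts, K, np.zeros(5))
    R, _ = cv2.Rodrigues(rvec)            # camera rotation matrix
    print("camera center (m):", (-R.T @ tvec).ravel())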

For the hydrological monitoring, the workflow further employs two AI-driven stages:

Water Level Estimation: Convolutional Neural Networks (CNNs) segment the water area in time-lapse images. The resulting waterlines are projected into 3D space and intersected with the DTM to derive accurate water levels.

Discharge Assessment: Surface flow velocities are measured using the PIPs++ (Persistent Independent Particle tracker) technique. Unlike traditional frame-by-frame methods, PIPs++ tracks particles across multiple time steps jointly, providing enhanced temporal smoothness and robustness against illumination changes or partial occlusions. Based on these surface velocities, the mean velocity is determined and combined with the wetted cross-section from the DTM to estimate total discharge.
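
The discharge step reduces to the velocity-area principle once a surface-to-mean velocity index is chosen; the sketch below assumes a hypothetical cross-section and the commonly used index of about 0.85, a value the abstract does not state.

    ALPHA = 0.85  # assumed surface-to-mean velocity index

    # Hypothetical subsections from the DTM cross-section:
    # (width m, mean depth m, PIPs++ surface velocity m/s)
    subsections = [(2.0, 0.3, 0.9), (2.0, 0.8, 1.6), (2.0, 1.1, 2.0),
                   (2.0, 0.7, 1.4), (2.0, 0.2, 0.6)]

    # Q = sum over subsections of alpha * v_surface * area
    discharge = sum(w * d * ALPHA * v for w, d, v in subsections)
    print(f"estimated discharge: {discharge:.2f} m^3/s")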

Initial results from deployments in Wadi Al-Hawasinah, Oman, demonstrate that this solar-powered, remote system successfully captures ephemeral flow events. By leveraging GIRAFFE for automated localization and PIPs++ for robust surface velocity estimation, this workflow provides a scalable and cost-effective solution for enhancing flood early warning systems in complex, ungauged terrains.

How to cite: Krüger, R., Zamboni, P., Grundmann, J., Al-Rawas, G., and Eltner, A.: AI-Driven Photogrammetric Workflow for Low-Cost Wadi Monitoring, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-19511, https://doi.org/10.5194/egusphere-egu26-19511, 2026.

EGU26-21112 | ECS | PICO | HS1.1.4

Understanding Pumping Dynamics in Granitic Hard Rock Aquifers Using Integrated Flow Metering and Camera-Based Monitoring 

Lakshmikantha N r, Aditya Vikram Jain, Ananya Jain, Karan Misquitta, Vivek Grewal, and Veena Srinivasan

Granitic hard rock aquifers dominate much of semi-arid peninsular India and are characterized by highly heterogeneous, fracture-controlled flow systems with limited storage. In these settings, conventional indicators such as static water levels and standard well drawdown equations provide poor insight into actual aquifer stress and pumping sustainability. This study presents an integrated hydrological monitoring approach that combines flow meters, borewell camera scans, and continuous camera-based observations to directly understand aquifer behaviour under pumping. We deploy non-invasive flow measurement to quantify real-time abstraction, alongside step-drawdown tests and downhole camera surveys to identify active fracture zones, their depth-wise contribution to yield, and their dynamic response during sustained pumping. Continuous camera scans during pumping cycles enable direct visualization of drawdown, fracture inflows, and the rapid transition from borehole storage to fracture-limited supply, revealing why prolonged pumping from deeper depths often leads to high energy use with marginal water gains. By linking pumping rates, energy consumption, and observed subsurface flow processes, the study demonstrates how mismatches between pump capacity and fracture-controlled yields drive inefficiency and accelerated aquifer stress. The results highlight the value of image-based and sensor-driven monitoring for developing context-specific indicators of groundwater stress and for identifying optimal pumping regimes in hard rock aquifers. This integrated methodology offers a scalable pathway to improve hydrological understanding, support adaptive groundwater management, and inform incentive-based interventions aimed at conserving both water and energy in data-scarce, remote settings.

How to cite: N r, L., Vikram Jain, A., Jain, A., Misquitta, K., Grewal, V., and Srinivasan, V.: Understanding Pumping Dynamics in Granitic Hard Rock Aquifers Using Integrated Flow Metering and Camera-Based Monitoring, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-21112, https://doi.org/10.5194/egusphere-egu26-21112, 2026.

EGU26-23288 | PICO | HS1.1.4 | Highlight

Opportunistic Sensing of Precipitation and Evaporation Using Microwave Links From Cellular Communication Networks 

Remko Uijlenhoet, Bas Walraven, Luuk van der Valk, Miriam Coenders, Rolf Hut, Aart Overeem, and Oscar Hartogensis

Precipitation and evaporation are the two fluxes coupling the atmospheric and terrestrial compartments of the hydrologic cycle. Accurate and robust observations of the spatial and temporal variability of these two fluxes over the Earth’s continents are crucial to help understand the intricacies of land surface – atmosphere interactions. Improving our understanding and our ability to quantify these interactions is not only important for scientific purposes (such as developing better earth system models) but also for societally relevant applications (such as flood and drought forecasting). Here, we demonstrate the potential and address the limitations of microwave links from cellular communication networks for estimating both precipitation and evaporation.

Previous research has shown that attenuation of microwave signals propagating through rainfall from the transmitting to the receiving antennas of microwave links can be related to the average rainfall intensity along the path between transmitter and receiver. Over the past two decades, this notion has been successfully applied to retrieve rainfall fields from existing microwave links which are part of cellular communication networks. Rain-induced signal loss due to absorption and scattering of microwave signals by raindrops, a source of “noise” for mobile network operators, has turned out to be a “signal” for hydrometeorological science and applications. The approach of using existing cellular communication infrastructure for environmental monitoring (in this case rainfall measurement) has been dubbed “opportunistic sensing”.
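
The retrieval rests on the power law k = aR^b linking specific attenuation to rain rate; the sketch below inverts it for a single link, with illustrative coefficients (roughly appropriate for frequencies of tens of GHz) and a crude wet-antenna correction, neither taken from the abstract.

    def rain_rate(attenuation_db, length_km, a=0.33, b=1.0, wet_antenna_db=1.5):
        """Path-average rain rate (mm/h) from rain-induced link loss (dB)."""
        rain_loss = max(attenuation_db - wet_antenna_db, 0.0)
        specific_attenuation = rain_loss / length_km          # k, dB/km
        return (specific_attenuation / a) ** (1.0 / b)        # R = (k/a)^(1/b)

    # Hypothetical 3 km link observing 12 dB of excess loss:
    print(f"{rain_rate(12.0, 3.0):.1f} mm/h")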

However, atmospheric constituents between the transmitters and receivers of microwave links do not only affect signal propagation when it rains. When it is dry, refractive index fluctuations induced by temperature and water vapor variations resulting from rising turbulent eddies in the atmospheric boundary layer between transmitters and receivers cause received signals to “scintillate”. The variance of these scintillations has been shown to be related to the structure parameter of the refractive index, which in turn can be related to sensible and latent heat fluxes across the microwave link path using Monin-Obukhov Similarity Theory (and the aid of auxiliary information). This principle is used by microwave scintillometers, commercially available instruments for observing turbulent fluxes in the atmospheric boundary layer.

Recent research results show that microwave links from cellular communication networks can, under certain conditions, also be employed as boundary layer scintillometers. Combining this notion with the previous finding that such microwave links can also be used as path-average rain gauges suggests that there is potential to use each of the roughly five million backhaul links from cellular communication networks worldwide as combined precipitation-evaporation sensors. Hence, gaining access to received signal level data from this enormous number of microwave links would allow large-scale rainfall and evaporation mapping, also for regions across the globe which are currently poorly served in terms of dedicated meteorological stations.

We present both the physical basis of this approach and empirical results from previous and ongoing measurement campaigns to discuss the potential and challenges of opportunistic sensing of two hydrologic fluxes with one single instrument: precipitation and evaporation.

How to cite: Uijlenhoet, R., Walraven, B., van der Valk, L., Coenders, M., Hut, R., Overeem, A., and Hartogensis, O.: Opportunistic Sensing of Precipitation and Evaporation Using Microwave Links From Cellular Communication Networks, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-23288, https://doi.org/10.5194/egusphere-egu26-23288, 2026.

Irrigation water management is a critical factor that influences crop biomass, yield, and water usage, since irrigation makes crop development independent of rainfall. Poor irrigation management can result in many problems on and off the farm, such as waterlogging, erosion, and non-point source pollution. Therefore, improving irrigation water-use efficiency is essential to reduce the amount of water needed without penalizing yields. Considering the growing competition for water resources, there is a need to explore novel methods for quantifying and enhancing water use efficiency in irrigated fields, such as Unmanned Aerial Vehicle (UAV)-based remote sensing. This study integrates UAV-derived vegetation indices with machine-learning (ML) algorithms to quantify the biomass and yield response of rice under alternate wetting and drying (AWD) and wheat under different irrigation methods (drip, sprinkler, and flood) with variable rates of crop evapotranspiration (100%, 75%, 50% and 0% rainfed treatment) across two seasons of the rice-wheat cropping system in Roorkee, India. The biomass and yield results obtained from the different ML algorithms were compared. During training, the ensemble random forest model performed best, with a high KGE (0.91), a low NRMSE (0.033), and a minimal PBIAS of 0.13%. The ensemble random forest model also performed best during testing for rice yield estimation (R2 = 0.60, KGE = 0.71, PBIAS = −2.26%, NRMSE = 0.136). For wheat yield estimation, training results similarly showed strong model performance (R2 = 0.8137, KGE = 0.83, PBIAS = 1.36%, NRMSE = 0.470). The UAV-ML workflow captured both the fine-scale spatial variability needed for site-specific field decisions and the process understanding needed for generalization across the seasons. This integrated workflow supports the UN Sustainable Development Goals (SDGs), specifically SDG 2 (Zero Hunger) and SDG 6 (Clean Water and Sanitation).
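
For reference, the Kling-Gupta efficiency reported above combines correlation, variability bias, and mean bias; a minimal implementation on invented numbers is sketched below.

    import numpy as np

    def kge(sim, obs):
        """KGE = 1 - sqrt((r-1)^2 + (alpha-1)^2 + (beta-1)^2)."""
        sim, obs = np.asarray(sim, float), np.asarray(obs, float)
        r = np.corrcoef(sim, obs)[0, 1]          # linear correlation
        alpha = sim.std() / obs.std()            # variability ratio
        beta = sim.mean() / obs.mean()           # bias ratio
        return 1.0 - np.sqrt((r - 1)**2 + (alpha - 1)**2 + (beta - 1)**2)

    # Illustrative yield values (t/ha), not the study's data:
    print(f"KGE = {kge([4.3, 5.0, 4.0, 5.7, 5.1], [4.1, 5.3, 3.8, 6.0, 4.9]):.2f}")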

How to cite: Kumar Vishwakarma, S., Kothari, K., and Pandey, A.: Spatial Mapping of Biomass and Yield of Rice-Wheat Cropping Systems across Different Irrigation Methods Using UAV Images and Machine Learning Algorithms, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-452, https://doi.org/10.5194/egusphere-egu26-452, 2026.

EGU26-1139 | PICO | HS6.7

Floodalyzer: A QGIS Plugin for Accessible and Rapid Flood Event Assessment 

Luisa Fuest and Antara Dasgupta

Floods are among the most devastating natural disasters, causing significant loss of life and economic damage. As extreme flood events become more frequent, rapid and accessible flood analysis tools are crucial in guiding early recovery efforts. This study presents the QGIS plugin ‘Floodalyzer’, developed to provide a quick and easy workflow for flood event analysis. By automating the processing and visualization of flood extent data from the Global Flood Monitoring System (GFM), derived from remote sensing, in combination with building footprints from various data sources, the plugin enables users to analyze past flood events without requiring expert knowledge or expensive proprietary software.

Floodalyzer operates within the widely used open-source GIS platform QGIS, making it highly accessible. Users manually download raster data and shapefiles from the web, which serve as inputs for automated analysis. The plugin then processes the data and generates output files, including a shapefile showing which buildings were flooded and for how long. Additionally, it compiles an HTML report including graphs that further describe the area of interest and summarize the plugin’s results (e.g. Building Footprint Heatmap, Observed Flood Extent Raster Calendar Display, Flooded Area Duration Bar Chart). The effectiveness of the tool was evaluated using case studies in Pakistan and Germany, where results were compared against the CEMS Rapid Mapping product. The CEMS product was not captured at the time of maximum flooding and therefore shows smaller inundated areas in many places compared to the plugin’s results. However, the locations and overall shapes of the flooded areas are generally consistent.
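
The core overlay the plugin automates, intersecting flood-extent rasters with building footprints, can be sketched outside QGIS with rasterio and geopandas; the file names and the per-pixel flooded-day raster below are hypothetical and do not reflect Floodalyzer's actual internals.

    import geopandas as gpd
    import rasterio
    from rasterio.mask import mask as rio_mask

    buildings = gpd.read_file("buildings.shp")        # footprints (assumed)
    with rasterio.open("gfm_flood_days.tif") as src:  # per-pixel flooded-day count
        buildings = buildings.to_crs(src.crs)
        days = []
        for geom in buildings.geometry:
            try:
                arr, _ = rio_mask(src, [geom], crop=True, filled=False)
                days.append(int(arr.max()))           # worst-case pixel
            except ValueError:                        # footprint off-raster
                days.append(0)

    buildings["flood_days"] = days
    buildings[buildings.flood_days > 0].to_file("flooded_buildings.shp")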

The case studies highlight the unique selling point of Floodalyzer: its ability to process flood extent data over extended time periods to analyze flood duration and damage, which enables a more comprehensive analysis of the available data. At the same time, the results highlight uncertainties in flood extent, primarily originating from the GFM input data. Large exclusion mask areas indicate zones of high uncertainty, especially in urban environments where flood detection is more challenging. Temporal uncertainties also arise from gaps in satellite coverage, limiting data availability, especially in regions between the tropics.

Future improvements will focus on reducing runtime and integrating statistical uncertainty assessments into the plugin’s output, with human-readable explanations. Furthermore, automated GFM data retrieval from the Global Flood Awareness System, downloading the flood masks for a given input AOI, would eliminate the need for manual downloads and thereby streamline the analysis process. By bridging the gap between large, complex data volumes and the need for a rapid response to flooding events, this tool provides decision-makers with a sound basis for dealing with the impacts of flooding in the response and recovery phases. Floodalyzer thus supports improved flood management through broader uptake of remotely sensed flood information, by lowering barriers to accessing flood extent data.

How to cite: Fuest, L. and Dasgupta, A.: Floodalyzer: A QGIS Plugin for Accessible and Rapid Flood Event Assessment, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-1139, https://doi.org/10.5194/egusphere-egu26-1139, 2026.

Extreme rainfall events have become more frequent and intense under climate change, presenting increasing challenges for hydrological monitoring and flood risk management. High-resolution rainfall observations are essential for capturing the spatial and temporal variability of storm events, yet conventional rain-gauge networks suffer from limited spatial coverage and cannot resolve rapidly evolving convective structures. Moreover, high-intensity rainfall events are inherently rare in natural settings, resulting in data gaps in upper rainfall categories. To address this limitation, we integrate natural rainfall observations with controlled artificial rainfall experiments to construct a comprehensive and balanced multi-class dataset covering 0–70 mm/hr at 5 mm/hr intervals. We develop a multimodal deep learning framework that jointly leverages rainfall imagery and acoustic measurements for rainfall-intensity estimation. The two sensing modalities provide complementary physical information: imagery captures streak morphology, drop density, and spatial distribution patterns, while acoustics encode drop momentum, kinetic energy, and impact signatures. Neither modality alone fully characterizes rainfall processes across all intensity ranges; by combining them, the model benefits from richer and more discriminative features. Two-second audio segments are converted into log-mel spectrograms, and a Cross-Attention fusion mechanism enables the network to selectively emphasize the most informative cues from each modality for different rainfall categories. Image-based data augmentation such as horizontal flipping further expands the training space and improves model generalization.

Compared with previous studies that relied on single-modality inputs or coarse categorical schemes, our framework achieves a substantially finer classification resolution (0–70 mm/hr in 5-mm/hr bins) and exhibits improved discrimination between adjacent intensity levels. The multimodal architecture consistently outperforms single-modality baselines, with the performance gains being particularly notable in the moderate-to-heavy rainfall range, where the model achieves higher classification accuracy, highlighting the benefits of true cross-modal complementarity. The integration of artificial and natural rainfall further produces a balanced and physically representative dataset that captures both controlled high-intensity scenarios and real-world variability. Overall, this study demonstrates the potential of multimodal sensing and deep learning to advance rainfall monitoring capabilities. The proposed non-contact, low-cost, and high-resolution approach offers a promising pathway for enhancing rainfall observation in regions with sparse gauge coverage, strengthening flood early warning systems, and supporting real-time hydrological applications under a changing climate.
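
A minimal sketch of the cross-attention fusion described above, assuming token embeddings already produced by an image backbone and a spectrogram encoder; the dimensions, depth, and 14-class head (5-mm/hr bins over 0–70 mm/hr) are illustrative, not the authors' architecture.

    import torch
    import torch.nn as nn

    class CrossAttentionFusion(nn.Module):
        def __init__(self, dim=256, heads=4, num_classes=14):
            super().__init__()
            # each modality queries the other for complementary cues
            self.img2aud = nn.MultiheadAttention(dim, heads, batch_first=True)
            self.aud2img = nn.MultiheadAttention(dim, heads, batch_first=True)
            self.head = nn.Linear(2 * dim, num_classes)

        def forward(self, img_tokens, aud_tokens):
            # img_tokens: (B, Ni, dim); aud_tokens: (B, Na, dim)
            img_ctx, _ = self.img2aud(img_tokens, aud_tokens, aud_tokens)
            aud_ctx, _ = self.aud2img(aud_tokens, img_tokens, img_tokens)
            fused = torch.cat([img_ctx.mean(1), aud_ctx.mean(1)], dim=-1)
            return self.head(fused)

    model = CrossAttentionFusion()
    logits = model(torch.randn(8, 49, 256), torch.randn(8, 32, 256))
    print(logits.shape)  # torch.Size([8, 14])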

How to cite: Lin, C.-C. and Ho, H.-C.: Cross-Attention Multimodal Learning Using Image and Audio for Rainfall Intensity Estimation, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-2660, https://doi.org/10.5194/egusphere-egu26-2660, 2026.

EGU26-10725 | ECS | PICO | HS6.7

On the optimisation of numerical weather prediction model configuration for improved flood forecasting 

Elena Leonarduzzi, Katrin Ehlert, David Leutwyler, and Massimiliano Zappa

Hydrological forecasts are essential for the timely and accurate prediction of flooding events, which are among the most impactful natural hazards for both infrastructure and human life in Europe and many other regions worldwide. Most existing flood warning systems are supported by hydrological models. Their accuracy depends not only on the representativeness and proper calibration (when required) of the model itself, but also on the quality of its inputs. While static inputs, particularly soil parameters, are highly uncertain, weather forecasts are arguably the most influential drivers.

In this study, we recreate the entire operational modelling framework used in Switzerland. Weather forecasts are provided by ICON (MeteoSwiss) and are used as input for WaSiM (FOEN), which produces streamflow predictions and issues warnings when necessary. We focus on several case studies, including selected catchments (e.g., Thur) and historical events that exceeded national flood warning levels (e.g., 30 May–2 June 2024).

This setup allows us to experiment with different configurations of the numerical weather prediction (NWP) model and to assess their downstream impacts on hydrological forecasts. We test different lead times to evaluate how early flood peaks can be detected, varying ensemble sizes to determine how many members are required to capture “extreme” flooding scenarios, and different spatial resolutions (500 m to 2 km) to assess the impact of resolving small-scale processes (e.g., convection).

Model performance is evaluated using classical hydrological metrics (NSE, KGE, RMSE, etc.), as well as more operationally relevant metrics for warning systems, such as whether thresholds are exceeded, how early exceedances occur, and their duration. Finally, we test different products for initializing model runs, either interpolated station-based products or NWP analysis products and assess the influence of the hydrological model itself through a sensitivity analysis of its parameters.

The results of this study will shed light on how NWP model configurations affect flood forecasting and, in turn, improve flood early warning design and decision-making.

How to cite: Leonarduzzi, E., Ehlert, K., Leutwyler, D., and Zappa, M.: On the optimisation of numerical weather prediction model configuration for improved flood forecasting, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-10725, https://doi.org/10.5194/egusphere-egu26-10725, 2026.

EGU26-15447 | ECS | PICO | HS6.7

Remote sensing–based urban floodplain mapping: the added value of UAV-LiDAR compared to global and GNSS-derived DEMs 

Eduardo Luceiro Santana, Laura Martins Bueno, Gabriel Souza da Paz, Rafael De Oliveira Alves, Tamara Leitzke Caldeira, Samuel Beskow, Aryane Araujo Rodrigues, Julio Cesar Angelo Borges, Denis Leal Teixeira, Gustavo Adolfo Karow Weber, and Diuliana Leandro

Flood risk management in urban floodplains strongly depends on the spatial resolution of digital elevation models (DEMs), which control floodplain connectivity, flow pathways, and surface storage. In many developing countries, flood-related studies rely predominantly on publicly available global DEM products, whose spatial resolution and vertical accuracy are often insufficient to represent subtle topographic gradients, densely vegetated floodplains, and complex urban microtopography. These limitations are particularly critical in low-relief environments, where small elevation differences exert a disproportionate control on inundation extent and flood dynamics. This issue has become increasingly evident in subtropical lowland regions of southern Brazil, where extreme flood events in 2023–2024 exposed shortcomings of commonly used global DEMs for urban floodplain applications. Therefore, the Piratini River watershed has been the focus of ongoing efforts to develop a real-time hydrological forecasting system to support decision-making during flood emergencies under data-scarce conditions. The urban areas of Pedro Osório and Cerrito along the main floodplain of the Piratini River constitute the core operational domain of this system and are recurrently affected by flooding. The watershed drains approximately 4,700 km² upstream of the municipalities and is characterized by low relief and wide floodplains. This study investigates the applicability of publicly available global DEMs and locally derived high-resolution elevation datasets for floodplain mapping and hydrological–hydrodynamic applications in these urban areas. A comparative assessment was conducted using two global DEM products - ALOS PALSAR (12.5 m) and ANADEM (30 m) - and three locally derived DEMs generated from high-resolution surveys. Local datasets include two Global Navigation Satellite System (GNSS) Real-Time Kinematic (RTK)–based surveys (static and kinematic) acquired with an Emlid Reach RS2+ receiver using real-time corrections via NTRIP (Networked Transport of RTCM via Internet Protocol), and an unmanned aerial vehicle (UAV)–based Light Detection and Ranging (LiDAR) survey acquired with a DJI Matrice 350 RTK platform equipped with a Zenmuse L2 sensor. The static GNSS survey comprised 2,921 points, while the kinematic survey yielded approximately 34,000 points at a 1-s sampling interval. The UAV–LiDAR survey covered 21.5 km² of the urban floodplain. Raw elevation data from local surveys were converted from ellipsoidal to orthometric heights using the hgeoHNOR2020 geoid model. GNSS-derived altitudes were interpolated using ordinary kriging in ArcGIS Pro. LiDAR data were processed in DJI Terra, resulting in a high-density point cloud (> 98 points m⁻²) and a terrain model with decimetric spatial resolution. Results reveal clear differences among datasets. Global DEMs show limited capability to represent floodplain connectivity and microtopography, particularly in vegetated areas. GNSS RTK–based DEMs provide intermediate performance but are constrained by survey logistics and GNSS signal degradation. In contrast, the UAV-based LiDAR DEM provides the most detailed and hydrologically meaningful representation of floodplain morphology, including vegetated and off-street areas, enabling improved delineation of flow paths and floodplain storage.
These findings highlight the critical role of high-resolution elevation data for floodplain mapping and hydrological–hydrodynamic analyses in low-relief urban environments, reinforcing UAV-based LiDAR as a key remote sensing tool for risk assessment and climate adaptation in data-scarce regions.
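
For readers without ArcGIS Pro, the ordinary-kriging interpolation of the GNSS spot heights can be reproduced with the open-source pykrige package; the survey points below are synthetic, not the Piratini data.

    import numpy as np
    from pykrige.ok import OrdinaryKriging

    rng = np.random.default_rng(42)
    x = rng.uniform(0, 500, 300)                    # easting (m), synthetic
    y = rng.uniform(0, 500, 300)                    # northing (m), synthetic
    z = 5.0 + 0.002 * x + rng.normal(0, 0.05, 300)  # orthometric heights (m)

    ok = OrdinaryKriging(x, y, z, variogram_model="spherical")
    gridx = np.arange(0.0, 500.0, 10.0)
    gridy = np.arange(0.0, 500.0, 10.0)
    dem, variance = ok.execute("grid", gridx, gridy)
    print(dem.shape)  # (50, 50) interpolated DEM cells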

How to cite: Luceiro Santana, E., Martins Bueno, L., Souza da Paz, G., De Oliveira Alves, R., Leitzke Caldeira, T., Beskow, S., Araujo Rodrigues, A., Angelo Borges, J. C., Leal Teixeira, D., Adolfo Karow Weber, G., and Leandro, D.: Remote sensing–based urban floodplain mapping: the added value of UAV-LiDAR compared to global and GNSS-derived DEMs, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-15447, https://doi.org/10.5194/egusphere-egu26-15447, 2026.

Flood risk assessment and capturing complex inundation dynamics increasingly rely on high-resolution Earth observation data and artificial intelligence (AI). This study presents an AI-driven geospatial framework for integrated flood susceptibility mapping and wet-season surface water persistence analysis. Flood susceptibility is quantified using machine-learning and deep-learning models trained on multi-source environmental predictors. A long-term satellite time series is analyzed to derive spatial metrics of surface water frequency and persistence.
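
The persistence metric can be as simple as the per-pixel fraction of valid observations classified as water, sketched below on a synthetic stack of binary masks; the real analysis would substitute the satellite-derived water and cloud masks.

    import numpy as np

    rng = np.random.default_rng(1)
    water = rng.random((24, 50, 50)) > 0.7       # 24 synthetic monthly masks
    valid = np.ones_like(water, dtype=bool)      # cloud/no-data mask placeholder

    persistence = (water & valid).sum(axis=0) / np.maximum(valid.sum(axis=0), 1)
    print(f"mean wet-season persistence: {persistence.mean():.2f}")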

Results demonstrate that integrating surface water persistence substantially enhances the interpretation of AI-based flood susceptibility maps. It provides added value for flood risk assessment and management compared to event-based mapping alone. The proposed framework contributes to next-generation flood risk monitoring by coupling remote sensing, AI, and temporal hydrologic information, and offers a transferable foundation for data-driven flood management and decision support under increasing climate variability.

 

How to cite: Golmohammadi, G. and Tziolas, N.: Integrating Flood Susceptibility and Surface Water Persistence Using Geospatial AI for Flood Risk Monitoring, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-15771, https://doi.org/10.5194/egusphere-egu26-15771, 2026.

Earthquakes can cause rapid changes in elevation and topographic relief, which, in turn, affect hydrologic regimes and modify flood risk in affected regions. The regulatory floodplain, an area of elevated flood hazard adjacent to water bodies, is critical for managing exposure and mitigating flood risk. Shifts in the distribution of flood risk in regions impacted by seismic activity constitute a compound hazard, and tools are needed to enable reevaluation of regulatory flood maps after seismic events, minimizing exposure of affected populations to additional flood risk. In the United States, regulatory floodplain mapping is primarily implemented by the Federal Emergency Management Agency (FEMA). These maps are based on hydraulic modeling and delineate areas with a 1% annual chance of flooding. The floodplain maps are not updated regularly by FEMA; revisions rely on manual, costly processes and do not consistently use current, high-resolution elevation data. These maps therefore struggle to reflect recent flood behavior, increasing flood risk and limiting the effectiveness of regulatory floodplain management. This study presents a rapid, satellite-integrated framework for updating regulatory flood maps in regions exposed to topographic shifts from earthquakes, using the 2019 Ridgecrest earthquake sequence in the North and South Fork Kern River basin, California, as a case study. Specifically, we used the U.S. Geological Survey 3DEP/NED 10-m resolution DEM, which represented the pre-earthquake topography, integrated with vertical displacement data derived from InSAR time series analysis to generate a corrected post-earthquake DEM. Both DEMs were then used in the HEC-RAS model to quantify changes in floodplain extent and inundation patterns under multiple return-period scenarios. To assess model performance and quantify the accuracy improvements in regulatory flood mapping, observed flood inundation maps derived from high-resolution PlanetScope satellite imagery were used in the validation. Our integrated approach demonstrates how InSAR-updated topography improves floodplain mapping accuracy and enables rapid updates to regulatory flood maps. HEC-RAS modeling results across three reaches along the North and South Fork Kern River consistently showed larger flood extents in post-earthquake simulations relative to pre-earthquake conditions. Validation using PlanetScope-derived flood inundation maps demonstrates improved model performance for the post-earthquake DEM, with an F-score of 84.52% compared to pre-earthquake simulations, using an optimal NDWI threshold of 0.35.
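
The validation step can be written out compactly: an NDWI water mask thresholded at 0.35, as in the study, compared against a simulated inundation mask via the F-score. The reflectance arrays here are random placeholders for the PlanetScope bands.

    import numpy as np

    def ndwi_mask(green, nir, threshold=0.35):
        """NDWI = (green - NIR) / (green + NIR); above threshold = water."""
        ndwi = (green - nir) / np.clip(green + nir, 1e-6, None)
        return ndwi > threshold

    def f_score(observed, simulated):
        tp = np.sum(observed & simulated)
        fp = np.sum(~observed & simulated)
        fn = np.sum(observed & ~simulated)
        return 2 * tp / (2 * tp + fp + fn)

    rng = np.random.default_rng(0)
    green, nir = rng.random((2, 100, 100))       # placeholder reflectances
    obs = ndwi_mask(green, nir)
    sim = obs.copy(); sim[:5] = ~sim[:5]         # stand-in for model output
    print(f"F-score: {f_score(obs, sim):.3f}")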

How to cite: Al-Amry, N. and Carter, E.: Assessing Fluvial Flood Risk Changes Using an Updated Digital Elevation Model Post-Earthquake: A Case Study, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-15883, https://doi.org/10.5194/egusphere-egu26-15883, 2026.

EGU26-16153 | ECS | PICO | HS6.7

Virtual Reality–Based Visualization of Urban Flood Dynamics Using SWMM 

Jiye Park, Minjeong Cho, Gihun Bang, Minhyuk Jeung, Daeun Yun, and Sang-Soo Baek

Urban flooding and water pollution have become increasingly severe challenges worldwide as a result of climate change and rapid urbanization, posing substantial risks to public safety, urban infrastructure, and environmental quality (Mark et al., 2004; Andrade et al., 2018). Intense rainfall events frequently exceed the capacity of urban drainage systems, leading to surface inundation and the transport of pollutants into receiving water bodies. To address these issues, numerical hydrological and hydraulic models have been widely applied to simulate urban runoff processes, sewer network performance, and water quality dynamics. Among these models, the Storm Water Management Model (SWMM) is one of the most commonly used tools for analyzing urban drainage systems and pollutant transport under various rainfall scenarios (Gironás et al., 2010). Despite its widespread adoption and robust modeling capabilities, SWMM primarily presents simulation outputs in the form of numerical tables and two-dimensional graphs. This conventional output format limits intuitive interpretation and restricts the ability to analyze spatial and temporal flood dynamics within complex urban environments (Zhang et al., 2016). This study proposes a virtual reality (VR)–based visualization framework that integrates SWMM simulation results with the Unity game engine to enhance the interpretability of urban flooding and water quality simulations. In the proposed framework, rainfall–runoff processes, inundation depth, and pollutant diffusion are first simulated using SWMM for a selected urban catchment. The resulting hydrological and hydraulic outputs are then converted into data formats compatible with the Unity environment. A three-dimensional urban model is constructed to represent surface topography and drainage infrastructure, enabling the visualization of flooding processes in a spatially explicit manner. Flood extent and water depth are visualized dynamically within the virtual environment, allowing users to observe flood propagation over time. In addition, pollutant transport is represented using color-based visualization techniques, where variations in color indicate changes in pollutant concentration. This approach provides an intuitive representation of water quality degradation during flood events. The VR system supports interactive exploration through the use of head-mounted displays and motion interfaces, enabling users to navigate the virtual urban space and examine flooding and pollution patterns from multiple perspectives. The immersive nature of the VR environment enhances spatial perception and facilitates a more comprehensive understanding of complex flood processes compared to traditional two-dimensional visualization methods. By allowing users to directly experience simulated flood scenarios, the proposed framework supports more effective interpretation of model results and improves communication of flood risk information. The results of this study demonstrate that VR-based visualization has significant potential as a decision-support tool for urban flood risk assessment, emergency response planning, and disaster management training.
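
One plausible bridge between SWMM and a Unity scene is to step through the simulation with the pyswmm wrapper and export node depths to a CSV that a C# script can replay frame by frame; the input file name and exchange format below are assumptions, as the abstract does not specify them.

    import csv
    from pyswmm import Simulation, Nodes

    # Dump a depth time series for every junction (hypothetical model file):
    with Simulation("urban_catchment.inp") as sim, \
         open("node_depths.csv", "w", newline="") as f:
        nodes = list(Nodes(sim))
        writer = csv.writer(f)
        writer.writerow(["time"] + [n.nodeid for n in nodes])
        for _ in sim:                             # advance one routing step
            writer.writerow([sim.current_time.isoformat()]
                            + [round(n.depth, 3) for n in nodes])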

How to cite: Park, J., Cho, M., Bang, G., Jeung, M., Yun, D., and Baek, S.-S.: Virtual Reality–Based Visualization of Urban Flood Dynamics Using SWMM, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-16153, https://doi.org/10.5194/egusphere-egu26-16153, 2026.

The accelerating impacts of climate change on urban environments, such as flooding risks, extreme heat, and heavy rain, necessitate rapid and integrated planning strategies. Urban Digital Twins (UDT) have emerged as valuable tools, offering the ability to dynamically model, simulate, and visualize complex processes to support data-driven decision-making. However, a comprehensive strategy for integrating the multitude of UDTs under development into climate adaptation measures, while ensuring interoperability, digital sovereignty, and stakeholder participation, is still lacking.

This contribution introduces the collaborative project LINKUDT (“Coordination and Collaboration Platforms for the Synergetic Conception, Development, Interoperability, and Digital Sovereignty of Urban Digital Twins”). Funded by the German Federal Ministry of Research, Technology and Space for a duration of 48 months, LINKUDT serves as the overarching companion research project for six regional real-world laboratories across Germany. The primary objective of the project is to establish UDTs as central instruments for speeding up urban planning processes to improve climate adaptation and sustainable urban development by identifying synergies and supporting interoperability.

A core challenge addressed by LINKUDT is the creation of interoperable and sustainable data infrastructures. Following the FAIR principles (Findable, Accessible, Interoperable, Reusable), the project aims at advancing standards that allow for the efficient integration of heterogeneous data sources, such as sensor networks and environmental models. To prevent vendor lock-in and ensure long-term data portability, LINKUDT emphasizes digital sovereignty through the use and further development of open-source software modules and standards (e.g., OGC API Processes, SensorThings API, CityGML).

Further key outcomes of LINKUDT include training modules for stakeholders (e.g., public administration, developers) and policy recommendations for the nationwide application of digital twin technologies.

By linking the National Research Data Infrastructure for Earth System Sciences (NFDI4Earth) with administrative data infrastructures (GDI-DE), LINKUDT creates a scalable model for evidence-based urban governance. 

With our contribution we aim to reach out to further digital twin initiatives related to climate change to initiate further exchange on interoperability, digital sovereignty and emerging technologies.

How to cite: Jirka, S., Radtke, J., and Reiß, J.: LINK Urban Digital Twinning (LINKUDT): Advancing Climate Adaptation and Planning Acceleration through Interoperable Digital Twin Ecosystems, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-17814, https://doi.org/10.5194/egusphere-egu26-17814, 2026.

Non-contact river monitoring is essential for understanding hydraulic phenomena and providing real-time disaster mitigation information during large-scale floods. Our previous research (Yorozuya et al., 2026) developed a method to inversely estimate riverbed elevation by integrating UAV-derived surface velocity (via PIV) and water surface geometry (via LiDAR) into a Physics-Informed Neural Networks (PINNs) framework using automatic differentiation of the governing equations. However, that approach relied on a uniform velocity correction factor across the entire reach, which led to significant underestimations of water depth in complex flow fields, such as those near spur dikes.

In this study, we propose an enhanced estimation algorithm that incorporates secondary flow effects into the momentum equations to improve bathymetric accuracy. Following the methodology of Iwasaki et al. (2013), we identify regions where surface velocity vectors exhibit curvature and account for the resultant increase in flow resistance. This approach aims to correctly identify water depth even in regions where surface velocities are low but hydraulic complexity is high.

Field experiments were conducted in a reach of the Kurobe River (bed slope ≈ 1/100, 20 m wide by 50 m long), characterized by a spur dike in the center of the domain. High-resolution water surface geometry and velocity fields were captured using a UAV-mounted LiDAR (DJI Zenmuse L2) and a photogrammetric camera (P1). These data were integrated into the PINNs loss functions, which were defined based on the continuity equation, the shallow water equations, and the conservation of discharge across cross-sections.
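
A heavily simplified sketch of how such loss terms can be assembled in PyTorch, reduced here to steady 1-D continuity and momentum with Manning friction; the actual formulation (2-D shallow water equations plus the secondary-flow resistance term) is more involved, and all parameter values are illustrative.

    import torch
    import torch.nn as nn

    g, n_man = 9.81, 0.03              # gravity; Manning's n (assumed)
    depth_net = nn.Sequential(nn.Linear(1, 64), nn.Tanh(),
                              nn.Linear(64, 64), nn.Tanh(), nn.Linear(64, 1))

    def physics_loss(x, u, dudx, detadx):
        """u, dudx, detadx: observed PIV velocity and finite-difference
        gradients of velocity and LiDAR water surface; depth h is learned."""
        h = torch.nn.functional.softplus(depth_net(x))     # keep depth > 0
        dhdx = torch.autograd.grad(h.sum(), x, create_graph=True)[0]
        continuity = u * dhdx + h * dudx                   # d(hu)/dx = 0
        momentum = (u * dudx + g * detadx
                    + g * n_man**2 * u * u.abs() / h.clamp(min=1e-3)**(4 / 3))
        return (continuity**2).mean() + (momentum**2).mean()

    x = torch.linspace(0.0, 50.0, 200).unsqueeze(1)
    u = 1.0 + 0.2 * torch.sin(x / 8)                       # synthetic PIV
    dudx = (0.2 / 8) * torch.cos(x / 8)
    detadx = torch.full_like(x, -0.01)                     # ~1/100 slope
    loss = physics_loss(x.requires_grad_(True), u, dudx, detadx)
    loss.backward()                                        # trains depth_net
    print(float(loss))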

The results demonstrated a marked improvement in estimation reliability, particularly in the separation zones downstream of the spur dike. Without secondary flow considerations, the model estimated near-zero water depth in large wake vortices due to the low surface velocities. By incorporating secondary flow effects, the model correctly evaluated the increased apparent roughness due to flow curvature, yielding deeper and more accurate bathymetry consistent with ground-truth data obtained by boat-mounted ADCP. This study highlights the potential of using only UAV-based remote sensing to achieve high-precision bathymetric inversion in morphologically complex river environments.

Iwasaki, T., Shimizu, Y., and Kimura, I. (2013). An influence of modeling of secondary flows to simulation of free bars in rivers. Journal of Japan Society of Civil Engineers, Ser. B1 (Hydraulic Engineering), Vol. 69, No. 3, 147–163.

Yorozuya et al. (2026). Seeing the unseen, RiverFlow2026 (under review).

How to cite: Yorozuya, A., Inaba, R., and Kudo, S.: Bathymetry Estimation in Complex River Morphology using UAV-based Remote Sensing and Physics-Informed Neural Networks Incorporating Secondary Flow Effects, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-17859, https://doi.org/10.5194/egusphere-egu26-17859, 2026.

LiDAR-derived digital elevation models (DEMs) are increasingly adopted in hydrodynamic flood modelling; however, their direct use, particularly in complex urban environments, remains problematic. Although LiDAR provides high-resolution surface information and supports the generation of bare-earth digital terrain models (DTMs), unresolved flow-permeable structures such as bridges, culverts, and elevated transport infrastructure, together with micro-scale urban features including narrow river channels, pathways, kerbs, and missing submerged channel bathymetry, systematically distort flow connectivity and channel conveyance. These deficiencies introduce structural biases into flood simulations, yet existing studies typically address individual features in isolation, limiting transferability and large-scale applicability.

This study reframes LiDAR DEM preprocessing as a process-based investigation into how unresolved terrain features bias flood hydraulics and introduces an automated, physically consistent terrain reconstruction framework that explicitly targets these bias mechanisms. The framework is implemented at the national scale using the 2 m LiDAR-derived DTM for England.

Three dominant sources of hydrodynamic bias are addressed. First, flow-permeable structures, including bridges, culverts, and elevated transport infrastructure, are systematically identified using observed water surface information and river network data, and the terrain beneath these structures is reconstructed using interpolation-based techniques to restore hydraulic connectivity. Second, impermeable urban features, such as buildings and kerbs, are selectively elevated while preserving longitudinal connectivity along roads and pathways, ensuring realistic overland flow routing. Third, submerged river bathymetry is reconstructed using empirical relationships between river width and water depth to recover channel conveyance absent from bare-earth DTMs.
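
The bathymetry step is commonly implemented by burning an empirical hydraulic-geometry depth, d = aW^b, into the bare-earth terrain along the channel mask; the coefficients below are illustrative placeholders rather than the values calibrated for England.

    import numpy as np

    A, B = 0.10, 0.80                   # illustrative power-law coefficients

    def burn_bathymetry(dtm, channel_mask, width_m):
        """Lower channel pixels by depth = A * width**B (all 2-D arrays)."""
        out = dtm.copy()
        depth = A * np.power(width_m, B)
        out[channel_mask] -= depth[channel_mask]
        return out

    dtm = np.full((4, 4), 10.0)                      # toy water-surface DTM
    mask = np.zeros((4, 4), dtype=bool); mask[1:3, :] = True
    width = np.full((4, 4), 25.0)                    # river width estimate (m)
    print(burn_bathymetry(dtm, mask, width))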

The resulting terrain dataset is directly applicable to hydrodynamic flood modelling without manual intervention. Sensitivity analyses across multiple historical flood events demonstrate that restoring flow connectivity and reconstructing channel bathymetry exert distinct and flow-regime-dependent controls on simulated flood extent, water levels, and discharge. In particular, unresolved flow-permeable structures predominantly govern urban inundation patterns, whereas missing bathymetry represents the primary source of error in channel hydraulics.

By systematically isolating and correcting key terrain-induced bias mechanisms, this study provides generalisable insights into the process sensitivity of catchment and urban flood models to DEM representation and offers a scalable pathway for improving large-scale flood simulations using LiDAR data.

How to cite: Chen, H., Tong, X., and Liang, Q.: Reconstructing Flow Connectivity and Channel Conveyance in LiDAR-Derived Terrain for National-Scale High-Resolution Flood Modelling, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-22261, https://doi.org/10.5194/egusphere-egu26-22261, 2026.

  The conventional hydrocarbon accumulation model in the Xihu Depression is predominantly characterized by “late-stage accumulation.” However, with advancing exploration, the potential occurrence of commercially significant early hydrocarbon charging events in the Paleogene Pinghu Formation has become a subject of considerable debate. This study examines the accumulation mechanisms of natural gas in the Pinghu Formation through an integrated approach incorporating scanning electron microscopy (SEM), systematic fluid inclusion analysis, and natural gas carbon isotope geochemistry, with a particular focus on the evolutionary patterns of authigenic illite within the reservoir.

  SEM observations reveal three distinct morphological types of authigenic illite in the Pinghu Formation reservoirs: honeycomb, bridge-like, and fibrous. The crystallization of these illite types is primarily governed by diagenetic temperature and pore fluid pH: honeycomb illite forms at low temperatures (60 to 110°C) via smectite transformation; bridge-like illite develops at 120 to 140°C in association with acidic dissolution of K-feldspar; and fibrous illite requires temperatures above 140°C and alkaline conditions for the illitization of kaolinite. A key anomaly contradicting conventional diagenetic sequences was identified: in the shallower and cooler Huagang Formation reservoirs, fibrous illite constitutes up to 76% of the illite assemblage, whereas in the deeper and presumably hotter Pinghu Formation reservoirs, honeycomb and bridge-like types dominate (collectively 65%), with markedly reduced overall abundance. This inverse distribution with depth is interpreted as evidence of early hydrocarbon charging during deep burial of the Pinghu Formation. The introduction of acidic hydrocarbons inhibited the transformation of kaolinite to fibrous illite, thereby preserving the earlier illite morphologies and providing direct mineralogical evidence for an early accumulation event during the Huagang Movement.

  Geological analysis further supports the coupling of key elements conducive to early accumulation: during the Huagang Movement, source rocks had reached burial depths sufficient for hydrocarbon generation (Ro ≥ 0.5%), providing a material basis for large-scale expulsion. Concurrently, the superposition of the Yuquan and Huagang movements facilitated the development of structural–lithologic traps. At this stage, the average porosity of the Pinghu Formation reservoirs was approximately 21%, not yet entering the tightening phase, providing high-quality reservoir space for early hydrocarbon filling and accumulation.

  Fluid geochemical data provide additional robust evidence: hydrocarbon inclusions exhibiting yellow fluorescence with homogenization temperatures peaking between 105 and 135°C record an early hydrocarbon charging event. Furthermore, the methane δ¹³C values of Pinghu Formation natural gas (–38‰ to –34‰) are significantly lighter than those of the overlying Huagang Formation (–34‰ to –29‰), consistent with an early-generated, low-maturity gas source, effectively distinguishing fluid origins between early and late accumulation phases.

  Based on the above research, an early accumulation model governed by the combined effects of “paleo-highs and high-quality reservoirs” is established for the Pinghu Formation. This provides a key predictive model for early-stage reservoir exploration in basins with similar geological conditions worldwide, thereby further expanding new exploration frontiers.

How to cite: Li, L. and chen, Z.: Evidence from Illite Crystal Evolution: Exposing the Early Phases and Patterns of Hydrocarbon Accumulation in the Pinghu Formation of the Xihu Depression in the East China Sea., EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-58, https://doi.org/10.5194/egusphere-egu26-58, 2026.

Rare-earth elements (REE) form indispensable components of daily life, as they are essential constituents of modern high-technology applications, including clean energy and high-tech electronics, and are ultimately needed to achieve the Sustainable Development Goals of the United Nations. With a growth rate of approximately 10–15% per year, the demand for REE has increased significantly. However, the production and supply chains of REEs are very limited, especially due to the rare occurrences and/or discoveries of REE-enriched deposits. This creates an alarming situation, since the REE industry is largely controlled by a small number of countries across the globe, with one holding the dominant position in both mining and processing. Consequently, there is increasing interest in REE exploration studies across the globe aimed at finding new potential sources.

Granitic pegmatites are considered important sources of rare metals, such as REEs, other high-field-strength elements (HFSE) such as U, Th, Y, Zr, Hf, Nb and Ta, and large-ion lithophile elements (LILE) such as Li, Rb, and Cs. Here, we report the occurrence of rare-metal granitic pegmatites associated with the alkaline granite complex of Munnar in the southern Indian shield. The mineralized pegmatites are intruded along and across the shear planes of granites. The pegmatites are composed of quartz, K-feldspar, plagioclase, biotite and muscovite. Several veins also contain magnetite, pyrite and pyrrhotite. They are characterized by high ΣREE contents ranging from 1318 to 7682 ppm (average 3992 ppm). The chondrite-normalized REE patterns of the pegmatites are characterized by a strong enrichment of LREE over HREE, with a (La/Yb)N ratio between 42 and 1000, and characteristic negative Eu anomalies. The ΣREE of the host granites ranges between 118 and 6502 ppm. The REE patterns of the pegmatites suggest that they formed from an LREE-enriched melt, generated possibly during shearing of the host granitic rock. During this process the incompatible REEs were concentrated in the melt, causing LREE enrichment, and the melt eventually intruded into the lower crust as granitic pegmatites. This indicates enhanced mobility of REE during alteration of the host granites. Thus, the study provides important insights into the sources and enrichment mechanisms of REEs in the parent rocks, as well as their remobilization during alteration processes forming ion-adsorption REE deposits in their weathered crusts.

How to cite: Chettootty, S., Sivankutty, R., and Vasundharan, K.: Rare earth element (REE) enriched granitic pegmatites associated with alkaline granite complex of southern India: Source characteristics, enrichment mechanisms, and insights into potential ion-adsorption REE deposits, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-790, https://doi.org/10.5194/egusphere-egu26-790, 2026.

EGU26-3352 | Orals | GD4.2

Numerical geodynamic modelling for natural H2 resource exploration 

Frank Zwaan, Anne C. Glerum, Sascha Brune, Dylan A. Vasey, John B. Naliboff, Gianreto Manatschal, and Eric C. Gaucher

A key challenge in the 21st century is the successful implementation of the energy transition, which hinges on the development of sustainable (energy) resources. In this context, hydrogen gas (H2) generated by natural processes is a promising source of clean energy. However, we urgently need to develop the concepts and exploration strategies required for this promise of natural H2 energy to become a reality.

The most likely mechanism of large-scale natural H2 generation in nature is the serpentinization of ultramafic mantle rocks during their chemical reaction with water. In order to predict the bulk serpentinization and natural H2 generation that may lead to the development of exploitable H2 deposits, we consider the following “recipe” for efficient serpentinization, which involves three main ingredients: (1) (fresh) mantle rocks that need to be at (2) optimal temperatures between ca. 200-350˚C (the serpentinization window), and (3) in contact with ample water for the reaction to take place. The serpentinization window can be expected at 8-12 kilometers below the Earth’s surface. However, mantle rocks are normally found at much greater depth; thus these rocks must be brought closer to the surface (exhumed) through geodynamic processes. Moreover, water needs to reach such depths along large faults or other structures that cut into the exhumed mantle. The challenge we are faced with is to understand where (and when) these ingredients may come together in nature, and how much natural H2 may be generated.
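
To make the depth estimate concrete, here is a minimal back-of-the-envelope sketch in Python, assuming a simple linear conductive geotherm (the gradient values are illustrative, not from the abstract):

```python
# Minimal sketch: depth range of the serpentinization window (200-350 degC)
# under an assumed linear conductive geotherm. Gradients are illustrative.
def window_depths(t_surface_c=10.0, gradient_c_per_km=30.0,
                  t_min_c=200.0, t_max_c=350.0):
    """Return (top, bottom) depth in km where t_min <= T <= t_max."""
    top = (t_min_c - t_surface_c) / gradient_c_per_km
    bottom = (t_max_c - t_surface_c) / gradient_c_per_km
    return top, bottom

for grad in (25.0, 30.0, 40.0):  # colder to warmer lithosphere
    top, bottom = window_depths(gradient_c_per_km=grad)
    # compare with the ca. 8-12 km window quoted in the abstract
    print(f"{grad:.0f} C/km -> window at {top:.1f}-{bottom:.1f} km depth")
```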

Numerical geodynamic modelling is an ideal means to tackle this issue since it allows us not only to test how mantle rocks can be exhumed, but also to trace the temperature conditions and potential water availability (Zwaan et al. 2025). By combining this information, we assess favorable settings and timing of bulk natural H2 generation in different geodynamic systems. Subsequently, we consider where the natural H2 could be exploited. The serpentinizing mantle source rocks at 8-12 km depth cannot be directly targeted. Ideally, the natural H2 would instead migrate and accumulate in sedimentary reservoir rocks at depths of only a couple of kilometers that are connected with the mantle source rocks via migration pathways (e.g., faults). Importantly, all key elements need to be in place for the system to work.

Our first-order modelling work and the development of natural H2 system concepts greatly help to direct natural H2 resource exploration efforts, for example in the Alps and Pyrenees. Moreover, substantial opportunity lies in refining both the geodynamic modelling and natural H2 system analysis, in field and laboratory testing of our H2 system concepts, and in extending such a “mineral system” modelling approach to other types of natural resources that are crucial to the energy transition.

Reference:

Zwaan, F., Brune, S., Glerum, A.C., Vasey, D.A., Naliboff, J.B., Manatschal, G., & Gaucher, E.C. 2025: Rift-inversion orogens are potential hot spots for natural H2 generation, Science Advances, 11, eadr3418. https://doi.org/10.1126/sciadv.adr3418

How to cite: Zwaan, F., Glerum, A. C., Brune, S., Vasey, D. A., Naliboff, J. B., Manatschal, G., and Gaucher, E. C.: Numerical geodynamic modelling for natural H2 resource exploration, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-3352, https://doi.org/10.5194/egusphere-egu26-3352, 2026.

EGU26-5509 | ECS | Orals | GD4.2

Numerical modeling of magma migration in lithospheric rocks 

Nima Hosseinian, Juan Carlos Afonso, Alberto García-González, and Sergio Zlotnik

Magma migration is a complex natural process that controls volcanism, the formation of many types of ore deposits, the development of geothermal reservoirs, and the thermal structure and long-term evolution of the lithosphere [1-3]. Because the dynamics of magma migration are difficult to observe directly, numerical simulations provide a powerful tool to investigate magmatic systems, the coupled physicochemical processes involved, and the range of spatial and temporal scales over which these processes operate.

In this study, we present a new multi-phase numerical framework to study magma migration within the Earth, with a particular emphasis on the mechanical interactions between melt and solid. The framework is based on multiphase flow in porous media and it incorporates realistic rheological descriptions of lithospheric rocks, including visco-elasto-viscoplastic behavior, damage, strain weakening and the generation of porosity due to plastic deformation. Interactions between the fluid (magma) and solid (host rock) phases are described via a set of equations derived from a formal phase-averaging framework. An arbitrary Lagrangian-Eulerian solver is used to discretize the equations and solve the fully-coupled system. The validity of the model, and its potential to study multi-scale magmatic systems, are demonstrated using well-known benchmark tests and targeted numerical experiments.
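
For orientation, the mass balances at the core of such two-phase (melt-solid) porous-flow formulations typically take the following McKenzie-type form; this is a generic sketch, not necessarily the exact system solved here:

```latex
\begin{align}
  \frac{\partial (\rho_f \phi)}{\partial t}
    + \nabla \cdot \left( \rho_f \phi \, \mathbf{v}_f \right) &= \Gamma, \\
  \frac{\partial \left( \rho_s (1-\phi) \right)}{\partial t}
    + \nabla \cdot \left( \rho_s (1-\phi) \, \mathbf{v}_s \right) &= -\Gamma, \\
  \phi \left( \mathbf{v}_f - \mathbf{v}_s \right)
    &= -\frac{k(\phi)}{\mu_f} \left( \nabla p_f - \rho_f \mathbf{g} \right),
\end{align}
```

where \phi is the porosity (melt fraction), \mathbf{v}_f and \mathbf{v}_s the fluid and solid velocities, \Gamma a melt-production rate, k(\phi) the porosity-dependent permeability, and \mu_f the melt viscosity.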

Keywords: Dynamics of lithosphere and mantle, Mechanics, Numerical modeling, Physics of magma, Plasticity

REFERENCES

  • [1] Keller, T., May, D. A., and Kaus, B. J. P., “Numerical modelling of magma dynamics coupled to tectonic deformation of lithosphere and crust,” Geophys. J. Int., Vol. 195, pp. 1406-1442, (2013).
  • [2] Li, Y., Pusok, A. E., Davis, T., May, D. A., and Katz, R. F., “Continuum approximation of dyking with a theory for poro-viscoelastic-viscoplastic deformation,” Geophys. J. Int., Vol. 234, pp. 2007-2031, (2023).
  • [3] Oliveira, B., Afonso, J. C., Zlotnik, S., and Díez, P., “Numerical modelling of multiphase multicomponent reactive transport in the Earth’s interior,” Geophys. J. Int., Vol. 212, pp. 345-388, (2018).

 

Acknowledgment

EarthSafe Doctoral Network has received funding from the European Union’s Horizon Europe research and innovation programme under the Marie Skłodowska-Curie grant agreement No. 101120556.

How to cite: Hosseinian, N., Afonso, J. C., García-González, A., and Zlotnik, S.: Numerical modeling of magma migration in lithospheric rocks, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-5509, https://doi.org/10.5194/egusphere-egu26-5509, 2026.

Eduardo Monsalve (a), Claudia Pavez-Orrego (b), Ángela Flores (a), Nicolás Barbosa (b), Eckner Chaljub (a), Rodrigo Palma-Behnke (a), Nikolai H. Gaukås (d), Didrik R. Småbråten (d), and Diana Comte (c)*.

  • a) Department of Electrical Engineering / Energy Center, Faculty of Mathematical and Physical Sciences, University of Chile, Santiago, Chile
  • b) Department of Applied Geosciences, Geophysics, SINTEF Industry, Trondheim, Norway
  • c) Advanced Mining Technology Center, Faculty of Mathematical and Physical Sciences, University of Chile, Santiago, Chile
  • d) Department of Sustainable Energy Technology, SINTEF Industry, Oslo, Norway

In a global context marked by increasing energy demand and growing constraints on the large-scale deployment of conventional renewable sources, the exploration of alternative energy pathways has become increasingly relevant. Within this framework, vibrational energy harvesting (VEH) has garnered attention due to its potential to exploit ambient energy sources that are typically overlooked, such as mechanical vibrations. In particular, seismic vibrations, both natural and anthropogenic, represent a persistent and spatially distributed energy resource in regions characterized by intense industrial activity and significant seismicity.

This study presents a systematic and replicable methodology for assessing the energy harvesting potential from real seismic vibrations, with a specific focus on high-vibration environments, such as mining areas and urban settings. The proposed framework aims to quantify both the theoretical potential of the vibrational resource, understood as the maximum energy available in the environment, and the technical potential, defined by the current capability of electromagnetic energy harvesters (EMEHs) to capture and convert this energy into usable electrical power.

The developed methodology consists of six main stages: (i) seismic data acquisition, (ii) signal preprocessing, (iii) event identification, (iv) event characterization and classification, (v) device selection, and (vi) dynamic simulation for harvested power estimation. Continuous seismic records are analyzed to detect and isolate energetically relevant events of both natural and anthropogenic origin, including earthquakes, microseisms, blasting activities, and vehicular traffic. These events are characterized in terms of amplitude, frequency content, and duration, providing objective criteria to evaluate their relevance for energy harvesting applications. Representative seismic excitations are subsequently used as non-stationary inputs to a dynamic model of an EMEH, enabling the estimation of the harvested power associated with each event type without parameter optimization. This approach allows for a direct comparison between different vibrational sources under realistic operating conditions and highlights the influence of site-specific factors such as local geology, proximity to vibration sources, and spectral characteristics of ground motion.

The application of the proposed framework to a mining environment in northern Chile reveals distinct, yet partially overlapping, ranges of harvestable power across different classes of seismic events. The results demonstrate a strong spatial dependence of the vibrational energy resource and emphasize the necessity of localized assessments when evaluating the feasibility and robustness of vibrational energy harvesting systems. This work contributes a methodological foundation for resource-oriented evaluation, providing quantitative insight into whether seismic vibrations can realistically support low-power applications such as autonomous sensors and monitoring systems.

How to cite: Monsalve, E.: Evaluating Seismic Vibrations as an Energy Resource in Mining and Urban Environments, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-5931, https://doi.org/10.5194/egusphere-egu26-5931, 2026.

EGU26-6024 | ECS | Orals | GD4.2

Toward Efficient Stokes Flow Simulations in Multi-Observable Thermo-Chemical Tomography Using Model Order Reduction 

Mustafa Ramadan, Federico Pichi, and Gianluigi Rozza

The prevalence of viscous-dominated regimes within the Earth’s interior gives rise to Stokes-like flow systems in numerous geodynamical applications. A prominent example is sublithospheric mantle convection, which constitutes the primary driving mechanism behind the evolution of dynamic topography. In this context, numerical simulations provide more physically consistent estimates of the Lithosphere–Asthenosphere Boundary (LAB) depth than those derived from first-order isostatic approximations [1].

However, the associated computational burden is exceptionally high, particularly when accounting for material nonlinearities. The challenge is further complicated when such simulations are embedded within a Markov Chain Monte Carlo (MCMC) framework that requires an exceptionally large number of evaluations [2], which limits their applicability to large-scale studies and underscores the need for novel and computationally efficient Reduced-Order Modeling (ROM) methodologies [3].

Results from linear Model Order Reduction (MOR) techniques indicate that the complexity of the problem surpasses the capabilities of projection-based ROMs designed to produce globally accurate solutions. This work introduces a localized, goal-oriented criterion to enhance linear reducibility and employs Neural Network (NN) surrogates to replace high-fidelity solver evaluations. These methodological advances jointly underpin the development of a hybrid offline–online reduction framework that efficiently reduces computational complexity while preserving the required levels of accuracy, enabling seamless model updates during parameter-space exploration.
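
As a concrete illustration of the offline stage, here is a minimal POD sketch in Python with synthetic snapshots; in the hybrid framework described above, a neural-network surrogate would then predict the reduced coefficients rather than projecting a known solution:

```python
import numpy as np

# Minimal POD sketch: build a reduced basis from high-fidelity snapshots,
# then reconstruct a field from its reduced coordinates. Snapshots here
# are a smooth synthetic parametric family, so singular values decay fast.
rng = np.random.default_rng(0)
n_dof, n_snap = 2000, 60
x = np.linspace(0.0, 1.0, n_dof)
mus = np.linspace(0.5, 2.0, n_snap)                      # "parameters"
S = np.column_stack([np.sin(mu * np.pi * x) * np.exp(-mu * x) for mu in mus])
S += 1e-4 * rng.standard_normal(S.shape)                 # small noise

U, s, _ = np.linalg.svd(S, full_matrices=False)
energy = np.cumsum(s**2) / np.sum(s**2)
r = int(np.searchsorted(energy, 0.9999)) + 1             # keep 99.99% energy
V = U[:, :r]                                             # POD basis (n_dof x r)

u_true = S[:, 30]                 # stand-in for an unseen high-fidelity field
coeffs = V.T @ u_true             # in the hybrid framework, a NN surrogate
u_rec = V @ coeffs                # would predict these reduced coefficients
print(r, np.linalg.norm(u_true - u_rec) / np.linalg.norm(u_true))
```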

 

REFERENCES

[1] Afonso, J. C., Rawlinson, N., Yang, Y., Schutt, D. L., Jones, A. G., Fullea, J., & Griffin, W. L. (2016). 3-D multiobservable probabilistic inversion for the compositional and thermal structure of the lithosphere and upper mantle: III. Thermochemical tomography in the Western-Central U.S. Journal of Geophysical Research: Solid Earth, 121(10), 7337–7370. https://doi.org/10.1002/2016jb013049

[2] Ortega-Gelabert, O., Zlotnik, S., Afonso, J. C., & Diez, P. (2020). Fast Stokes Flow Simulations for Geophysical-Geodynamic Inverse Problems and Sensitivity Analyses Based on Reduced Order Modeling. Journal of Geophysical Research: Solid Earth, 125(3). https://doi.org/10.1029/2019jb018314

[3] Hesthaven, J.S., Rozza, G., Stamm, B. (2015). Certified Reduced Basis Methods for Parametrized Partial Differential Equations. SpringerBriefs in Mathematics. Springer International Publishing AG, Cham. https://doi.org/10.1007/978-3-319-22470-1

How to cite: Ramadan, M., Pichi, F., and Rozza, G.: Toward Efficient Stokes Flow Simulations in Multi-Observable Thermo-Chemical Tomography Using Model Order Reduction, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-6024, https://doi.org/10.5194/egusphere-egu26-6024, 2026.

EGU26-8341 | ECS | Orals | GD4.2

Coupling Bayesian Inversion and Reduced-Order Modeling: Application to Lithosphere–Asthenosphere Boundary Estimation 

Mir Shahzaib, Pedro Díez, Sergio Zlotnik, Alba Muixí, and Macarena Amaya

Geophysical inverse problems are inherently ill-posed due to sparse, noisy, and indirect observations, making Uncertainty Quantification (UQ) a fundamental requirement for reliable subsurface characterization. Bayesian inversion provides a comprehensive probabilistic framework for inferring subsurface parameters by coherently combining prior knowledge with observational data through the likelihood function. However, the practical deployment of Bayesian methods in large-scale geophysical settings is often hampered by the prohibitive computational cost of repeated forward model evaluations. In this context, uncertainty is often not solely driven by observational noise; a substantial and sometimes dominant contribution arises from model error, resulting from simplified physical descriptions, numerical discretization, and uncertain boundary conditions. When these sources of uncertainty are neglected or inadequately represented, Bayesian inversions may yield biased posterior estimates and unrealistically narrow uncertainty bounds. These limitations are particularly acute in deep Earth applications, where complex rheologies, poorly constrained geometries, and computationally intensive forward models coexist.

A key challenge is the accurate delineation of the Lithosphere–Asthenosphere Boundary (LAB), which plays a central role in controlling mantle dynamics, lithospheric deformation, and deep geothermal processes. Despite the necessity of relying on Bayesian approaches to estimate the LAB and its associated uncertainties, the high computational cost of repeated evaluations of the forward solver makes this unfeasible within realistic time frames [1]. To address these limitations, this work investigates Reduced-Order Modeling (ROM) techniques to enable efficient Bayesian inversion of LAB geometry in geodynamical Stokes flow models. ROMs construct low-dimensional surrogates of high-fidelity solvers, allowing rapid forward simulations while preserving the dominant physical behavior of mantle flow. By integrating ROMs with Bayesian inference, the proposed framework enables effective and reliable UQ for LAB characterization.
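
The core idea, a cheap surrogate inside the sampler, can be sketched in a few lines. The toy forward model, data, and flat prior below are placeholders, not the actual Stokes/LAB setup:

```python
import numpy as np

# Random-walk Metropolis with a cheap surrogate forward model. The toy
# "rom_forward" stands in for a reduced-order Stokes solve; observations,
# noise level, and the flat prior are all illustrative.
rng = np.random.default_rng(1)

def rom_forward(theta):
    # placeholder for a ROM evaluation mapping parameters to predictions
    return np.array([np.sin(theta[0]) + theta[1], theta[0] * theta[1]])

d_obs = np.array([1.2, 0.3])      # synthetic observations
sigma = 0.1                       # assumed data noise

def log_post(theta):
    r = rom_forward(theta) - d_obs
    return -0.5 * np.sum((r / sigma) ** 2)   # Gaussian likelihood, flat prior

theta, lp = np.zeros(2), log_post(np.zeros(2))
samples = []
for _ in range(5000):
    prop = theta + 0.1 * rng.standard_normal(2)          # random-walk proposal
    lp_prop = log_post(prop)
    if np.log(rng.random()) < lp_prop - lp:              # Metropolis accept
        theta, lp = prop, lp_prop
    samples.append(theta.copy())
print(np.mean(samples, axis=0))   # posterior mean estimate
```
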
Keywords: Geophysical inverse problems; Bayesian inversion; Uncertainty Quantification; Reduced-Order Modeling; Lithosphere–Asthenosphere Boundary

Acknowledgement This research was conducted within the EarthSafe Doctoral Network and has received funding from the European Union’s Horizon Europe research and innovation programme under the Marie Sklodowska-Curie grant agreement No. 101120556.

References [1] Olga Ortega-Gelabert, Sergio Zlotnik, Juan Carlos Afonso, and Pedro Díez. Fast Stokes flow simulations for geophysical-geodynamic inverse problems and sensitivity analyses based on reduced order modeling. Journal of Geophysical Research: Solid Earth, 125(3):e2019JB018314, 2020.

How to cite: Shahzaib, M., Díez, P., Zlotnik, S., Muixí, A., and Amaya, M.: Coupling Bayesian Inversion and Reduced-Order Modeling: Application to Lithosphere–Asthenosphere Boundary Estimation, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-8341, https://doi.org/10.5194/egusphere-egu26-8341, 2026.

EGU26-9853 | ECS | Posters on site | GD4.2

Toward integrated geodynamic-petrological modelling: coupling ASPECT with thermodynamic calculations  

Arijit Chakraborty, Jeroen van Hunen, Andrew Valentine, Sergio Zlotnik, and Alberto García González

The concentration of critical minerals and metals occurs within 200 km of the transition between thick and thin lithosphere (Hoggard et al., 2020). Understanding the mechanisms behind this distribution requires characterizing a variety of deep Earth processes of different scales and nature. Among these processes, mantle melting is a critical initial step, controlling the compositions of early melts and the stability of cratonic lithosphere. These melting processes are governed by complex phase equilibria, which determine the proportions and compositions of mineral assemblages depending on pressure, temperature, and bulk composition.

We investigate computational strategies for coupling mantle convection codes such as ASPECT with thermodynamic equilibrium calculation tools like MAGEMin. While a direct coupling would provide accurate phase equilibria predictions, it comes at a significant computational cost for large-scale geodynamic models. Our research explores developing surrogate models using machine learning and neural network techniques to approximate these thermodynamic calculations more efficiently.
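
A hedged sketch of the surrogate idea follows; the architecture, inputs, and outputs are illustrative stand-ins for MAGEMin quantities, not the authors' configuration:

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

# Sketch of the surrogate idea: learn a fast map from (P, T, bulk
# composition) to equilibrium outputs (e.g., melt fraction, density).
# Training data here are synthetic stand-ins for MAGEMin calls.
rng = np.random.default_rng(0)
X = rng.uniform([0.5, 900, 0.0], [6.0, 1900, 1.0], size=(2000, 3))  # P(GPa), T(K), X
y = np.column_stack([
    np.clip((X[:, 1] - 1300) / 600, 0, 1) * (1 - X[:, 2]),  # toy melt fraction
    3300 - 50 * X[:, 2] + 10 * X[:, 0],                      # toy density (kg/m3)
])

model = MLPRegressor(hidden_layer_sizes=(64, 64), max_iter=1000, random_state=0)
model.fit(X, y)
# one cheap lookup, as would be called inside the convection code's loop:
print(model.predict([[3.0, 1600.0, 0.2]]))
```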

We present our preliminary research involving methodological approaches and discuss the computational trade-offs involved in different coupling strategies. A simplified geodynamic model demonstrates potential workflows for this approach. This research is a step towards a more integrated computational framework for a thermo-chemical geodynamic model, which will have important implications for modelling critical mineral formation in complex geodynamic settings. 

References:

  • Hoggard, Mark J., Karol Czarnota, Fred D. Richards, David L. Huston, A. Lynton Jaques, and Sia Ghelichkhan. “Global Distribution of Sediment-Hosted Metals Controlled by Craton Edge Stability.” Nature Geoscience 13, no. 7 (July 2020): 504–10. https://doi.org/10.1038/s41561-020-0593-2

How to cite: Chakraborty, A., van Hunen, J., Valentine, A., Zlotnik, S., and García González, A.: Toward integrated geodynamic-petrological modelling: coupling ASPECT with thermodynamic calculations , EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-9853, https://doi.org/10.5194/egusphere-egu26-9853, 2026.

EGU26-10570 | ECS | Orals | GD4.2

Reducing Computational Costs in 3D Magnetotelluric Simulations via Domain Decomposition and Reduced-Order Modeling 

Luis Tao, Sergio Zlotnik, Alba Muixí, Fabio Ivan Zyserman, Juan Carlos Afonso, and Pedro Diez

Three-dimensional (3D) Magnetotelluric (MT) probabilistic inversion remains rare in real-world applications because it requires solving the forward problem thousands to millions of times, often making the computational cost prohibitive. Since the total duration of an inversion is directly controlled by the performance of the forward solver, the high computational overhead of 3D MT modeling remains a significant challenge, particularly for large-scale problems requiring high mesh resolutions. To address the poor scaling of existing strategies, we introduce DD–POD, a hybrid framework that integrates Domain Decomposition (DD) with Proper Orthogonal Decomposition (POD). The DD formulation partitions the global problem into subdomains, bypassing the memory limitations of traditional direct solvers and enabling simulations with substantially finer discretizations. Implementing this distributed architecture alone yields simulations that are at least 50% faster than global full-order approaches. Building on this foundation, the integration of POD eliminates the need for repeated large-scale linear system solves within the iterative DD process, delivering total forward-solver speed-ups exceeding 90%. Benchmark experiments and a real-world case study demonstrate that DD–POD consistently outperforms standard global POD strategies in computational efficiency with an acceptable trade-off in numerical accuracy.
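
To illustrate the DD ingredient in isolation, here is a toy alternating Schwarz iteration for a 1D Poisson problem; in DD–POD each subdomain solve would additionally be POD-reduced, and this schematic sketch is not the MT formulation itself:

```python
import numpy as np

# Toy alternating Schwarz iteration for -u'' = 1 on (0,1), u(0)=u(1)=0,
# with two overlapping subdomains. Illustrates the DD idea only.
n = 101
x = np.linspace(0.0, 1.0, n)
h = x[1] - x[0]
u = np.zeros(n)
i_lo, i_hi = 45, 55                  # overlap region indices

def solve_sub(lo, hi, u_lo, u_hi):
    """Direct solve of -u'' = 1 on x[lo:hi+1] with Dirichlet data."""
    m = hi - lo - 1
    A = (np.diag(2.0 * np.ones(m)) - np.diag(np.ones(m - 1), 1)
         - np.diag(np.ones(m - 1), -1)) / h**2
    b = np.ones(m)
    b[0] += u_lo / h**2
    b[-1] += u_hi / h**2
    return np.linalg.solve(A, b)

for _ in range(30):                  # Schwarz sweeps, exchanging interface data
    u[1:i_hi] = solve_sub(0, i_hi, 0.0, u[i_hi])
    u[i_lo + 1:n - 1] = solve_sub(i_lo, n - 1, u[i_lo], 0.0)

u_exact = 0.5 * x * (1 - x)
print(np.max(np.abs(u - u_exact)))   # converges toward the global solution
```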

(This work was supported by the Marie Sklodowska-Curie Actions (Doctoral Network with Grant agreement No. 101120556))

How to cite: Tao, L., Zlotnik, S., Muixí, A., Zyserman, F. I., Afonso, J. C., and Diez, P.: Reducing Computational Costs in 3D Magnetotelluric Simulations via Domain Decomposition and Reduced-Order Modeling, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-10570, https://doi.org/10.5194/egusphere-egu26-10570, 2026.

EGU26-12436 | ECS | Orals | GD4.2

Adaptive parameterization in Bayesian inversions using transdimensional methods 

Arnau Dols, Macarena Amaya, Sergio Zlotnik, and Pedro Díez

Geothermal energy is a crucial component of the global transition to sustainable and green energy systems due to its renewable and long-term availability. In order to study potential resources, we need to describe the subsurface by solving inverse problems. The complexity and uncertainty of these problems require the use of probabilistic inversion approaches that repeatedly solve partial differential equations over a grid of parameters describing the subsurface domain. Frequently, the high dimensionality of the parameter space to be inferred implies prohibitive computational times and reduces the sensitivity of each parameter as the grid is refined. In this work, we implement and discuss adaptive parametrization strategies in Bayesian inversions. We model the thermal conductivity structure of 2D sections of the Earth's upper mantle and perform Markov chain Monte Carlo (MCMC) inversions to recover the thermal conductivity as a probability distribution based on the likelihood of the temperature measurements. To verify the solution, we first parametrize the physical properties of the subsurface domain on the same high-dimensional finite element grid. To determine the optimal metaparameters on the fly, we rely on adaptive MCMC techniques that accelerate convergence and reduce the risk of getting trapped in local minima. We then use a new parametrization based on the physical structure of the geological faults of the mantle that reduces the dimensionality of the problem. By relying on transdimensional sampling through reversible-jump MCMC, we treat the number of parameters as an unknown of the inversion: the algorithm is allowed to increase the number of parameters when the solutions found are not accurate enough and to decrease it when the accuracy of the solution is not significantly affected. Our results show that we recover the thermal conductivity structure both with and without adaptive parametrization, and that performance improves when using transdimensionality. Moreover, the proposed transdimensional inversion decreases or increases the number of parameters locally, thereby providing an efficient and robust method for addressing the often challenging lack of information on subsurface heterogeneity.
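
A deliberately simplified sketch of the birth/death mechanics on a 1D piecewise-constant profile follows. Values are born from the prior so most proposal terms cancel; a full reversible-jump implementation must carry the complete proposal and Jacobian bookkeeping:

```python
import numpy as np

# Simplified transdimensional (birth/death) MCMC on a 1D piecewise-constant
# conductivity-like profile. New values are drawn from the prior, so the
# acceptance ratio below is reduced to the likelihood ratio; a production
# reversible-jump sampler needs the full proposal/Jacobian terms.
rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 100)
truth = np.where(x < 0.6, 2.0, 4.0)
data = truth + 0.2 * rng.standard_normal(x.size)

def predict(pos, val):
    order = np.argsort(pos)
    idx = np.searchsorted(np.sort(pos), x, side="right") - 1
    return np.asarray(val)[order][idx]

def loglik(pos, val):
    r = predict(pos, val) - data
    return -0.5 * np.sum((r / 0.2) ** 2)

pos, val = [0.0], [3.0]                       # start with a single layer
ll = loglik(pos, val)
for _ in range(20000):
    move = rng.choice(["perturb", "birth", "death"])
    p, v = list(pos), list(val)
    if move == "perturb":
        i = rng.integers(len(v)); v[i] += 0.3 * rng.standard_normal()
    elif move == "birth" and len(p) < 10:
        p.append(rng.uniform(0, 1)); v.append(rng.uniform(1, 5))  # from prior
    elif move == "death" and len(p) > 1:
        i = rng.integers(1, len(p)); p.pop(i); v.pop(i)
    ll_new = loglik(p, v)
    if np.log(rng.random()) < ll_new - ll:    # simplified acceptance rule
        pos, val, ll = p, v, ll_new
print(len(pos), ll)                            # dimension adapts to the data
```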

Keywords: geothermal energy; Markov chain Monte Carlo; reversible jump MCMC; transdimensional inversion; adaptive parametrization; finite elements; Poisson equation.

How to cite: Dols, A., Amaya, M., Zlotnik, S., and Díez, P.: Adaptive parameterization in Bayesian inversions using transdimensional methods, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-12436, https://doi.org/10.5194/egusphere-egu26-12436, 2026.

EGU26-13256 | Orals | GD4.2

Non-Intrusive POD–RBF Reduced Order Modeling for Parametric and Transient Mantle Convection 

Qusain Haider, Niccolò Tonicello, Michele Girfoglio, and Gianluigi Rozza

Mantle convection plays a fundamental role in governing the thermal and dynamical evolution of terrestrial planets, yet its numerical simulation remains computationally expensive due to strong nonlinearities, high Rayleigh numbers, and the presence of thin thermal boundary layers. In this work, we present a non-intrusive reduced-order modeling (ROM) framework for two-dimensional mantle convection based on Proper Orthogonal Decomposition combined with Radial Basis Function interpolation (POD–RBF).

High-fidelity full-order model (FOM) simulations are first performed using a finite-volume discretization of the incompressible Boussinesq equations under the infinite-Prandtl-number approximation. The FOM is carefully validated across a wide range of Rayleigh numbers. Particular attention is devoted to high-Rayleigh-number regimes, where mesh refinement studies are conducted to improve accuracy and ensure reliable reference solutions.

The ROM is constructed from snapshot data of velocity and temperature fields. POD analysis reveals a rapid decay of singular values, indicating a low-dimensional structure of the solution manifold. The parametric dependence of the reduced coefficients is reconstructed using RBF interpolation, yielding a fully data-driven and non-intrusive ROM.

To rigorously assess predictive capability, the ROM is validated using test points excluded from the training dataset. Leave-One-Out cross-validation demonstrates that the ROM accurately predicts unseen solutions across the parameter space, with low relative L2 errors for both velocity and temperature fields. Field-level comparisons confirm that the dominant flow structures and thermal patterns are faithfully reproduced.

The framework is further extended to transient simulations, where both time and Rayleigh number are treated as parameters. This two-dimensional parametric unsteady ROM successfully captures time-dependent dynamics while providing significant computational speed-up. The proposed approach offers a robust and efficient tool for parametric mantle convection modeling and provides a solid basis for future extensions toward three-dimensional configurations and uncertainty quantification.
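
The online stage of the non-intrusive approach reduces to interpolating reduced coefficients over the parameter space, for example with SciPy's RBFInterpolator. All data below are synthetic placeholders:

```python
import numpy as np
from scipy.interpolate import RBFInterpolator

# Sketch of the non-intrusive POD-RBF step: interpolate reduced (POD)
# coefficients over the parameter space. Training data are synthetic.
rng = np.random.default_rng(0)
params = rng.uniform(1e4, 1e7, size=(40, 1))         # training Rayleigh numbers
coeffs = np.column_stack([np.log10(params[:, 0]),     # toy reduced coefficients
                          np.sqrt(params[:, 0]) * 1e-3])

rbf = RBFInterpolator(np.log10(params), coeffs, kernel="thin_plate_spline")
ra_test = np.array([[5e5]])                           # unseen Rayleigh number
print(rbf(np.log10(ra_test)))   # predicted reduced coefficients at ra_test
```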

How to cite: Haider, Q., Tonicello, N., Girfoglio, M., and Rozza, G.: Non-Intrusive POD–RBF Reduced Order Modeling for Parametric and Transient Mantle Convection, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-13256, https://doi.org/10.5194/egusphere-egu26-13256, 2026.

EGU26-13298 | ECS | Orals | GD4.2

Enabling Probabilistic Full Waveform Inversion in Multi-Observable Thermochemical Tomography through Reduced-Order Spectral Element Modeling 

Ali Jamasb, Juan-Carlos Afonso, Alberto Garcia Gonzalez, Gianluigi Rozza, Federico Pichi, Sergio Zlotnik, Mark van der Meijde, and Islam Fadel

Multi-Observable Thermochemical Tomography (MTT) is a simulation-driven, joint probabilistic inversion framework designed to estimate the thermochemical state of the Earth’s lithosphere by integrating geophysical datasets with complementary sensitivities. By jointly inverting observables such as gravity and geoid anomalies, surface heat flow, seismic dispersion, body-wave data, and magnetotelluric responses, MTT directly estimates primary thermodynamic variables, including temperature, pressure, and bulk composition, from which all secondary physical properties are derived through internally consistent thermodynamic models. This bottom-up approach provides physically-consistent constraints on lithospheric structure across regional to prospect scales.

Within this framework, MTT offers a powerful basis for characterizing lithospheric architecture and compositional domains that are commonly examined in mineral systems studies. In particular, MTT can help delineate major crustal- and lithospheric-scale structures, identify metasomatized/altered domains, and map thermochemical contrasts that serve as lithospheric-scale proxies commonly associated with specific classes of magmatic and hydrothermal mineral systems.

Despite recent advances incorporating ray-based seismic tomography solvers (Fomin, I., Afonso, J. C., Gorbatov, A., Salajegheh, F., Dave, R., Darbyshire, F. A., et al. (2026). Multi-observable thermochemical tomography: New advances and applications to the superior and North Australian cratons. Journal of Geophysical Research: Solid Earth, 131, e2025JB031939. https://doi.org/10.1029/2025JB031939), the integration of full-waveform seismic solvers within the MTT framework has not yet been achieved. Full-waveform inversion (FWI) offers enhanced sensitivity to both seismic velocity and density and the potential for improved spatial resolution relative to traditional tomography approaches. However, the computational cost of FWI remains prohibitive, particularly in probabilistic or ensemble-based inversion settings required for uncertainty quantification.

This contribution presents a computational strategy aimed at reducing the cost of full wavefield simulations to enable probabilistic seismic FWI within the MTT framework. We extend reduced-order modeling (ROM) techniques to the spectral element method (SEM), which is widely used for accurate time-domain seismic wave propagation in complex geological settings. Specifically, we consider projection (Galerkin)–based ROMs in which the SEM wavefield is approximated in a low-dimensional reduced basis constructed from representative high-fidelity solutions. While ROM approaches are well established for simpler formulations, their application to SEM-based elastic wave simulations remains challenging due to the method’s high dimensionality and complex operator structure. Beyond MTT, such reductions are also relevant to SEM-based workflows that require large numbers of forward simulations, including ground motion studies and FWI with many sources at regional-to-global scales.

We develop and test a reduced-order SEM formulation using synthetic benchmark models relevant to lithospheric-scale imaging. Results demonstrate computational speed-ups of up to two orders of magnitude relative to full SEM simulations, while retaining sufficient accuracy in simulated wavefields for inversion purposes. These results represent a first proof of concept toward incorporating probabilistic FWI into multi-observable thermochemical tomography and reducing a key computational barrier to uncertainty-aware, physics-based lithospheric imaging.
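
In outline, the projection ROM amounts to compressing the semi-discrete SEM system M u'' + K u = f onto a small basis. A minimal sketch with toy matrices (not an actual SEM discretization):

```python
import numpy as np

# Sketch of a projection (Galerkin) ROM for a semi-discrete wave equation
# M u'' + K u = f, as arises from SEM: project M, K, f onto a basis V and
# time-step the small system. Matrices and basis are toy stand-ins.
rng = np.random.default_rng(0)
n, r = 500, 12
K = (np.diag(2.0 * np.ones(n)) - np.diag(np.ones(n - 1), 1)
     - np.diag(np.ones(n - 1), -1))
M = np.eye(n)                          # SEM mass matrices are (block-)diagonal
V, _ = np.linalg.qr(rng.standard_normal((n, r)))   # stand-in for a POD basis

Kr, Mr = V.T @ K @ V, V.T @ M @ V      # offline projection: r x r operators
u, v = np.zeros(r), np.zeros(r)
f = V.T @ np.eye(n)[:, n // 2]         # point source projected onto the basis
dt = 0.1
for step in range(1000):               # symplectic Euler on the small system
    a = np.linalg.solve(Mr, f * np.sin(0.1 * step * dt) - Kr @ u)
    v += dt * a
    u += dt * v
print(np.linalg.norm(V @ u))           # lift back to the full field on demand
```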

How to cite: Jamasb, A., Afonso, J.-C., Garcia Gonzalez, A., Rozza, G., Pichi, F., Zlotnik, S., Meijde, M. V. D., and Fadel, I.: Enabling Probabilistic Full Waveform Inversion in Multi-Observable Thermochemical Tomography through Reduced-Order Spectral Element Modeling, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-13298, https://doi.org/10.5194/egusphere-egu26-13298, 2026.

EGU26-13600 | Posters on site | GD4.2

Hydro-mechanical parameter estimation in earthfill dams using reduced-order models 

Sergio Zlotnik, Jan Schrader, Jarin Beatrice, Alberto García González, and Alba Muixí

The identification of hydro-mechanical parameters governing earthfill dam behaviour under transient loading conditions is essential for reliable interpretation of monitoring data and predictive analysis. Although coupled flow–deformation models can represent these processes in detail, their direct use in inverse analyses is often prohibitive due to the large number of forward simulations required. This work addresses the efficient estimation of material parameters in earthfill dams by integrating a reduced-order formulation of the problem into an inverse strategy.

A transient, nonlinear hydro-mechanical model for unsaturated soils is considered in the context of a sensor-driven inverse problem, where piezometric measurements are used to constrain model parameters. Reduced-order models based on proper orthogonal decomposition (POD) are introduced to enable repeated model evaluations within the inversion procedure while retaining the key features of the hydro-mechanical response. The framework targets the estimation of relevant soil properties, such as hydraulic conductivity, water retention characteristics, and mechanical stiffness, and is illustrated using both synthetic observations and field piezometer data from the Glen Shira dam during rapid drawdown events.
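
The calibration loop can be pictured as a standard least-squares fit wrapped around a (reduced) forward model; the toy drawdown response and parameter names below are illustrative, not the actual dam model:

```python
import numpy as np
from scipy.optimize import least_squares

# Sketch of the sensor-driven calibration loop: fit soil parameters so a
# (reduced) forward model matches piezometer heads. The exponential
# drawdown response and parameter names are toy stand-ins.
t = np.linspace(0.0, 10.0, 50)                   # days since drawdown start

def forward(theta):
    k_sat, s_y = theta                           # conductivity- and storage-like
    return 20.0 * np.exp(-k_sat * t / s_y)       # toy piezometric head (m)

rng = np.random.default_rng(0)
h_obs = forward([0.8, 2.0]) + 0.1 * rng.standard_normal(t.size)

fit = least_squares(lambda th: forward(th) - h_obs, x0=[0.3, 1.0],
                    bounds=([1e-3, 1e-2], [5.0, 10.0]))
print(fit.x)   # recovered parameters; each forward() call is POD-cheap in practice
```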

REFERENCES

[1] Pinyol, N. M., Alonso, E. E., Olivella, S. (2008). Rapid drawdown in slopes and embankments. Water Resources Research, 44(5). doi: 10.1029/2007WR006525

[2] Nasika, C., Díez, P., Gerard, P., Massart, T.J., Zlotnik, S. (2022). Towards real time assessment of earthfill dams via Model Order Reduction. Finite Elements in Analysis & Design, 199: 103666. doi: 10.1016/j.finel.2021.103666

How to cite: Zlotnik, S., Schrader, J., Beatrice, J., García González, A., and Muixí, A.:  Hydro-mechanical parameter estimation in earthfill dams using reduced-order models, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-13600, https://doi.org/10.5194/egusphere-egu26-13600, 2026.

EGU26-13610 | ECS | Posters on site | GD4.2

Hyper-reduced POD formulation for the hydro-mechanical assessment of tailings dams 

Alba Muixí, Lluís Monforte, Alberto García-González, and Sergio Zlotnik

The reliable assessment of tailings dam response under transient hydro-mechanical loading is a key challenge for mining infrastructure safety and risk management. High-fidelity numerical models capable of representing coupled groundwater flow and deformation in partially saturated soils provide valuable insight into internal states of the dam, but their computational demands often limit their use in operational settings, such as scenario analysis or near–real-time monitoring.

We consider a transient, nonlinear hydro-mechanical finite element model describing groundwater flow in unsaturated soils and apply a proper orthogonal decomposition (POD)–based reduced-basis formulation to accelerate simulations. While POD effectively reduces the number of unknowns, the computational cost of assembling nonlinear operators remains tied to the full-order mesh dimension, limiting efficiency gains. To address this bottleneck, hyper-reduction techniques are investigated that construct reduced approximation spaces for the nonlinear terms themselves, with the goal of alleviating computational cost relative to standard full-order finite element simulations.
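
One common hyper-reduction choice in this setting is the discrete empirical interpolation method (DEIM); the abstract does not commit to a specific technique, so the following is a generic sketch with synthetic nonlinear-term snapshots:

```python
import numpy as np

# Minimal DEIM sketch: select interpolation indices from snapshots of a
# nonlinear term so it can be evaluated at a few entries and then lifted.
x = np.linspace(0.0, 1.0, 1000)
mus = np.linspace(1.0, 3.0, 30)
F = np.column_stack([np.exp(-40 * (x - 0.3 * mu) ** 2) + np.sin(mu * np.pi * x)
                     for mu in mus])            # smooth snapshot family

U, _, _ = np.linalg.svd(F, full_matrices=False)
m = 10
Um = U[:, :m]

idx = [int(np.argmax(np.abs(Um[:, 0])))]        # greedy DEIM index selection
for j in range(1, m):
    c = np.linalg.solve(Um[idx, :j], Um[idx, j])
    r = Um[:, j] - Um[:, :j] @ c                # residual of the next mode
    idx.append(int(np.argmax(np.abs(r))))

# Online: evaluate the nonlinear term only at `idx`, then lift:
P_inv = np.linalg.solve(Um[idx, :], np.eye(m))  # interpolation operator
f_full = F[:, 0]                                # stand-in for f(u) at a new state
f_deim = Um @ (P_inv @ f_full[idx])
print(np.linalg.norm(f_full - f_deim) / np.linalg.norm(f_full))
```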

How to cite: Muixí, A., Monforte, L., García-González, A., and Zlotnik, S.: Hyper-reduced POD formulation for the hydro-mechanical assessment of tailings dams, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-13610, https://doi.org/10.5194/egusphere-egu26-13610, 2026.

EGU26-15494 | ECS | Posters on site | GD4.2

Seismic Super-resolution Leveraging Machine Learning Techniques  

Mukthar Opeyemi Mahmud, Andrew P. Valentine, Anne K. Reinarz, and Jeroen van Hunen

Earth imaging is central to our ability to understand our planet and is important for the exploration of critical minerals, the detection of geothermal energy resources, the mitigation of natural hazards such as earthquakes, and the study of plate tectonics. As a result, there is a need for more precise images of the Earth’s interior. However, as this imaging process is ill-posed and lossy, the images obtained are inevitably a blurry version of the truth. This makes it challenging to robustly interpret results and draw inferences about geophysical systems.

 

Full waveform inversion (FWI) has been the state of the art for high-fidelity and physically consistent subsurface imaging; however, its computational expense has driven exploration into machine learning (ML) techniques. These data-driven ML techniques can perform seismic inversion, directly mapping seismic data to subsurface properties without executing the iterative physics modelling loop of FWI. While their success is highly dependent on the availability of comprehensive, high-quality training data, they have proven capable of delivering subsurface predictions orders of magnitude faster than traditional methods.

 

In our attempt to obtain physically consistent subsurface images while ensuring cheap inference, we will explore opportunities for ‘seismic super-resolution’: the generation of higher-resolution images by combining observed data with prior knowledge about likely structures and the physics of wave propagation. Our approach involves the combination of machine learning techniques for numerical upscaling and physics-informed neural networks, ensuring that the underlying laws of physics are embedded within the results.

 

In this presentation, we will highlight some of the challenges and opportunities in this approach and present some early results from numerical experiments.

How to cite: Mahmud, M. O., Valentine, A. P., Reinarz, A. K., and Hunen, J. V.: Seismic Super-resolution Leveraging Machine Learning Techniques , EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-15494, https://doi.org/10.5194/egusphere-egu26-15494, 2026.

In mineral exploration, on-site analytical techniques provide tools for real-time data acquisition, supporting informed decision-making. Portable instruments such as handheld X-ray fluorescence (pXRF) and short-wave infrared (SWIR) hyperspectral spectrometers enable rapid, non-destructive collection of geochemical and mineralogical information directly from drill cores. When effectively integrated and interpreted, these datasets offer powerful tools for advancing geological understanding and refining 3D models, ultimately improving vectoring toward mineralization and supporting more efficient, sustainable exploration.

Where traditional interpretation methods are often subjective and time-consuming, data-driven approaches, particularly machine learning, can identify patterns and correlations within large datasets, accelerating analysis. In this study, we propose a machine learning framework for fusing drill-core hyperspectral and geochemical point data to enhance geological modeling.

Methodologies were applied and tested at two gold target sites hosted by the Archean Ilomantsi Greenstone Belt in eastern Finland. The geology at the selected sites is dominated by visually homogeneous schistose metasediments exhibiting intense sericite–chlorite alteration. Hence, these target areas provide an ideal natural environment for evaluating machine-learning approaches aimed at refining lithological and lithogeochemical discrimination and alteration mineralogy interpretations. The data-fusion and predictive modeling approach has the potential to significantly extend data-driven geological models in 3D, enhancing geological understanding of the controls on the Au mineralization.

Lithogeochemical data were first partitioned into distinct compositional groups using the K-means clustering algorithm. The resulting cluster assignments served as training labels for a supervised learning framework aimed at linking geochemical classes to hyperspectral signatures. Selected SWIR spectral parameters corresponding to geochemical sampling points, together with their assigned labels, were used to train a Random Forest (RF) classifier. The trained model was applied to unclassified spectral data to infer lithogeochemical classes and produce a predictive model.
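
The two-stage workflow maps directly onto a few lines of scikit-learn; the arrays below are synthetic stand-ins for the pXRF and SWIR data:

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.ensemble import RandomForestClassifier

# Sketch of the two-stage workflow: unsupervised lithogeochemical clusters
# become training labels for a supervised spectral classifier.
rng = np.random.default_rng(0)
geochem = rng.standard_normal((300, 8))      # pXRF-derived element features
swir = geochem[:, :4] + 0.5 * rng.standard_normal((300, 4))  # co-located spectra

labels = KMeans(n_clusters=4, n_init=10, random_state=0).fit_predict(geochem)

rf = RandomForestClassifier(n_estimators=300, random_state=0)
rf.fit(swir, labels)                          # link spectra to geochem classes

swir_unlabelled = rng.standard_normal((1000, 4))   # spectra lacking geochemistry
pred = rf.predict(swir_unlabelled)                 # inferred lithogeochem class
proba = rf.predict_proba(swir_unlabelled)          # per-class probability map
print(pred[:10], proba.max(axis=1).mean())
```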

Despite the generally noisy nature of both pXRF and spectral point data, and the overall rather poor probability measures of the RF model (< 50% for most classes), a clear and spatially reasonable 3D model is produced. The along-strike continuation of the lithogeochemical stratigraphy provides a validation argument supporting the success of the predictive model beyond areas covered by both lithogeochemical and hyperspectral data.

This approach leverages existing drill holes in a fast and cost-efficient manner by utilizing portable data-acquisition technologies. Machine-learning-based integration of multi-sourced datasets is demonstrated to improve lithological/lithogeochemical discrimination and predict subsurface geological features. This aids in the delineation of drilling targets more accurately, supporting dynamic, data-driven decision-making in mineral exploration.

How to cite: Luolavirta, K. and Ojala, J.: Machine learning framework for the integration of drill-core hyperspectral and geochemical point data to enhance geological modeling, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-19052, https://doi.org/10.5194/egusphere-egu26-19052, 2026.

EGU26-364 | ECS | Orals | CR6.1

Beyond Teleconnections - Uncovering Stable Drivers of Antarctic Sea Ice Anomalies 

Nina Susann Öhlckers, Dirk Lorenz, and Monica Ionita

Antarctic Sea Ice (ASI) has experienced a sudden and drastic decline since 2016, following decades of gradual growth since the start of satellite observations. This sharp reversal has prompted suggestions that a regime shift has occurred. However, the underlying drivers remain uncertain due to complex atmosphere-ocean interactions and pronounced regional variability. While atmospheric circulation patterns and large-scale teleconnections influence ASI variability, their spatial aggregation limits their ability to explain regional changes. Recent studies point to an increasing role of ocean heat content, yet its contribution relative to atmospheric influences has not been quantified. In this study, we address this gap by developing a framework to identify stable, spatially coherent climate drivers of regional ASI. We first introduce a workflow combining correlation analysis with HDBSCAN clustering to detect global clusters that have persistent correlations with regional ASI and can serve as robust predictors. We then use these clusters as input features in a linear regression model combining atmospheric variables and ocean heat content to assess how well ASI variability can be reconstructed. Finally, we evaluate how the relative importance of atmospheric and oceanic drivers has changed before and after the extreme low-ice events beginning in 2016.
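
A compact sketch of the workflow's three steps (correlate, cluster, regress) with synthetic data; the correlation threshold and cluster settings are illustrative, not those of the study:

```python
import numpy as np
from sklearn.cluster import HDBSCAN
from sklearn.linear_model import LinearRegression

# Three-step sketch: correlate grid cells with a regional ASI index,
# cluster the strongly correlated cells spatially, regress on cluster
# means. A synthetic "driver" region replaces real SST/OHC fields.
rng = np.random.default_rng(0)
nt, ncell = 480, 500                         # months x grid cells
driver = rng.standard_normal(nt)             # shared ocean-heat signal
field = rng.standard_normal((nt, ncell))
field[:, :40] = 0.8 * driver[:, None] + 0.6 * field[:, :40]
asi = driver + 0.5 * rng.standard_normal(nt)        # toy regional ASI index

corr = np.array([np.corrcoef(field[:, j], asi)[0, 1] for j in range(ncell)])
strong = np.where(np.abs(corr) > 0.3)[0]            # persistently correlated cells

lonlat = rng.uniform(0.0, 360.0, size=(ncell, 2))   # stand-in coordinates
lonlat[:40] = rng.uniform(0.0, 20.0, size=(40, 2))  # driver cells co-located
labels = HDBSCAN(min_cluster_size=5).fit(lonlat[strong]).labels_

X = np.column_stack([field[:, strong[labels == k]].mean(axis=1)
                     for k in sorted(set(labels)) if k != -1])
print(LinearRegression().fit(X, asi).score(X, asi))  # reconstruction skill
```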

Our results demonstrate that (1) the proposed clustering framework reliably identifies physically meaningful driver regions, (2) a linear model using these drivers can successfully reproduce regional ASI variability, and (3) the contribution of ocean heat relative to atmospheric forcing varies markedly across regions.

How to cite: Öhlckers, N. S., Lorenz, D., and Ionita, M.: Beyond Teleconnections - Uncovering Stable Drivers of Antarctic Sea Ice Anomalies, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-364, https://doi.org/10.5194/egusphere-egu26-364, 2026.

EGU26-518 | ECS | Orals | CR6.1

Efficient Sea Ice Classification Built on Few-Shot Learning Framework 

Yang Li, Petteri Uotila, Chao Li, Matti Leppäranta, and Chongtai Peng

Deep learning (DL) methods have become a key technique for automatic sea ice type mapping from synthetic aperture radar (SAR) imagery, yet their deployment in operational sea ice charting is still hindered by scarce labelled data, limited adaptive feature extraction, and the lack of interactive mechanisms, which restrict model generalization, accuracy, and usability, especially in hard-to-classify scenes. To address these bottlenecks, we propose an efficient sea ice classification model, ESICM, targeting four ice types: open water (OW), young ice (YI), first-year ice (FYI), and multiyear ice (MYI), and enhance performance and practicality under label-scarce conditions through three key designs. First, we introduce a few-shot learning (FSL) framework to more effectively exploit limited labels and reduce the reliance of traditional supervised learning on large labelled datasets. Second, inspired by classical sea ice parameter retrieval algorithms, we design a lightweight channel multiply–divide convolution module (CMDM) that strengthens adaptive feature extraction with only ~190k parameters, thereby improving discrimination of multi-scale textures and sea ice types with subtle backscattering differences. Third, we incorporate an interactive mechanism based on the Segment Anything Model (SAM) and couple it with the FSL framework, allowing the classifier to be adjusted with minimal human intervention and thus improving operability in difficult SAR scenes. ESICM is trained on 512 scenes from the AI4Arctic sea ice challenge dataset and evaluated on 20 independent test scenes, achieving 91.73% overall accuracy (OA), 91.29% F1 score, 85.61% Cohen’s kappa, and 71.52% mean intersection over union (mIoU), outperforming comparative DL models by at least 1.35, 1.90, 2.54, and 2.53 percentage points on these metrics, respectively. In melting season scenes, particularly those dominated by MYI, ESICM’s F1 and IoU outperform the second-best model by 22.21% and 19.15%, respectively. Further cross-domain experiments demonstrate that, even when trained on only about one quarter of local scenes, ESICM still achieves the highest accuracy, demonstrating strong cross-regional generalization. Meanwhile, its interactive functionality enables users to refine classification results via prompts in hard-to-classify scenes, substantially enhancing classification performance. Overall, ESICM provides a lightweight, high-accuracy, and interactively adjustable DL solution for SAR-based sea ice classification under limited labelled data, offering robust technical support for polar navigation safety and sea ice environmental monitoring.

How to cite: Li, Y., Uotila, P., Li, C., Leppäranta, M., and Peng, C.: Efficient Sea Ice Classification Built on Few-Shot Learning Framework, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-518, https://doi.org/10.5194/egusphere-egu26-518, 2026.

EGU26-1379 | ECS | Orals | CR6.1

Mapping the Margins: Evaluating Accuracy and Ambiguity in Automated Rock Glacier Delineation 

Sunil Tamang, Shelley MacDonell, James Brasington, James Shulmeister, and Benjamin Aubrey Robson

Rock glaciers, the most visible surface expression of permafrost landforms, are found across glacial, periglacial and paraglacial environments. Accurate and consistent mapping of their extent is fundamental for advancing research in geomorphology, hydrology, ecology, geohazard assessment, permafrost dynamics, and climate studies. However, delineating their boundary remains challenging because rock glaciers often occur alongside or merge with other equifinal geomorphic landforms that are difficult to distinguish by their spectral identity in aerial or satellite imagery. Additionally, their boundaries are inherently ambiguous, evolving with changes in topographic and climatic factors. The widely used approach involving manual digitisation through visual interpretation of geomorphic features is time-consuming and subjective. Recent studies have explored deep learning as a means to automate and scale up rock glacier mapping, but existing studies remain limited in number and geographic scope, with minimal attention to evaluating discrepancies or uncertainties in mapped extents.

This study examines the use of a U-Net deep learning model for automated delineation of rock glacier extent, with particular emphasis on associated uncertainties. Using data from the Chile National Glacier Inventory for the Coquimbo region, we trained the model under two strategies: (1) differentiated training based on rock glacier types, in which one set of models was trained exclusively on landforms with clearly expressed geomorphological features of frontal slopes, lateral margins, and ridge-furrow structures, while another set incorporated all inventoried rock glaciers, including both well-expressed and subdued geomorphological features; and (2) different predictor combinations, comparing a configuration that used only RGB + NIR bands from Sentinel-2 or PlanetScope imagery with an expanded set that integrated these spectral bands with DEM derivatives and imagery-derived variables.

The highest-performing models from these strategies were then applied to an independent test area, and their outputs were compared against existing inventories to evaluate spatial consistency and assess potential mapping biases. By integrating an automated method with uncertainty assessment, this work contributes to the ongoing advancement of rock glacier detection and delineation methods and highlights the critical need to validate deep learning outputs. Such uncertainty quantification is essential for ensuring the robustness of mapped extents and for supporting applications that depend on accurate and reliable representations of landforms with inherently ambiguous and dynamic boundaries.

How to cite: Tamang, S., MacDonell, S., Brasington, J., Shulmeister, J., and Robson, B. A.: Mapping the Margins: Evaluating Accuracy and Ambiguity in Automated Rock Glacier Delineation, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-1379, https://doi.org/10.5194/egusphere-egu26-1379, 2026.

EGU26-3470 | ECS | Posters on site | CR6.1

Mapping “cold spots” of potential hidden alpine permafrost using semi-supervised machine learning 

Yaniv Goldschmidt, Jacopo Boaga, and Francesco Marra

Geophysical techniques have revealed frozen ground within relict periglacial landforms where the presence of ice had been excluded by traditional geomorphological and topographic approaches. These unexpected frozen bodies, referred to here as cold spots, suggest that permafrost can exist outside traditionally mapped permafrost zones. Under climate change, with retreating glaciers and increasing snow variability, subsurface ice in periglacial landforms becomes a potentially important but overlooked water resource. However, its spatial distribution and climatic controls remain poorly understood.

Here, we develop a methodology to identify cold spots. We focus on the Southern Alps and we assume that cold spots are related to micro-climatic and topographic conditions that allow permafrost to persist. We use a limited set of sites investigated by geophysical surveys, including confirmed cold spots and geomorphologically similar control sites without permafrost. We analyze topographic and climatic remote-sensing data to derive relevant features and examine their relation to cold spots. We then use these features in semi-supervised machine learning classification models to identify areas with conditions similar to known cold spots. The resulting maps highlight potential cold-spot locations targeted for forthcoming geophysical field investigations and provide a practical framework for improving the detection of hidden permafrost.
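
One possible realization of the semi-supervised step (the abstract does not name a specific algorithm) is self-training around a base classifier, for example with scikit-learn; the features and labels below are synthetic:

```python
import numpy as np
from sklearn.semi_supervised import SelfTrainingClassifier
from sklearn.ensemble import RandomForestClassifier

# Hypothetical semi-supervised setup: a handful of surveyed sites carry
# labels (1 = cold spot, 0 = control); all other mapped sites are
# unlabeled (-1). Features stand in for topo-climatic predictors.
rng = np.random.default_rng(0)
X = rng.standard_normal((2000, 6))           # topo-climatic features per site
y = -np.ones(2000, dtype=int)                # -1 marks unlabeled sites
known = rng.choice(2000, size=30, replace=False)
y[known] = (X[known, 0] + X[known, 1] > 0).astype(int)   # toy survey labels

clf = SelfTrainingClassifier(
    RandomForestClassifier(n_estimators=200, random_state=0),
    threshold=0.9)                           # only confident pseudo-labels
clf.fit(X, y)
scores = clf.predict_proba(X)[:, 1]          # cold-spot similarity map
print(np.sort(scores)[-5:])                  # top candidate sites
```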

How to cite: Goldschmidt, Y., Boaga, J., and Marra, F.: Mapping “cold spots” of potential hidden alpine permafrost using semi-supervised machine learning, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-3470, https://doi.org/10.5194/egusphere-egu26-3470, 2026.

EGU26-7803 | ECS | Orals | CR6.1

Self-supervised learning reduces labelling requirements for sea ice segmentation in Sentinel-1 SAR imagery  

Jacob Seston, William D. Harcourt, Georgios Leontidis, Brice Rea, Matteo Spagnolo, and Lauren McWhinnie

Monitoring Arctic sea ice variability is crucial for maritime safety. Synthetic Aperture Radar (SAR) imagery provides an effective means of achieving this through all-weather, day-and-night coverage of the Arctic. Navigation in the Canadian Arctic Archipelago currently relies on operational ice information services, including analyst-derived ice charts, satellite imagery, and ice routing products provided by national ice services. However, the development of machine-learning systems capable of automatically processing large volumes of satellite imagery and accurately identifying ice conditions is constrained by the need for extensive manually labelled datasets. To address this limitation, we developed a self-supervised learning (SSL) approach, which uses unlabelled data to learn general image representations. Specifically, we use Bootstrap Your Own Latent (BYOL), a non-contrastive SSL framework, to pretrain a UNet encoder on unlabelled dual-polarised Sentinel-1 Extra-Wide mode (EW) SAR scenes before fine-tuning with a small set of labelled images. We compare the BYOL-pretrained UNet (called UNet SSL in this study) to four baselines: a control UNet, a fully supervised UNet, a Random Forest classifier, and the Segment Anything Model (SAM). With only three labelled scenes, the BYOL-pretrained UNet achieved higher segmentation accuracy than the fully supervised model trained on seven images, more than twice the number of labelled scenes. The most significant gains occurred in Marginal Ice Zone (MIZ) scenes, where the BYOL-pretrained UNet achieved a Matthews Correlation Coefficient (MCC) of 0.2087, compared with 0.1685 for the fully supervised UNet trained on seven labelled scenes and 0.1449 for the control model trained on three scenes—representing an MCC increase of approximately 24% and 44%, respectively. These improvements were accompanied by a substantial reduction in false negatives and a marked increase in recall, indicating improved discrimination under low-contrast, fragmented floe conditions. Our findings demonstrate that SSL reduces annotation requirements for SAR-based sea ice segmentation, improving model generalisation in both consolidated and fragmented ice conditions. This approach offers a scalable solution to the labelling bottleneck in Arctic monitoring and highlights the potential of BYOL as a general pretraining strategy for SAR-based Earth observation image segmentation. 
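
For readers unfamiliar with BYOL, here is a minimal sketch of one training step; the tiny stand-in networks and augmentations below are not the authors' UNet encoder or pipeline:

```python
import copy
import torch
import torch.nn as nn

# Minimal BYOL-style step: an online encoder + predictor learn to match an
# EMA "target" encoder across two augmented views. Toy networks throughout.
def mlp(d_in, d_out=64):
    return nn.Sequential(nn.Linear(d_in, 128), nn.ReLU(), nn.Linear(128, d_out))

encoder = nn.Sequential(nn.Conv2d(2, 8, 3, 2, 1), nn.ReLU(),
                        nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(8, 64))
projector, predictor = mlp(64), mlp(64)
target_enc, target_proj = copy.deepcopy(encoder), copy.deepcopy(projector)
for p in list(target_enc.parameters()) + list(target_proj.parameters()):
    p.requires_grad = False                  # target is updated by EMA only

opt = torch.optim.Adam(list(encoder.parameters()) + list(projector.parameters())
                       + list(predictor.parameters()), lr=1e-3)

def loss_fn(p, z):                           # negative cosine similarity
    return 2 - 2 * nn.functional.cosine_similarity(p, z, dim=-1).mean()

x = torch.randn(16, 2, 64, 64)               # toy dual-pol SAR patch batch
v1 = x + 0.1 * torch.randn_like(x)           # "view" 1: noise augmentation
v2 = torch.flip(x, dims=[-1])                # "view" 2: horizontal flip

p1 = predictor(projector(encoder(v1)))
p2 = predictor(projector(encoder(v2)))
with torch.no_grad():
    z1, z2 = target_proj(target_enc(v1)), target_proj(target_enc(v2))
loss = loss_fn(p1, z2) + loss_fn(p2, z1)     # symmetric BYOL loss
opt.zero_grad(); loss.backward(); opt.step()

with torch.no_grad():                        # EMA update of the target network
    for q, k in zip(encoder.parameters(), target_enc.parameters()):
        k.mul_(0.99).add_(0.01 * q)
    for q, k in zip(projector.parameters(), target_proj.parameters()):
        k.mul_(0.99).add_(0.01 * q)
print(float(loss))
```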

How to cite: Seston, J., Harcourt, W. D., Leontidis, G., Rea, B., Spagnolo, M., and McWhinnie, L.: Self-supervised learning reduces labelling requirements for sea ice segmentation in Sentinel-1 SAR imagery , EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-7803, https://doi.org/10.5194/egusphere-egu26-7803, 2026.

EGU26-8748 | ECS | Posters on site | CR6.1

Seasonal Predictability of Antarctic Sea Ice based on Deep-learning Approach 

Gyeongmin Baek, Jiho Ko, Emilia Kyung Jin, and Jong-Seong Kug

There is a distinct difference in the behavior of the sea ice extent response to global warming between the Arctic and Antarctic; the former is decreasing while the latter had been increasing slightly until recently. However, satellite data show that Antarctic sea ice has been continuously decreasing since 2016 and reached its minimum in February 2023. The shrinking of Antarctic sea ice extent would have various impacts on the Earth system. Since ice is more reflective than liquid water, sea ice plays a significant role in maintaining the Earth’s energy balance. Therefore, it is crucial to accurately predict the future sea ice response. Here, we aim to predict the sea ice extent for the upcoming season using deep learning models, employing U-Net. Atmospheric and oceanic data related to sea ice, such as sea surface temperature and wind speed, were used as features, while the sea ice extent was set as the target. We trained and tested the models using data from the CESM2 Large Ensemble. We then obtained the final model by fine-tuning the model pre-trained on numerical model data with observational data. The performance of the models was compared using ACC and RMSE as evaluation metrics. Additionally, to assess the impact of each variable within the model, we replaced each variable with its climatological mean and observed the changes in the evaluation metrics to determine their importance. These research findings are anticipated to contribute significantly to predicting more accurate changes in Antarctic sea ice and understanding future Antarctic sea ice changes.
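
The climatology-replacement diagnostic described above can be sketched generically as follows; a toy model and synthetic arrays stand in for the trained U-Net and the CESM2/observational inputs:

```python
import numpy as np

# Variable-importance sketch: replace one input channel with its
# climatological mean and measure the degradation in skill.
rng = np.random.default_rng(0)
n, h, w, c = 120, 32, 32, 4             # samples, grid, input channels
X = rng.standard_normal((n, h, w, c))
y = X[..., 0] * 0.7 + X[..., 1] * 0.3   # toy target: channel 0 matters most

class ToyModel:                          # stand-in for the trained network
    def predict(self, X):
        return X[..., 0] * 0.7 + X[..., 1] * 0.3

model = ToyModel()

def rmse(a, b):
    return float(np.sqrt(np.mean((a - b) ** 2)))

base = rmse(model.predict(X), y)
for ch in range(c):
    Xc = X.copy()
    Xc[..., ch] = X[..., ch].mean(axis=0)   # climatological-mean replacement
    print(f"channel {ch}: RMSE {rmse(model.predict(Xc), y):.3f} (base {base:.3f})")
```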

 

How to cite: Baek, G., Ko, J., Jin, E. K., and Kug, J.-S.: Seasonal Predictability of Antarctic Sea Ice based on Deep-learning Approach, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-8748, https://doi.org/10.5194/egusphere-egu26-8748, 2026.

EGU26-9225 | ECS | Orals | CR6.1

A machine-learning-based super-resolution approach for dynamical ice-sheet modeling 

Sebastian Scher, Andy Aschwanden, Florina Schalamon, Andreas Trügler, and Jakob Abermann

Dynamical ice-sheet models are among the primary tools used to investigate the evolution of ice sheets. However, their computational cost increases rapidly with spatial resolution, often making long-term or ensemble simulations prohibitively expensive. Here, we investigate whether recent advances in machine-learning-based super-resolution techniques for spatiotemporal data can be leveraged to reduce these computational costs while retaining high-resolution information.

Using one pair of low- and high-resolution simulations of the Greenland ice sheet for the 20th century, generated with the PISM dynamical ice-sheet model, we train a machine-learning-based super-resolution model to learn the mapping from low- to high-resolution states. For subsequent simulations, computationally inexpensive low-resolution model runs are combined with the trained super-resolution model to reconstruct high-resolution fields. We evaluate this hybrid framework by assessing (1) whether the super-resolution model can accurately reproduce the spatial details of high-resolution simulations, and (2) whether it can mitigate deficiencies in the long-term trends produced by low-resolution models. Our results provide insight into the potential of machine-learning-based super-resolution as a cost-effective tool for high-resolution dynamical ice-sheet modeling.
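As a rough illustration of the low-to-high-resolution mapping, below is a minimal sub-pixel-convolution CNN in PyTorch; the architecture, channel counts, and scale factor are assumptions for illustration only, not the model used in this study.

```python
import torch
import torch.nn as nn

class SRNet(nn.Module):
    """Minimal CNN mapping a low-resolution field (e.g. ice thickness)
    to a field upsampled by `scale` using sub-pixel convolution."""
    def __init__(self, channels=1, hidden=64, scale=4):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(channels, hidden, 3, padding=1), nn.ReLU(),
            nn.Conv2d(hidden, hidden, 3, padding=1), nn.ReLU(),
            nn.Conv2d(hidden, channels * scale**2, 3, padding=1),
            nn.PixelShuffle(scale),  # rearranges channels into a finer grid
        )

    def forward(self, x):
        return self.net(x)

# Training pairs: low-resolution PISM states as input, high-resolution states
# as target, optimised with a pixel-wise loss such as MSE.
model = SRNet(scale=4)
loss_fn = nn.MSELoss()
```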

How to cite: Scher, S., Aschwanden, A., Schalamon, F., Trügler, A., and Abermann, J.: A machine-learning-based super-resolution approach for dynamical ice-sheet modeling, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-9225, https://doi.org/10.5194/egusphere-egu26-9225, 2026.

EGU26-10587 | ECS | Posters on site | CR6.1

Exploring machine-learning extrapolation of glacier elevation change in High Mountain Asia derived from ICESat-2 data 

Ying Huang, Lei Huang, and Tobias Bolch

Glacier elevation change is a fundamental measure for quantifying glacier mass balance and assessing glacier–climate interactions. Large-scale estimates are commonly derived either from satellite altimetry, which provides robust but spatially sparse measurements, or from digital elevation model (DEM) differencing, which enables spatially continuous mapping but is more sensitive to noise and bias in complex mountain terrain. Machine learning (ML) approaches have increasingly been used to bridge this gap by correcting or reconstructing elevation measurements using climate and topographic predictors. However, because ML-based prediction inherently involves extrapolation beyond directly sampled glaciers, its reliability across heterogeneous glacier systems such as those in High Mountain Asia (HMA) remains poorly constrained.

In this study, we explore the behaviour of ML-based glacier elevation change predictions trained with ICESat-2 elevation measurements combined with climate and terrain variables across multiple HMA subregions. ICESat-2 footprints provide dense elevation change observations over only a limited subset of glaciers within each subregion. We train subregion-specific XGBoost models and evaluate their performance in relation to glacier sampling characteristics, feature importance, and elevation-dependent behavior.
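A minimal sketch of such a subregion-specific model is given below, assuming tabular predictors and glacier-grouped cross-validation so that held-out glaciers mimic extrapolation to unsampled ones; the feature choice and hyperparameters are illustrative assumptions.

```python
import xgboost as xgb
from sklearn.model_selection import GroupKFold

# X: predictor array (e.g. elevation, slope, aspect, temperature, precipitation)
# y: ICESat-2 elevation change (dh); groups: glacier IDs, so each validation
# fold withholds entire glaciers rather than random footprints.
def fit_subregion_model(X, y, groups):
    model = xgb.XGBRegressor(n_estimators=500, max_depth=6,
                             learning_rate=0.05, subsample=0.8)
    cv = GroupKFold(n_splits=5)
    for train_idx, test_idx in cv.split(X, y, groups):
        model.fit(X[train_idx], y[train_idx])
        r2 = model.score(X[test_idx], y[test_idx])  # R^2 on held-out glaciers
        print(f"held-out glaciers R^2: {r2:.2f}")
    return model
```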

The results reveal pronounced regional contrasts despite comparable glacier size and sample coverage across regions. In the Karakoram, for example, ML-based extrapolation produces spatially coherent and elevation-dependent patterns of glacier elevation change, with predicted dh systematically decreasing from lower to higher elevations, consistent with expected glacier-scale behavior. These structured predictions are associated with robust model performance (R² ≈ 0.7). In contrast, in West Kunlun Shan, extrapolated elevation change fields are spatially uniform and weakly structured, showing little sensitivity to the applied climate and terrain predictors. These results indicate that the effectiveness of ML-based glacier elevation change modeling depends less on sample size or glacier extent alone than on the presence of stable and internally consistent response structures within glacier systems.

How to cite: Huang, Y., Huang, L., and Bolch, T.: Exploring machine-learning extrapolation of glacier elevation change in High Mountain Asia derived from ICESat-2 data, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-10587, https://doi.org/10.5194/egusphere-egu26-10587, 2026.

EGU26-12034 | ECS | Orals | CR6.1

Creating a high-resolution Northern Hemisphere daily SWE dataset (1980-2020) using Machine Learning 

Oriol Pomarol Moya, Derek Karssenberg, Walter W. Immerzeel, Philip Kraaijenbrink, Madlene Nussbaum, and Siamak Mehrkanoon

Snow water equivalent (SWE) is an important component of the global hydrological cycle, acting as a primary reservoir for seasonal water storage. Despite its relevance, only a few datasets are available that provide long-term daily SWE estimates at global scale. Even amongst the best gridded SWE products, the spatial resolution does not go beyond 10 km, a significant limitation considering the large spatial variability of snow. Furthermore, assimilation of snow observations in such products remains another key challenge. Machine learning (ML) models and their combination with process-based simulations, known as Hybrid Modelling, offer a promising alternative for producing detailed SWE predictions at large scales, given their high inference speed and adaptability to their training data. Hybrid ML models have already been used for SWE prediction over a small number of sites, improving on both pure ML approaches and advanced process-based snow models such as Crocus, but their applicability for long-term spatiotemporal modelling of snow at larger scales remains to be tested.

In this project, we trained an LSTM model using in-situ snow data from roughly 10,000 sites throughout the Northern Hemisphere with the aim of creating a 40-year gridded dataset of daily SWE at 1 km resolution. The model incorporates temperature, precipitation, and shortwave radiation as meteorological predictors, alongside a small set of topographic variables and a land cover classification. Preliminary results show a good fit to stations excluded from the training set, with an RMSE of 44 mm; the unequal spatial distribution of observation locations was accounted for by a weighting scheme. These findings suggest the suitability of this approach for extending coverage to ungauged regions across the Northern Hemisphere. The use of the ERA5-Land SWE product as a hybrid support promises further improvements in model performance.
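One way such a weighting scheme could look is sketched below: stations in densely observed areas are down-weighted in proportion to their local neighbour count. This is a hypothetical implementation for illustration, not necessarily the scheme used in this project.

```python
import numpy as np

def density_weights(lat, lon, radius_deg=1.0):
    """Inverse-density sample weights so that densely instrumented regions
    do not dominate the training loss (hypothetical sketch)."""
    n = len(lat)
    counts = np.empty(n)
    for i in range(n):
        d2 = (lat - lat[i])**2 + (lon - lon[i])**2
        counts[i] = np.sum(d2 < radius_deg**2)  # neighbours within the radius
    w = 1.0 / counts
    return w * n / w.sum()                      # normalise weights to mean 1
```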

Ultimately, this project aims to provide a finer-scale alternative to existing daily SWE products. By improving the spatial resolution to 1 km and incorporating available snow measurements, it contributes to a more refined view of seasonal snow storage across the Northern Hemisphere.

How to cite: Pomarol Moya, O., Karssenberg, D., Immerzeel, W. W., Kraaijenbrink, P., Nussbaum, M., and Mehrkanoon, S.: Creating a high-resolution Northern Hemisphere daily SWE dataset (1980-2020) using Machine Learning, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-12034, https://doi.org/10.5194/egusphere-egu26-12034, 2026.

EGU26-12303 | Posters on site | CR6.1

Hybrid training for robust emulator-based ice-thickness inversion 

Sebastian Rosier, Thomas Gregov, Brandon Finley, Guillaume Jouvet, and Andreas Vieli

Ice-flow inversions aim to infer unobserved controls on glacier and ice-sheet dynamics from limited, noisy surface data but are notoriously ill-posed: multiple parameter fields can reproduce the same observations, solutions are sensitive to priors/regularization, and model nonlinearity amplifies both data and structural errors. Here we target a particularly challenging variant — emulator-based inversion using a machine-learning surrogate for ice flow — where the forward operator is fast and differentiable but only an approximation of the governing physics. We focus on inverting for ice thickness, which remains poorly constrained for most glaciers yet strongly conditions driving stress, basal traction, and therefore hindcast skill and projection uncertainty.

We present emulator-based inversions with the Instructed Glacier Model (IGM), benchmarking against synthetic tests with known truth and contrasting performance with a full-physics ice-flow solver. IGM provides a PINN-based emulator trained by minimizing an energy functional representing the Blatter–Pattyn equations. This powerful approach has proven very successful for the forward problem but leads to an emulator that may need regular retraining to ensure an accurate solution. We show that this training approach can introduce surrogate error modes that distort gradients and create spurious minima, degrading the convergence and reliability of the gradient-based optimization used for the inverse problem. To address this, we introduce a hybrid training strategy that augments the physics loss with a data-misfit term against a large training set, with the aim of improving out-of-distribution generalization across glacier geometries. The resulting emulator yields more reliable recovery of unknown fields such as ice thickness and supports the fast, scalable inversions needed for ensemble modelling and robust uncertainty quantification.
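Schematically, the hybrid objective combines the physics energy with a data-misfit term. The sketch below, in PyTorch, is a minimal illustration under stated assumptions: `blatter_pattyn_energy` is a dummy stand-in for the variational energy, and the relative weighting `lam` is illustrative.

```python
import torch

def blatter_pattyn_energy(u, x):
    """Placeholder for the variational energy of the Blatter-Pattyn equations;
    in practice this is evaluated from the emulated velocity field and
    geometry. The expression here is a dummy stand-in, not the real physics."""
    return torch.mean(u**2)

def hybrid_loss(emulator, x, u_ref, lam=1.0):
    """Physics loss augmented with a misfit term against reference
    (full-physics) velocities u_ref, as described above."""
    u = emulator(x)
    return blatter_pattyn_energy(u, x) + lam * torch.mean((u - u_ref) ** 2)
```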

How to cite: Rosier, S., Gregov, T., Finley, B., Jouvet, G., and Vieli, A.: Hybrid training for robust emulator-based ice-thickness inversion, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-12303, https://doi.org/10.5194/egusphere-egu26-12303, 2026.

EGU26-13166 | ECS | Posters on site | CR6.1

A Random Forest and XGBoost analysis of the temporal and spatial variability of snow depth in the Zugspitze region based on terrain features, simulated energy balance and remote sensing 

Paul Schattan, Jakob Knieß, Simon Gascoin, Juilson Jubanski, Roberta Facchinetti, Carolin Rempfer, Karl-Friedrich Wetzel, Christian Voigt, Karsten Schulz, and Franziska Koch

Temporal and spatial patterns of snow depth are key predictors for variations in snow water equivalent and snow-hydrological processes. Observations of snow depth distribution are usually scarce either in space or in time. Automatic weather stations can measure snow depth continuously but only for one point with a very small footprint. Campaign based surveys, in contrast, cover larger areas but are limited in spatial coverage due to technical and logistical constraints.

The partial recurrence of snow depth patterns correlated with terrain features is well known. In this work a machine learning approach based on Random Forest and XGBoost is presented to analyze the temporal evolution of snow depth distributions in the Zugspitze region. Input features include elevation and derived terrain features like slope, aspect, curvature, topographic position index and Winstral wind shelter index. Furthermore, simulated energy balance sums and snow occurrence from optical remote sensing data are used. Snow depth data include terrestrial Lidar measurements and photogrammetric data based on airborne and spaceborne platforms including drones, airplanes and the Pléiades satellite constellation.
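A minimal sketch of the model comparison described above follows, assuming a feature table with the terrain, energy-balance, and snow-occurrence predictors listed; column names and hyperparameters are illustrative assumptions.

```python
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import cross_val_score
from xgboost import XGBRegressor

def compare_models(X, y):
    """Compare Random Forest and XGBoost on one campaign's snow depth
    samples; X holds predictors such as elevation, slope, aspect,
    curvature, TPI, Winstral shelter index, energy-balance sums and
    remotely sensed snow occurrence (names illustrative)."""
    models = {
        "random_forest": RandomForestRegressor(n_estimators=500, n_jobs=-1),
        "xgboost": XGBRegressor(n_estimators=500, learning_rate=0.05),
    }
    for name, model in models.items():
        r2 = cross_val_score(model, X, y, cv=5, scoring="r2").mean()
        print(f"{name}: mean cross-validated R^2 = {r2:.2f}")
```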

Interestingly, due to the specific topography of the area, featuring a karstic plateau surrounded by steep slopes, no clear elevational gradients were found. Historical information constitutes a useful feature for machine learning but explains only part of the variability, as actual snow depth distributions are altered by wind drift and energy balance. This is reflected by a moderate temporal transferability of the trained machine learning models. Within the study domain, campaign-specific machine learning models produce plausible results for areas with data gaps. While Random Forest and XGBoost produce similar results, differences between different sets of input features can be substantial. Meltout patterns based on remote sensing data can partly compensate for a lack of historical snow depth information.

Machine learning proves to be a suitable tool for closing spatial data gaps. The results also highlight the importance of a process-based choice of input features, as inter- and intraannual snow depth distributions differ even in a region with stable snow depth patterns.

How to cite: Schattan, P., Knieß, J., Gascoin, S., Jubanski, J., Facchinetti, R., Rempfer, C., Wetzel, K.-F., Voigt, C., Schulz, K., and Koch, F.: A Random Forest and XGBoost analysis of the temporal and spatial variability of snow depth in the Zugspitze region based on terrain features, simulated energy balance and remote sensing, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-13166, https://doi.org/10.5194/egusphere-egu26-13166, 2026.

EGU26-13443 | ECS | Orals | CR6.1 | Highlight

Icebergs as a Distributed Sensor Network: Iceberg Tracking in Time-Lapse Imagery for Fjord Circulation Analysis 

Marco Jaeger-Kufel, Anja Neumann, Andreas Vieli, Andrea Kneib-Walter, Ethan Welty, and Josefine Umlauft

Tidewater glaciers are critical gateways for global sea level rise, with their stability strongly influenced by complex fjord circulation patterns that control submarine melting. Direct observations of these dynamics with conventional oceanographic instruments remain challenging due to temporal or spatial constraints. However, the fjords themselves contain a distributed sensor network: icebergs. As passive tracers driven by currents, icebergs of different sizes respond to circulation at different depths due to their varying underlying drafts. Deriving quantitative circulation data from these tracers requires tracking hundreds to thousands of similar-looking icebergs simultaneously.

This work presents an automated multi-object-tracking framework that extracts dense, size-stratified velocity fields from time-lapse imagery, providing the observational foundation required to reconstruct depth-dependent circulation patterns within glacier fjords. We introduce a scale-adaptive object detection architecture based on Faster R-CNN that achieves 87.1% detection recall and successfully captures a large fraction of the iceberg population from only a sparse set of manual labels. To maintain persistent identities in dense scenes, we employ a multi-modal association strategy that combines Kalman-filtered motion priors with appearance similarity learned via Vision Transformers. Evaluated across diverse environmental conditions, the framework demonstrates high stability with 95.7% identity consistency (IDF1) at 2-minute time intervals and generalizes to unseen glaciers without retraining. By transforming time-lapse imagery into quantitative circulation records, this work provides a robust framework for monitoring the hidden ocean dynamics that drive glacier retreat.
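To make the multi-modal association step concrete, here is a minimal sketch combining a motion cost (distance between Kalman-predicted and detected positions) with an appearance cost (cosine distance between embeddings), solved as an assignment problem; the cost weighting and inputs are illustrative assumptions, not the study's implementation.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def associate(pred_boxes, det_boxes, pred_feats, det_feats, alpha=0.5):
    """Match existing tracks to new detections.

    pred_boxes/det_boxes: arrays (n, 4) whose first two columns are centres;
    pred_feats/det_feats: L2-normalised appearance embeddings."""
    d_motion = np.linalg.norm(pred_boxes[:, None, :2] - det_boxes[None, :, :2],
                              axis=-1)
    d_app = 1.0 - pred_feats @ det_feats.T        # cosine distance
    cost = alpha * d_motion / d_motion.max() + (1 - alpha) * d_app
    rows, cols = linear_sum_assignment(cost)      # optimal one-to-one matching
    return list(zip(rows, cols))
```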

How to cite: Jaeger-Kufel, M., Neumann, A., Vieli, A., Kneib-Walter, A., Welty, E., and Umlauft, J.: Icebergs as a Distributed Sensor Network: Iceberg Tracking in Time-Lapse Imagery for Fjord Circulation Analysis, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-13443, https://doi.org/10.5194/egusphere-egu26-13443, 2026.

EGU26-14626 | ECS | Orals | CR6.1

From seismic signals to calving drivers: Assessing twelve years of glacial earthquakes in Greenland using Random Forest models 

Selina Wetter, Anne Mangeney, Eléonore Stutzmann, Clément Hibert, and Stuart N. Lane

Quantifying iceberg calving is important for understanding ice mass loss of the Greenland Ice Sheet and its subsequent impact on sea level rise, and for refining calving laws that currently represent a major source of uncertainty in global climate models. These calving events, particularly those involving capsizing icebergs, exert time-varying forces on the edge of marine-terminating glaciers that produce distinct seismic signals known as glacial earthquakes.

By processing twelve years of continuous seismic data and employing a Random Forest classifier to distinguish these glacial earthquakes from tectonic events, we generated a comprehensive catalogue of 6263 previously undocumented glacial earthquakes occurring between 2013 and 2024. The detected events are located along the Greenland coast with surface wave magnitudes ranging from MSW 4.1 to 5.4. They cluster at nine major calving glaciers, though the vast majority of activity is concentrated at Sermeq Kujalleq (Jakobshavn Isbræ) and Helheim Gletsjer.

To investigate the driving mechanisms behind these events, we analysed the correlation between calving activity and various environmental variables, including glacier velocity, air temperature, sea ice fraction, sea surface temperature, and wind speed. We trained a second Random Forest model to predict monthly calving events and evaluated the relative importance of these environmental features, while applying statistical analyses to investigate correlations on a yearly basis where data points are limited. Our results indicate that the relationship between calving and the environment is highly complex and site-specific, as no single variable serves as a universal driver for all glaciers.

This complexity is further highlighted by scale-dependent correlations between calving events and environmental variables. For instance, while the glacier velocity shows a strong correlation with cumulative yearly calving at Sermeq Kujalleq, its importance diminishes on a monthly scale. Conversely, Helheim Gletsjer exhibits no clear yearly correlation with the glacier velocity, highlighting the site-specific nature of calving dynamics. We will present the spatio-temporal evolution of the detected events and discuss how these diverse environmental correlations quantify the varying sensitivity of individual glaciers to environmental forcing across different temporal scales.

How to cite: Wetter, S., Mangeney, A., Stutzmann, E., Hibert, C., and Lane, S. N.: From seismic signals to calving drivers: Assessing twelve years of glacial earthquakes in Greenland using Random Forest models, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-14626, https://doi.org/10.5194/egusphere-egu26-14626, 2026.

EGU26-15167 | Posters on site | CR6.1

Quantifying the uncertainty of regime shifts in the paleoclimate via physics-informed emulation and expert knowledge 

Dimitra Salmanidou, Lauren Gregoire, Brooke Snoll, Charli Frisby, Matt Graham, and Serge Guillas

The absence of data in the existing instrumental record significantly limits our ability to comprehend and forecast tipping points in the Greenland Ice Sheet (GrIS) and Subpolar Gyre (SPG). Multidirectional approaches are therefore required to capture the complexity of systemic changes and support future early warning efforts. In this study we discuss the ongoing work of the research project VERIFY: Out Of Sample Testing For Early Warning Systems Using Past Climate. We combine computational experiments with physics-informed emulation and insights from expert elicitation to better understand drivers of paleoclimate regime shifts in the GrIS. Employing uncertainty quantification methods, we make use of machine learning surrogate models to approximate the system's response. Surrogate models can accurately mimic input-output relationships of complex and computationally expensive models, providing the opportunity to produce large ensembles for fully exploring the range of plausible model inputs. The goal is to understand what drives the exceedance of critical thresholds through the integration of computational experiments, machine learning and current scientific knowledge.
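A common choice for such surrogates is a Gaussian process; below is a minimal sketch using scikit-learn, where the inputs and output (e.g. a scalar indicator of a GrIS regime shift) are illustrative assumptions rather than the project's actual setup.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, ConstantKernel

# X: ensemble of model input parameters (n_runs, n_params)
# y: scalar quantity of interest from each expensive simulation
def fit_surrogate(X, y):
    kernel = ConstantKernel(1.0) * RBF(length_scale=np.ones(X.shape[1]))
    gp = GaussianProcessRegressor(kernel=kernel, normalize_y=True)
    gp.fit(X, y)
    return gp

# The cheap surrogate can then be evaluated on a dense input ensemble:
# mean, std = gp.predict(X_large_ensemble, return_std=True)
```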

How to cite: Salmanidou, D., Gregoire, L., Snoll, B., Frisby, C., Graham, M., and Guillas, S.: Quantifying the uncertainty of regime shifts in the paleoclimate via physics-informed emulation and expert knowledge, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-15167, https://doi.org/10.5194/egusphere-egu26-15167, 2026.

EGU26-16321 | ECS | Posters on site | CR6.1

A Lightweight Hybrid CNN–Transformer Architecture for High‑Resolution Downscaling and Bias Correction of Snow Water Equivalent 

Mubashshir Ali, Farid Ait-Chaalal, Siddharth Kumar, and Alison Dobbin

Accurate, high‑resolution Snow Water Equivalent (SWE) information is critical for reliable hazard assessment and effective water resource management. Yet, widely used global reanalysis products provide SWE at coarse spatial scales and exhibit substantial terrain‑ and melt‑related biases. In contrast, dynamically downscaled products offer improved detail but are costly to run and thus remain limited in availability.

To address this limitation, we introduce the Linear Attention Snow Downscaling Model (LASDM), a lightweight hybrid deep learning architecture designed specifically to enhance the spatial detail and physical realism of SWE fields. LASDM combines convolutional neural networks with linear-attention transformer blocks, enabling efficient representation of synoptic‑to‑local snow processes while remaining highly parameter‑efficient (<1 million parameters).
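For context, linear attention replaces the softmax with a kernel feature map so that attention can be computed in O(N) rather than O(N²) in the number of tokens. The PyTorch sketch below follows the formulation of Katharopoulos et al. (2020) and is a generic illustration under that assumption, not LASDM's actual block.

```python
import torch
import torch.nn as nn

class LinearAttention(nn.Module):
    """Linear attention: softmax is replaced by phi(x) = elu(x) + 1, so
    attention becomes phi(Q) (phi(K)^T V), computed in O(N)."""
    def __init__(self, dim, heads=4):
        super().__init__()
        self.heads = heads
        self.qkv = nn.Linear(dim, dim * 3)
        self.out = nn.Linear(dim, dim)

    def forward(self, x):                          # x: (batch, tokens, dim)
        b, n, d = x.shape
        q, k, v = self.qkv(x).chunk(3, dim=-1)
        q, k, v = (t.reshape(b, n, self.heads, -1).transpose(1, 2)
                   for t in (q, k, v))             # (b, heads, n, d_head)
        q = torch.nn.functional.elu(q) + 1
        k = torch.nn.functional.elu(k) + 1
        kv = torch.einsum("bhnd,bhne->bhde", k, v)            # sum over tokens first
        z = 1.0 / (torch.einsum("bhnd,bhd->bhn", q, k.sum(dim=2)) + 1e-6)
        out = torch.einsum("bhnd,bhde,bhn->bhne", q, kv, z)
        return self.out(out.transpose(1, 2).reshape(b, n, d))
```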

Applied to the ERA5 → ERA5‑Land downscaling problem over the Great Lakes region (1980–2022), LASDM demonstrates stronger performance than U‑Net, Swin Transformer, and statistical baselines across a range of evaluation metrics. Case studies for two winter storms provide additional context for these differences. More broadly, this work suggests the potential of machine‑learning architectures for downscaling and bias correction. LASDM offers a compact and adaptable framework that may help improve snow representation and support applications that rely on higher‑resolution SWE.

How to cite: Ali, M., Ait-Chaalal, F., Kumar, S., and Dobbin, A.: A Lightweight Hybrid CNN–Transformer Architecture for High‑Resolution Downscaling and Bias Correction of Snow Water Equivalent, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-16321, https://doi.org/10.5194/egusphere-egu26-16321, 2026.

EGU26-16558 | ECS | Posters on site | CR6.1

Enhancing sea ice concentration prediction with multi-task learning and conditional residual refinement 

Woohyeok Kim, Inchae Chung, Ga-ryung Lee, Minki Choo, and Jungho Im

Sea ice covers the polar oceans and is closely tied to the exchange of heat between the Sun and the Earth; the absolute amount of sea ice is itself an indicator of climate change. Research on predicting sea ice concentration (SIC), the fraction of ocean area covered by sea ice, is therefore being actively conducted to prepare for future climate conditions.

On long time scales, sea ice is influenced by sea surface temperature (SST), 2 m air temperature (t2m), wind fields, and other variables, and machine learning and deep learning techniques are used to predict SIC in order to leverage the correlations among these variables. However, deep learning models are difficult to interpret, which limits our ability to identify how much each variable influences the SIC prediction results.

This study simultaneously predicts SIC, t2m, and SST through a multitask Transformer model, and the predicted t2m and SST are converted into a gate intensity map to correct the bias of SIC. Through this, we interpreted how atmospheric and oceanic environmental factors affected the SIC prediction results. In addition, by comparing the prediction results of SIC and environmental factors under conditions such as specific seasons and regions, where prediction is relatively unstable, we quantified the variable-specific weights under those conditions.
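A minimal sketch of such a gated residual correction is given below, assuming PyTorch; the layer sizes and the exact way the gate modulates the residual are illustrative assumptions, not this study's architecture.

```python
import torch
import torch.nn as nn

class GatedSICCorrection(nn.Module):
    """Predicted t2m and SST fields are turned into a gate intensity map
    in [0, 1] that modulates a learned residual correction of the SIC
    prediction (illustrative sketch)."""
    def __init__(self, hidden=32):
        super().__init__()
        self.gate = nn.Sequential(
            nn.Conv2d(2, hidden, 3, padding=1), nn.ReLU(),
            nn.Conv2d(hidden, 1, 3, padding=1), nn.Sigmoid())
        self.residual = nn.Conv2d(1, 1, 3, padding=1)

    def forward(self, sic_pred, t2m_pred, sst_pred):
        g = self.gate(torch.cat([t2m_pred, sst_pred], dim=1))  # gate map
        sic = sic_pred + g * self.residual(sic_pred)            # gated bias fix
        return sic.clamp(0.0, 1.0), g   # the gate doubles as an uncertainty map
```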

The gate intensity map used for SIC bias correction can itself be used as an uncertainty map, and expresses, as a spatial distribution, regions that are difficult for the deep learning model to predict. In addition, by comparing the impacts of each environmental factor by lead time, the contributions of variables can be identified at long-term and short-term prediction time points.

How to cite: Kim, W., Chung, I., Lee, G., Choo, M., and Im, J.: Enhancing sea ice concentration prediction with multi-task learning and conditional residual refinement, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-16558, https://doi.org/10.5194/egusphere-egu26-16558, 2026.

EGU26-18020 | ECS | Posters on site | CR6.1

Global-scale modelling of mountain glacier evolution since the last interglacial 

Sjur Barndon, Augusto C. Lima, David M. Chandler, Abe Theodorus Wiersma, Eline Sterre Rentier, Raúl Prats Prats, and Suzette G.A Flantua

For most glaciated areas, detailed mountain glacier evolution since the last interglacial is largely unknown. Due to limitations of traditional numerical modelling, previous studies have typically operated at a coarse spatial resolution, limited study area size, or focused on major climate events. Here we address these limitations by applying the Instructed Glacier Model (IGM), a deep-learning ice-flow emulator enabling efficient GPU-accelerated transient simulations. We model mountain glacier evolution from 130 ka to the present day at 500 m resolution across eight broad mountain ranges in North America, South America, Eurasia, and Africa. Paleoclimate variables are approximated and regionalised using a combination of global and regional climate proxy datasets. We perform 617 parameter-calibration experiments varying paleoclimate and ice-dynamic parameters, with an average runtime of 21 hours per experiment. Model performance is assessed using combined areal and volumetric validation at two known glacial states: the Last Glacial Maximum and the present day. We also introduce a reproducible probabilistic model-evaluation framework combining a confusion-matrix validation score and simulation rank to identify sets of acceptable model parameters rather than a single best-fit solution. Our results show that IGM can model realistic ice-flow patterns, glacier geometries, and transient evolution across full glacial-interglacial cycles, demonstrating that machine-learning models of ice dynamics generalise to new domains and conditions, although performance can decline at coarser spatial resolutions. Together, these results demonstrate the feasibility of global-scale, high-resolution, transient glacier modelling over orbital timescales using a deep-learning instructed model, while providing a 100-year interval dataset including glacier extent, ice thickness, and ice flow patterns for the last glacial cycle.

How to cite: Barndon, S., Lima, A. C., Chandler, D. M., Wiersma, A. T., Rentier, E. S., Prats, R. P., and Flantua, S. G. A.: Global-scale modelling of mountain glacier evolution since the last interglacial, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-18020, https://doi.org/10.5194/egusphere-egu26-18020, 2026.

EGU26-18070 | ECS | Posters on site | CR6.1

Exploring Geo Foundation Models for Glacier Mapping Using Remote Sensing Data 

Farzaneh Barzegar, Norbert Kuehtreiber, and Silvia L. Ullo

Geo foundation models (GFMs) have recently emerged as a new paradigm in Earth observation (EO), providing a promising approach for enhancing remote sensing analysis and enabling faster, more generalised applications. GFMs are deep learning models trained on large unlabelled datasets to learn general spatial, spectral, and contextual representations of the Earth’s surface. The training datasets are usually diverse in location, season, and even sensor, which ensures that the model learns features that are as general as possible. This is vital because labelled data in remote sensing are limited, while high-quality unlabelled data are widely accessible. As a result, GFMs are increasingly viewed as a promising tool for scalable and robust environmental monitoring.

Among various EO tasks, glacier mapping is particularly relevant in the context of GFMs. Glaciers are located in regions that are hard to access, which makes ground-truth (GT) preparation difficult. Delineation of glaciers is often affected by seasonal snow and regional variability. Moreover, debris-covered and rock glaciers are harder to detect due to their complex landforms and their similarity to surrounding terrain. Accurate glacier delineation is crucial for monitoring cryospheric changes, assessing climate change impacts, managing water resources, and mitigating natural hazards.

In this study, we explore the applicability of GFMs for glacier mapping using multispectral Sentinel-2 imagery. We apply fine-tuning of pre-trained GFMs for glacier delineation, with the aim of assessing their potential in comparison with traditional deep learning approaches.

How to cite: Barzegar, F., Kuehtreiber, N., and L. Ullo, S.: Exploring Geo Foundation Models for Glacier Mapping Using Remote Sensing Data, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-18070, https://doi.org/10.5194/egusphere-egu26-18070, 2026.

EGU26-19235 | ECS | Posters on site | CR6.1

Random forest predictions of tundra snow density elevate Arctic soil temperatures in CLM5.0 

Jonathan Rutherford, Nick Rutter, Leanne Wake, Georgina Woolley, Julia Boike, and Alex Cannon

Arctic snow exerts a critical control on winter soil temperature and carbon exchange; however, the representation of its properties in Earth System Models (ESMs) remains simplified. In the Community Land Model v5.0 (CLM5.0), recent updates to snow compaction schemes have led to overly dense tundra snow and excessive conductive heat loss, producing a persistent cold-soil bias. Here we developed a Random Forest (RF) regression model to derive tundra snow density from meteorological variables, trained on Arctic SVS2-Crocus (ASC) simulations supported by in-situ observations collected around peak annual SWE from Trail Valley Creek (TVC), Northwest Territories, Canada. The RF model reproduces ASC-simulated density evolution with a mean absolute error of 23 kg m⁻³ and an R² of 0.90, matching field measurements more closely than CLM5.0. Future snow density predictions using the RF model driven by bias-corrected NA-CORDEX meteorology (2016–2100) indicate bulk snow densities 200–450 kg m⁻³ lower than CLM5.0 and more consistent with tundra conditions. Application of RF-derived snow densities decreases CLM5.0 winter-season 10 cm soil temperature RMSE by approximately 2–3 °C relative to field measurements (2017–2023) and increases future winter soil temperature projections (2016–2100) by 4–7 °C, highlighting the strong sensitivity of CLM5.0’s soil thermal regime to snow physical properties.

How to cite: Rutherford, J., Rutter, N., Wake, L., Woolley, G., Boike, J., and Cannon, A.: Random forest predictions of tundra snow density elevate Arctic soil temperatures in CLM5.0, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-19235, https://doi.org/10.5194/egusphere-egu26-19235, 2026.

EGU26-20480 | ECS | Orals | CR6.1

Deep learning based spatio-temporal forecasting of snow cover in the Alps using remote sensing data  

Samip Narayan Shrestha, Andreas Dietz, Sarah Leibrock, and Claudia Kuenzer

Snow cover is a critical component of the Earth’s climate and weather system, exhibiting high spatial and temporal variability. We therefore predict daily snow cover using spatio-temporal forecasting. Unlike traditional forecasting approaches that require spatial or temporal aggregation, our approach employs deep learning models specifically designed for spatio-temporal data. Spatio-temporal predictive learning has primarily focused on nowcasting and sub-seasonal forecasts at a daily scale with lead times of up to 15 days. Here, we extend such models to long multi-year satellite image time series, specifically for daily snow cover. We use historical snow cover data from the DLR Global SnowPack remote sensing product, a daily cloud-free 500 m representation of snow cover on the ground, and generate high-resolution daily snow cover forecasts for up to 365 days (one year ahead) beginning on 1st July. We implemented models such as Convolutional Long Short-Term Memory (ConvLSTM) networks, convolutional encoder-decoder architectures with attention mechanisms, and Vision Transformer (ViT) based models, and adapted them for our use case. To further enhance the predictions, we also adapted the models to ingest multivariate spatial and temporal data that are key drivers of snow cover variability, such as topographical feature maps derived from elevation and time series of climatological indices (atmospheric oscillation patterns). Validation against reference data demonstrates high accuracy, with F1-scores exceeding 84% across forecasts.

How to cite: Shrestha, S. N., Dietz, A., Leibrock, S., and Kuenzer, C.: Deep learning based spatio-temporal forecasting of snow cover in the Alps using remote sensing data , EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-20480, https://doi.org/10.5194/egusphere-egu26-20480, 2026.

EGU26-20684 | Posters on site | CR6.1

A deep learning-based emulator of the regional atmospheric model MAR for estimation of the Antarctic surface mass balance 

Achille Gellens, Cécile Agosta, Mikel N. Legasa, Mathieu Vrac, Charles Amory, and Christoph Kittel

Faithful modeling of Antarctic climate relies on capturing polar-specific processes at high spatial resolution (~10-30 km). Most CMIP Earth system models (ESMs) used for climate projections inadequately represent physical processes that are key drivers in polar climates, and operate at resolutions too coarse to resolve them. Polar-oriented regional climate models (RCMs) are considered the state of the art in modeling the atmosphere at high latitudes, where air-snow interactions are critical, but they are costly to run. This limits their use for the exploration of large ensembles and scenarios, as well as their potential for integration into a coupled modeling pipeline.

In order to address these limitations, we develop an affordable surrogate model, or emulator, of the polar-oriented Modèle Atmosphérique Régional (MAR), using deep learning and a variant of the commonly used U-Net convolutional neural network architecture. The emulator is trained to predict 35 km-resolution daily maps of surface mass balance (SMB) components—snowfall, rainfall, run-off and sublimation—over the Antarctic ice sheet from large-scale atmospheric fields of ESMs, effectively learning the downscaling function embedded in MAR. To achieve this, we use a dataset composed of MAR simulations forced by 4 CMIP ESMs over the 1980–2100 period, covering SSPs 1-2.6, 2-4.5 and 5-8.5. We conduct different experiments to assess its best-case performance as well as its transferability to unseen scenarios and ESMs.

The emulator demonstrates strong in-domain skill, displaying high fidelity in reproducing both the day-to-day and spatial synoptic variability of the predicted quantities. Long-term SMB trends and interannual variability through 2100 are also well replicated, with the predicted integrated surface mass change over the 1980–2100 period differing by only 1% from MAR. We find that the emulator is robust against unseen emission scenarios, with a marginal increase of up to a few percent in RMSE. Transferability to other ESMs proves more challenging, but results remain promising.

The MAR emulator can be used to generate SMB forcings for ice-sheet models at a negligible computational cost compared to RCMs, allowing century-scale simulations to be produced within minutes and thereby enabling the exploration of a wide range of scenarios and ensemble members. We suggest the general framework of this work could allow for the emulation of MAR in any application where it can be traditionally used. Ongoing work is also investigating the applicability of the emulator within an atmosphere–ice sheet coupled framework.

How to cite: Gellens, A., Agosta, C., N. Legasa, M., Vrac, M., Amory, C., and Kittel, C.: A deep learning-based emulator of the regional atmospheric model MAR for estimation of the Antarctic surface mass balance, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-20684, https://doi.org/10.5194/egusphere-egu26-20684, 2026.

EGU26-22268 | Posters on site | CR6.1

Bayesian Optimisation for Antarctic Survey Planning 

Kim Bente, Roman Marchant, and Fabio Ramos

Remoteness, harsh environmental conditions, short field seasons, and high operational costs severely constrain the ability to collect observations of the polar cryosphere at scale. These limitations make efficient survey planning an important methodological need: data acquisition strategies must prioritise measurement locations that simultaneously (i) reduce model uncertainty and (ii) maximise scientific utility, for example by tightening constraints on projected ice sheet contributions to sea level rise. We address this need with Bayesian optimisation (BO), a probabilistic machine learning framework for black-box optimisation that uses a Gaussian process surrogate to model the target geospatial field and an acquisition function to formalise the trade-off between uncertainty reduction and scientific utility when proposing subsequent measurement locations. To showcase the approach, we consider a case study on planning airborne geophysical surveys of Antarctic ice thickness and bed topography, for which we introduce a set of novel acquisition functions tailored to Antarctic ice dynamics that translate cryospheric objectives into the BO framework:

  • The FluxUCB (Flux Upper Confidence Bound) acquisition function incorporates satellite-derived ice velocity observations to prioritise sampling uncertain, potentially high-flux regions under the current posterior, since such regions can exert a disproportionate influence on ice discharge.
  • Alternatively, PBBS (Probability of Bed Below Sea level) prioritises locations with a high posterior probability of marine-based grounding, thereby focusing effort on areas most relevant to assessing marine ice sheet instability (MISI).

In simulation, these objectives reduce posterior uncertainty per flight hour more efficiently than baseline strategies and more consistently target scientifically consequential regions. Together, these acquisition functions illustrate how BO can translate scientific priorities into an uncertainty-aware decision framework for data-efficient polar observation campaigns. More broadly, the framework has strong potential to extract greater value from limited polar field resources beyond airborne surveys, from optimising seismic survey design to informing ice core drilling site selection.
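As a schematic illustration of the two acquisition functions above, the sketch below expresses them in terms of a GP posterior; the exact functional forms are hypothetical reconstructions from the descriptions, not the authors' definitions.

```python
import numpy as np
from scipy.stats import norm

def flux_ucb(mu_h, sigma_h, speed, kappa=2.0):
    """Hypothetical FluxUCB-style acquisition: upper confidence bound on
    ice flux, combining the GP posterior over thickness (mu_h, sigma_h)
    with satellite-derived surface speed at candidate locations."""
    return speed * (mu_h + kappa * sigma_h)

def pbbs(mu_bed, sigma_bed):
    """Posterior probability that the bed lies below sea level (z = 0),
    under a Gaussian GP posterior on bed elevation."""
    return norm.cdf(0.0, loc=mu_bed, scale=sigma_bed)

# next measurement site = candidate maximising the chosen acquisition, e.g.
# x_next = candidates[np.argmax(flux_ucb(mu_h, sigma_h, speed))]
```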

How to cite: Bente, K., Marchant, R., and Ramos, F.: Bayesian Optimisation for Antarctic Survey Planning, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-22268, https://doi.org/10.5194/egusphere-egu26-22268, 2026.

ESSI2 – Data, Software and Computing Infrastructures across Earth and Space Sciences

EGU26-1872 | ECS | Posters on site | ESSI2.2

dc_toolkit: A parallelized pipeline to navigate the complex ecosystem of compression algorithms 

Nicoletta Farabullini and Christos Kotsalos

As Earth System Sciences (ESS) datasets from high-resolution models reach petabyte scales, the scientific community encounters severe constraints in storage, transfer efficiency, and data accessibility. Identifying the right parameters for high compression ratios with strict scientific fidelity within the vast ecosystem of lossy and lossless compression algorithms is a complex and delicate technical challenge.

We present dc_toolkit (https://github.com/C2SM/data-compression): an open-source, parallelized pipeline designed to help researchers navigate this complex landscape. It provides a set of user-friendly, customizable command-line tools that allow users to make informed, data-driven decisions. By systematically evaluating over 40,000 combinations of compressors, filters, and serializers, it autonomously identifies the most suitable configuration for both structured and unstructured data with single or multiple variables.

The workflow comprises a three-stage approach: (1) Evaluation & Optimization: the toolkit leverages parallel processing (via Dask and mpi4py) to rapidly evaluate combinations while filtering out those that violate scientific precision requirements and user-defined error tolerances (L-norms). (2) Analysis & Visualization: to help scientists analyze the trade-offs between data reduction and information loss, the tool performs k-means clustering on the outputs to display clear and organized results. Furthermore, it provides spatial error plotting to verify that domain-specific features (such as periodicity in global grids) are preserved. (3) Application & Interoperability: once the user has decided on a specific configuration, the toolkit handles the high-throughput compression of the dataset into Zarr-based storage. It ensures seamless integration into existing workflows by including utilities for a variety of actions such as inspecting compressed files and converting compressed data back to standard NetCDF format.
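The core of stage (1) can be pictured as follows: apply a candidate lossy configuration, verify the error bound, and record the compression ratio. The sketch below uses generic numcodecs building blocks and is purely illustrative of the idea, not dc_toolkit's internal API.

```python
import numpy as np
from numcodecs import Blosc, Quantize

def evaluate(arr, digits, abs_tol):
    """Evaluate one lossy configuration on a float array: quantise to
    `digits` decimal digits, compress with Blosc/zstd, and reject the
    configuration if the max pointwise error exceeds the user tolerance
    (an L-infinity bound). Returns the compression ratio or None."""
    lossy = Quantize(digits=digits, dtype=arr.dtype)
    codec = Blosc(cname="zstd", clevel=5, shuffle=Blosc.SHUFFLE)
    quantised = lossy.encode(arr)                   # lossy filter
    compressed = codec.encode(np.ascontiguousarray(quantised))
    max_err = np.abs(arr - quantised).max()
    ratio = arr.nbytes / len(compressed)
    return ratio if max_err <= abs_tol else None
```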

By providing a streamlined, automated, and verifiable method for selecting compression parameters, dc_toolkit lowers the entry barrier for lossy compression. It allows ESS researchers to more easily apply data reduction strategies with the confidence that the integrity of their downstream analysis remains intact. Accessibility is further enhanced through web-based tools and GUI implementations that accommodate users with different levels of technical expertise.

How to cite: Farabullini, N. and Kotsalos, C.: dc_toolkit: A parallelized pipeline to navigate the complex ecosystem of compression algorithms, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-1872, https://doi.org/10.5194/egusphere-egu26-1872, 2026.

EGU26-5035 | ECS | Posters on site | ESSI2.2

Parallel file access: the missing piece in efficient large scale geosimulation 

Junxian Chew and Kor de Jong

Forward simulation of geographical systems typically involves time-step iterations of reading, computing and writing temporal states until the target end time. As the spatial fidelity of geographical data is refined to achieve higher simulation accuracy, the number of read, compute and write operations within each time step grows accordingly. Simulations at continental or global scale can only be completed within a reasonable time if the data can be distributed over multiple supercomputer nodes, in conjunction with parallel execution of the operations within each time step.

The LUE framework is designed to be a general software platform that enables scientists to define custom computational models and achieve scalable performance on large-scale computing environments. Previous efforts to parallelise the compute operations have demonstrated good scaling behaviour [1,2]. This is achieved in LUE by distributing small subsets of the global geographical dataset to available CPU threads across multiple supercomputer nodes in an asynchronous manner, each subset having its own set of compute operations to be executed. The asynchronicity of the workload queueing allows a large number of subsets to be processed in parallel, as well as ensuring full workload occupancy of all available compute resources.

This advancement, however, inadvertently highlighted the inefficiency of serial handling of read/write operations, collectively known as input/output (I/O) operations. Just as scalable computation requires parallel algorithms, scalable I/O requires parallel I/O libraries to distribute the I/O workload over multiple I/O-specific compute nodes. However, combining parallel I/O with asynchronously spawned computations, while ensuring that the resulting file output is correct, is challenging.

The challenge originates from the complexity of ensuring that data in memory are synced to the file storage system while the storage system is being acted on by all participating CPU threads. Often, careless management of I/O results in unintended overwriting of file content due to concurrent accesses. This highlights the added difficulty of parallelising file access compared to in-memory operations such as computations. As such, much care is needed in the design and planning of file access and synchronisation patterns to achieve a meaningful gain in parallel I/O performance within an asynchronous many-task execution.

In this work, we implement a parallel read/write access pattern that works well with the asynchronous parallel compute paradigm deployed within the LUE modelling framework. Integrating parallel I/O into an asynchronous execution brings the additional benefit of interleaved compute and I/O tasks: part of the I/O latency can be hidden by concurrent compute workloads, which is harder to realize in a synchronous parallel execution. Success of this work will enable scalable compute and parallel file access for geoscience simulation workloads carried out via the LUE framework, reducing the overall computational resource consumption of large-scale simulations.
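To illustrate the basic ingredient of conflict-free parallel writes, here is a minimal mpi4py sketch in which each rank writes a disjoint byte range of a shared file; this shows the general offset-based MPI-IO pattern, not LUE's actual implementation.

```python
from mpi4py import MPI
import numpy as np

# Each rank writes its own disjoint slice of a global 1D array, so no
# synchronisation of overlapping regions is needed.
comm = MPI.COMM_WORLD
rank = comm.Get_rank()

n_local = 1_000_000
local = np.full(n_local, rank, dtype=np.float64)    # this rank's partition

fh = MPI.File.Open(comm, "state.bin", MPI.MODE_CREATE | MPI.MODE_WRONLY)
offset = rank * n_local * local.itemsize             # byte offset of partition
fh.Write_at_all(offset, local)                       # collective, non-overlapping
fh.Close()
```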

References:
1. https://doi.org/10.1016/j.cageo.2022.105083
2. https://doi.org/10.1016/j.envsoft.2021.104998

How to cite: Chew, J. and de Jong, K.: Parallel file access: the missing piece in efficient large scale geosimulation, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-5035, https://doi.org/10.5194/egusphere-egu26-5035, 2026.

EGU26-6903 | ECS | Posters on site | ESSI2.2

Evaluating a meteorological downscaling method for volcanic ash dispersion and deposition modelling 

Carlos Villalta López, Leonardo Mingari, Alexandros-Panagiotis Poulidis, and Arnau Folch

High-resolution meteorological data is essential for accurate volcanic ash dispersion modelling, particularly in regions with complex topography. However, performing fully dynamical atmospheric simulations at very fine spatial resolution is computationally expensive and may limit their applicability in contexts where urgent computing is required, such as operational forecasting. Diagnostic downscaling methods offer a potential alternative by enhancing coarse-resolution meteorological fields at a lower computational cost, but their added value relative to full dynamical nesting remains to be further explored. In this work, we assess the effectiveness of diagnostic meteorological downscaling using an integrated simulation workflow based on the MetPrep tool coupled with the FALL3D ash dispersion model. This approach is applied to the case study of the 2021 Tajogaite eruption (La Palma), comparing meteorological data from three WRF-ARW dynamically nested domains with increasing spatial and temporal resolution (domains d01, d02 and d03) against diagnostic downscaling applied to the coarser WRF domains (d01+MetPrep and d02+MetPrep). All dispersion simulations are run using identical eruptive parameters in order to isolate the impact of the meteorological downscaling method. The simulated ash deposits are compared against field observations using point-to-point validation metrics and spatial characterisation based on isopach area fits. In addition, physically motivated wind metrics, including vertical wind shear and wind-topography coherence, are analysed to interpret the effects introduced by diagnostic downscaling on the flow. Preliminary results show that diagnostic downscaling can partially bridge the gap between coarse and high-resolution dynamical simulations, improving the representation of near-surface flow and ash deposition patterns at a fraction of the computational cost. The study highlights both the potential and the limitations of diagnostic downscaling as an alternative to full dynamical nesting for volcanic ash dispersion applications.

Funded by the European Union. This work has received funding from the European High Performance Computing Joint Undertaking (JU) and Spain, Italy, Iceland, Germany, Norway, France, Finland and Croatia under grant agreement No 101093038, ChEESE-2P, project PCI2022-134973-2 funded by MCIN/AEI/10.13039/501100011033 and by the European Union NextGenerationEU/PRTR.

How to cite: Villalta López, C., Mingari, L., Poulidis, A.-P., and Folch, A.: Evaluating a meteorological downscaling method for volcanic ash dispersion and deposition modelling, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-6903, https://doi.org/10.5194/egusphere-egu26-6903, 2026.

EGU26-7902 | Posters on site | ESSI2.2

Towards standardised metrics of performance, energy and carbon footprint for CMIP experiments 

Mario Acosta, Sergi Palomas, Sophie Valcke, Pierre-Antoine Bretonnière, and Paul Smith

Global climate models are among the most computationally demanding scientific applications, with rapidly increasing resolution and complexity driving unprecedented requirements in high-performance computing. While model intercomparison efforts have traditionally focused on scientific output and physical fidelity, the computational performance, energy consumption and carbon footprint of climate simulations are becoming critical factors for the sustainability of next-generation modelling activities.

Building on previous coordinated work done for CMIP6, this work extends the scope towards a global assessment framework applicable to all major climate models. We present a list of metrics applicable to climate simulations to systematically quantify model performance, energy cost and associated carbon footprint using standardised and reproducible metrics across supercomputing platforms. The proposed framework combines workload analysis, runtime monitoring and workflow-level instrumentation to enable consistent comparisons between modelling systems.
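For illustration, headline metrics of this kind, in the spirit of the CMIP6 CPMIP effort, reduce to simple job-accounting arithmetic; the sketch below is illustrative and the carbon-intensity value is an assumed example, not an agreed standard.

```python
def sypd(simulated_years, runtime_hours):
    """Simulated years per wall-clock day."""
    return simulated_years / (runtime_hours / 24.0)

def chsy(cores, runtime_hours, simulated_years):
    """Core-hours per simulated year."""
    return cores * runtime_hours / simulated_years

def kgco2_per_sy(energy_kwh, simulated_years, carbon_intensity=0.25):
    """Carbon footprint per simulated year, given the grid's carbon
    intensity in kgCO2/kWh (the default value is an illustrative assumption)."""
    return energy_kwh * carbon_intensity / simulated_years
```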

This effort is conducted in the context of the World Climate Research Programme ESMO Infrastructure Panel (WIP), where a dedicated task team is coordinating the systematic collection of performance, energy and carbon footprint metrics from modelling centres participating in CMIP7, in collaboration with initiatives such as ESiWACE, ENES-RISe,  Destination Earth and FUTURA. The objective is to establish community-endorsed metrics and monitoring practices that can be integrated into operational model development and production workflows, from CMIP7 and beyond.

By treating computational efficiency and carbon footprint as first-class metrics in climate model evaluation, this work aims to support informed decisions on model design, resource allocation and optimisation strategies, contributing to a more efficient and sustainable future for global climate modelling.

How to cite: Acosta, M., Palomas, S., Valcke, S., Bretonnière, P.-A., and Smith, P.: Towards standardised metrics of performance, energy and carbon footprint for CMIP experiments, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-7902, https://doi.org/10.5194/egusphere-egu26-7902, 2026.

EGU26-9317 | Posters on site | ESSI2.2

Simulating flood dynamics on dynamic HPC resource sets 

Daniel Caviedes-Voullième, Pablo Vallés, José Segovia-Burillo, Mario Morales-Hernández, Sergio Iserte, and Antonio Peña

Flood dynamics are transitions between low-flow stages, which result in small wet areas, and high-flow stages, which naturally result in large flooded areas. The response of flood dynamics to the time-varying forcing (be it a hydrograph or precipitation) is precisely what flood models attempt to simulate; it is therefore a priori unknown.

The computational load of 2D shallow water simulators is strongly dependent on the number of flooded cells, and thus the flooded area. Consequently, the dynamics of the flooded area translates into time-varying computational demands: low flow stages can be simulated with fewer resources, whereas peak-flow stages demand significantly higher computational capacity. Typically, modellers will choose a set of computational resources which suits the problem size and demands based on experience and preliminary tests. However, these static (used throughout the simulation) resource sets either slow down computations when they are too small for the high flow stages, or make inefficient use of resources when they are too large for the low flow stages. It follows that dynamic resource allocations, based on the computational demands, would be optimal.

In this contribution we present the integration of the SERGHEI-SWE hydrodynamic model with the Dynamic Management of Resources library (DMRlib) to enable malleability (i.e., the runtime adjustment of MPI process counts and computational resources) and thereby improve computational efficiency in shallow-flow simulations. By coupling SERGHEI-SWE with DMRlib, we enable the solver to dynamically expand or shrink its resource set during execution, adapting to these changing computational needs based on minimal heuristics.

SERGHEI-SWE is a high-performance, exascale-ready, scalable shallow water solver supporting CPUs and GPUs. DMRlib extends it with lightweight runtime support for process-level malleability, coordinating with the MPI runtime and job scheduler to manage resource adaptations. Within SERGHEI-SWE, resource reconfiguration is fundamentally a generalization of dynamic domain decomposition, allowing both the size and number of subdomains to change during execution. As a proof of concept, we implement minimal heuristics to trigger malleability based on wet-cell fractions: as flooded areas increase, additional resources are requested; when they decrease, resources are released.
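A wet-cell heuristic of this kind can be pictured as follows; the function, thresholds, and per-rank workload below are illustrative assumptions in the spirit of the description, not SERGHEI-SWE's actual trigger.

```python
def target_ranks(wet_fraction, total_cells=1e9, cells_per_rank=2e6,
                 min_ranks=4, max_ranks=256):
    """Size the resource set to the number of currently flooded cells,
    clamped to scheduler limits (illustrative sketch)."""
    wet_cells = wet_fraction * total_cells
    ranks = int(wet_cells / cells_per_rank) + 1
    return max(min_ranks, min(max_ranks, ranks))

# Called periodically in the simulation loop; if the target differs from
# the current rank count, a reconfiguration is requested via DMRlib.
```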

The malleable SERGHEI-SWE was evaluated using dam-breaks, river flood, and catchment runoff tests. Numerical accuracy was preserved, with negligible differences relative to static (non-malleable) runs. Dynamic resource management improved computational efficiency relative to minimal fixed-resource configurations. However, performance remained below the best-case static maximum-resource setup, and communication overheads limited gains in low-demand phases. Nonetheless, the proof-of-concept demonstrates both feasibility and potential at larger scales.

The approach is accurate, robust, and promising for improving resource utilization in large-scale hydrodynamic modeling. Future work will focus on refining reconfiguration heuristics, improving understanding of overheads, and combining malleability with dynamic load balancing to better exploit scalable HPC environments.

How to cite: Caviedes-Voullième, D., Vallés, P., Segovia-Burillo, J., Morales-Hernández, M., Iserte, S., and Peña, A.: Simulating flood dynamics on dynamic HPC resource sets, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-9317, https://doi.org/10.5194/egusphere-egu26-9317, 2026.

EGU26-9673 | ECS | Posters on site | ESSI2.2

Compression Safeguards - Towards Safe and Fearless Lossy Compression of Earth System Data 

Juniper Tyree, Daniel Köhler, Robert Underwood, Clément Bouvier, Tim Reichelt, Heikki Järvinen, and Milan Klöwer

The volume of data produced by Earth System Science models, e.g. high-resolution weather and climate models, is increasing faster than the methods and budgets for storing, sharing, and analysing this data. To reduce data sizes, lossy data compression methods discard some quality, details, or precision of the original data. Even though some lossy compressors promise size reductions of 100x or more, the lack of trust in lossy compression, rooted in the fear of losing important information, has so far limited their adoption.

We introduce compression safeguards to help overcome this trust gap by

(i) enabling scientist users to precisely express their (general or specific) safety requirements for lossy compression, e.g. preserving specific values, regionally varying error bounds on the data or quantities derived from it, or any logical combination thereof,

(ii) securing any (existing) (lossy) compressor with the corresponding safeguards, which then

(iii) guarantee that the safety requirements are always met by the safeguarded compressor.

Compression safeguards thus provide a unified and flexible interface for specifying and guaranteeing user safety requirements that works with any existing compressor, shifting the burden of trust in fulfilling these requirements away from specific compressor implementations. With the appropriate safeguards, even untrusted, potentially unsafe compressors can be used safely. We hope that compression safeguards will give Earth System scientists the guarantees they need to adopt lossy compression without fear, thereby helping to unlock the benefits of lossy compression in reducing data volumes for the Earth System Science community.
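Conceptually, one way to enforce a pointwise error bound around any untrusted codec is to store sparse corrections wherever the bound would be violated. The sketch below illustrates this generic idea and is not the reference implementation's API.

```python
import numpy as np

def safeguarded_compress(arr, compress, decompress, abs_tol):
    """Compress with any (lossy) codec, then record exact values at the
    points where the pointwise error exceeds the user's bound."""
    payload = compress(arr)
    approx = np.ascontiguousarray(decompress(payload))
    bad = np.abs(arr - approx) > abs_tol              # violating points
    corrections = (np.flatnonzero(bad), arr.ravel()[bad.ravel()])
    return payload, corrections   # corrections stay sparse if the codec is good

def safeguarded_decompress(payload, corrections, decompress):
    approx = np.ascontiguousarray(decompress(payload))
    idx, values = corrections
    approx.ravel()[idx] = values                      # bound now holds exactly
    return approx
```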

We will showcase how our reference implementation, compression-safeguards (https://compression-safeguards.readthedocs.io/en/latest/), can be applied to safeguard important properties in several real-world meteorological examples, evaluate the impact on compression ratio (only low for sparse corrections) and computational load at compression (major) and decompression (negligible) time, and discuss the future pathway towards safe and fearless lossy compression.

How to cite: Tyree, J., Köhler, D., Underwood, R., Bouvier, C., Reichelt, T., Järvinen, H., and Klöwer, M.: Compression Safeguards - Towards Safe and Fearless Lossy Compression of Earth System Data, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-9673, https://doi.org/10.5194/egusphere-egu26-9673, 2026.

EGU26-9822 | ECS | Orals | ESSI2.2

Coupling km-Scale Earth System Model to Hierarchical Output for Analysis-Ready Dataset 

Siddhant Tibrewal and Nils-Arne Dreier

Kilometer-scale Earth System Model (ESM) simulations increasingly generate petabyte-scale datasets. The scientific return from such datasets remains constrained by their accessibility and heterogeneity, as well as by the cost of their downstream analysis. Analysts often rely on ad-hoc workflows, and even analyses on reduced datasets require repeated access to high-resolution data, limiting scalability. We present Hiopy (Hierarchical Output in Python), a tool for generating cloud-accessible, analysis-ready datasets directly from km-scale ESM simulations with the ICON model by computing hierarchical temporal and spatial aggregations in situ. Building on the work of Kölling et al. (2024, EGU), Hiopy produces multi-resolution, self-describing datasets that enable seamless access from coarse to native resolution using the Zarr format. To mitigate the computational and communication overhead of in-situ aggregation, Hiopy uses YAC (Yet Another Coupler) to couple the model to the output component and configures the aggregates such that the model’s domain decomposition and the preferred Zarr chunking are aligned, distributing the workload evenly across the output processes. As a result, communication overhead is reduced and efficient parallel computation is possible without penalising simulation throughput. Additional optimisations reduce communication buffers, eliminate redundant duplication in metadata handling, allow streaming the data directly to its final location, and ease configuration for varying requirements. Hiopy supports native ICON model grids, regular latitude–longitude grids, and the HEALPix grid, and has been validated by producing publicly accessible datasets from km-scale ESM simulations across multiple projects. This work demonstrates a practical tool in the software stack of high-resolution climate modelling.
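From the consumer side, such a hierarchical dataset behaves like any other Zarr store. A minimal access sketch (the URL and group names below are placeholders of our own, not a published dataset layout):

```python
import xarray as xr

# Hypothetical store layout: one Zarr group per HEALPix zoom level,
# coarse aggregates stored alongside the native-resolution data.
url = "https://example.org/icon-km-scale.zarr"
coarse = xr.open_zarr(url, group="healpix/zoom=5")   # fast exploratory analysis
native = xr.open_zarr(url, group="healpix/zoom=10")  # full-resolution detail

# Same variables at both levels: start coarse, refine only where needed.
print(coarse.data_vars, native.sizes)
```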

How to cite: Tibrewal, S. and Dreier, N.-A.: Coupling km-Scale Earth System Model to Hierarchical Output for Analysis-Ready Dataset, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-9822, https://doi.org/10.5194/egusphere-egu26-9822, 2026.

EGU26-10395 | ECS | Posters on site | ESSI2.2

Improvements to 6D Grid Optimisation in Vlasiator 

Leo Kotipalo, Urs Ganse, Yann Pfau-Kempf, Jonas Suni, and Minna Palmroth

Vlasiator is a global hybrid-Vlasov space plasma simulation, modeling the velocity distribution of ions in a large region of near-Earth space. Due to the high memory and computation demands of the kinetic method as well as the large physical scale, optimisations are required to make simulation feasible. This presentation explores optimisations used in the spatial and velocity grids.

We first consider the spatial dimension. For this, Vlasiator utilises cell-based octree adaptive mesh refinement (AMR). Essentially, each spatial cell may be split in all three spatial dimensions to create eight smaller children in order to improve simulation accuracy in relevant regions. This can be repeated if necessary, with runs typically using four levels of refinement. Refinement may be done statically at the start of the simulation, or dynamically based on the plasma parameters.

Vlasiator uses a combination of several parameters for dynamic runtime refinement. These include scaled gradients of macroscopic variables to detect steep changes, the ratio of the current density to perpendicular magnetic field for current sheets and reconnection, as well as pressure anisotropy and vorticity for foreshock refinement.

For the velocity grid we use a somewhat similar method of stretching. In order to simplify translation, the velocity grid is static and identical in each spatial cell. To eliminate splitting of acceleration pencils, the size of cells in each coordinate direction is a function of that coordinate. Thus if we consider a grid with higher resolution around some point, the grid appears stretched along the coordinate axes when moving away from that point. The main purpose of the stretched grid is to enable modeling of colder distributions requiring a higher resolution without increasing resolution for the entire velocity grid.
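The stretching can be made concrete with a small sketch (illustrative only, not Vlasiator's implementation): cell widths grow geometrically with distance from a chosen focus point, so a cold population near that point is resolved finely without refining the whole grid.

```python
import numpy as np

def stretched_axis(center, v_min, v_max, dv0, growth=1.05):
    """1D stretched velocity axis: the cell width is a function of the
    coordinate, smallest at `center` and growing geometrically outward."""
    up, dv = [center], dv0
    while up[-1] < v_max:
        up.append(up[-1] + dv)
        dv *= growth
    down, dv = [center], dv0
    while down[-1] > v_min:
        down.append(down[-1] - dv)
        dv *= growth
    # Descending edges reversed (dropping the duplicated center) + ascending.
    return np.array(down[:0:-1] + up)

vx = stretched_axis(center=0.0, v_min=-2e6, v_max=2e6, dv0=1e4)  # m/s
print(len(vx) - 1, "cells,", np.diff(vx).min(), "to", np.diff(vx).max(), "m/s wide")
```

Because the full grid is the tensor product of such per-axis functions, translation updates never need to split acceleration pencils.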

Combining these optimisations enables simulation on modern supercomputers with scale and resolution which would be unfeasible without them. This is achieved by limiting resources expended on regions where they are less critical for simulation accuracy and the scientific focus of a given run, while allowing higher fidelity in more important regions. These methods are applicable to other kinetic simulations, as well as grid-based simulations in general.

How to cite: Kotipalo, L., Ganse, U., Pfau-Kempf, Y., Suni, J., and Palmroth, M.: Improvements to 6D Grid Optimisation in Vlasiator, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-10395, https://doi.org/10.5194/egusphere-egu26-10395, 2026.

EGU26-10512 | Posters on site | ESSI2.2

Finite-Difference modeling of elastic wave propagation in solid-moving fluid systems 

Max Dormann, Mudassar Razzaq, Claudia Finger, and Erik H. Saenger

Numerical simulations of elastic or acoustic wave propagation usually assume a stationary background medium. In many practical situations, however, such as marine exploration or the inspection of engineered structures like pipelines, elastic waves also propagate in bodies of moving fluid. Ambient flow fields introduce changes to the wave field, such as a direction-dependent wave propagation velocity or phase shifts, that can be observed in real-world measurements. To obtain simulations that more faithfully represent elastic wave propagation in coupled systems of stationary solids and moving fluids, and that are better suited for comparison with experimental, laboratory, and field data in the future, a formulation is introduced in which the elastic wave equation is extended with a material derivative. The resulting partial differential equation is solved using an augmented rotated-staggered finite-difference scheme that combines the spatial operators of the rotated-staggered grid with a conventional central-difference approximation. The performance of this new formulation is examined for the propagation of elastic wave fields in ambient steady uniform and steady laminar flow fields in combined fluid-solid models, and compared to reference simulations with no moving background medium. The analysis focuses on travel-time variations and phase shifts, demonstrating that the numerical results are consistent with analytical expectations for wave propagation in moving media.

How to cite: Dormann, M., Razzaq, M., Finger, C., and Saenger, E. H.: Finite-Difference modeling of elastic wave propagation in solid-moving fluid systems, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-10512, https://doi.org/10.5194/egusphere-egu26-10512, 2026.

EGU26-10844 | ECS | Orals | ESSI2.2

EBCC: an Error Bounded Climate-data Compressor 

Langwen Huang, Luigi Fusco, Jan Zibell, Florian Scheidl, Michael Armand Sprenger, Sebastian Schemm, and Torsten Hoefler

As the resolution of weather and climate simulations increases, the amount of data produced is growing rapidly from hundreds of terabytes to tens of petabytes. The huge size becomes a limiting factor for broader adoption, and its fast growth rate will soon exhaust all available storage devices. To address these issues, we present EBCC (Error Bounded Climate-data Compressor). It follows a two-layer compression approach: a base compression layer using JPEG2000 to capture the bulk of the data with a high compression ratio, and a residual compression layer using wavelet transform and SPIHT (Set Partitioning In Hierarchical Trees) encoding to efficiently eliminate long-tail extreme errors. EBCC outperforms other methods in the benchmarks at relative error targets ranging from 0.1% to 10%. In the energy budget closure and Lagrangian trajectory benchmarks, it can achieve more than 100× compression while keeping errors within the natural variability derived from ERA5 uncertainty members. We implement EBCC as a standalone C library which is seamlessly integrated with NetCDF and Zarr pipelines.
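The two-layer principle can be sketched independently of the actual codecs (in the schematic below, uniform quantization stands in for JPEG2000 and explicit sparse residual storage for the wavelet/SPIHT layer; this is our illustration, not EBCC's implementation):

```python
import numpy as np

def two_layer_compress(field, err_bound, step):
    """Schematic base + residual error-bounded compression."""
    base = np.round(field / step).astype(np.int32)       # bulk of the signal
    residual = field - base * step
    tail = np.flatnonzero(np.abs(residual) > err_bound)  # long-tail errors
    return base, (tail, residual.flat[tail].copy())

def two_layer_decompress(base, corrections, step):
    out = base.astype(np.float64) * step
    idx, vals = corrections
    out.flat[idx] += vals   # eliminate the extreme errors
    return out              # max error now <= err_bound everywhere
```

The design point is that the cheap base layer handles almost all values, while the residual layer pays only for the rare extremes; in practice both layers are additionally entropy-coded.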

How to cite: Huang, L., Fusco, L., Zibell, J., Scheidl, F., Sprenger, M. A., Schemm, S., and Hoefler, T.: EBCC: an Error Bounded Climate-data Compressor, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-10844, https://doi.org/10.5194/egusphere-egu26-10844, 2026.

EGU26-10878 | ECS | Posters on site | ESSI2.2

MPTRAC: Domain-decomposed Massively-Parallel Trajectory Calculations 

Jan Clemens, Lars Hoffmann, Rolf Müller, Felix Plöger, Marvin Henke, Nicole Thomas, Sabine Grießbach, and Catrin Meyer

Models for the calculation of Lagrangian particle dispersion in the atmosphere or the ocean are indispensable tools for understanding natural and anthropogenic processes. These processes range from volcanic ash clouds, through cloud microphysics to the study of the ozone layer on climate scales. With exascale machines at our disposal, such calculations can now be performed at significantly higher resolutions, both in terms of the driving wind field and particle number density.

Massively-Parallel Trajectory Calculations (MPTRAC) is a library designed to enable Lagrangian particle dispersion analysis for atmospheric transport processes in the free troposphere and stratosphere. Developed with contemporary high-performance computing (HPC) systems in mind, it ensures high scalability across GPU and CPU clusters through an MPI-OpenMP/ACC hybrid parallelization approach. Its data structures are tailored to the multi-layered cache systems of modern compute nodes. MPTRAC is routinely executed on the JUWELS-Booster supercomputer and is planned for deployment on the JUPITER exascale machine.

This contribution outlines ongoing developments in MPTRAC. A central aspect of the presented work is the implementation of domain decomposition, which partitions wind field data and associated tracer particles across distributed subdomains. This methodology promises to enhance computational efficiency and scalability, particularly in the context of large-scale atmospheric transport simulations. Furthermore, we detail the integration of MPTRAC with the ICON modeling framework through its community interface. This extension enables the direct application of particle-based transport methods within ICON, supporting high-resolution climate and weather simulations.

The described developments are conducted within the scope of the WarmWorld Project, which aims to enable high-resolution calculations using ICON.

MPTRAC is available under an open-source licence: https://github.com/slcs-jsc/mptrac

How to cite: Clemens, J., Hoffmann, L., Müller, R., Plöger, F., Henke, M., Thomas, N., Grießbach, S., and Meyer, C.: MPTRAC: Domain-decomposed Massively-Parallel Trajectory Calculations, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-10878, https://doi.org/10.5194/egusphere-egu26-10878, 2026.

EGU26-12783 | Posters on site | ESSI2.2

HERMES_Delta: An open source, python-based, parallel software to process official emission inventories and support air quality modelling efforts in Spain 

Carles Tena, Marc Guevara Vilardell, Johanna Gehlen, Paula Camps Pla, Oscar Collado, Luca Rizza, and Laura Herrero

Air pollution is one of the most critical environmental threats, contributing to respiratory and cardiovascular diseases and millions of premature deaths worldwide. To support air quality assessment, forecasting and planning efforts, chemical transport models (CTMs) need to be fed with robust, temporally and spatially resolved emission input data. 

Official annual national emission inventories prepared by countries to fulfill mandatory reporting obligations provide robust and consistent data. However, for use in CTMs, emission data need to be spatially distributed over a grid, temporally broken down to hourly resolution, and chemically mapped to the species defined in the CTM's chemical mechanism. Bridging the gap between official inventory data and CTM-ready emission inputs requires a scalable, transparent, and reproducible system that can process raw inventories into gridded, hourly, chemically speciated, CTM-compatible datasets.

HERMES_Δ is an open-source emission model developed at the Barcelona Supercomputing Center (BSC) to address this challenge. Implemented in object-oriented Python and designed to run on High Performance Computing (HPC) infrastructures, it integrates temporal, spatial, vertical, and chemical disaggregation within a modular architecture. Configuration relies entirely on YAML or CSV files, allowing activity- and region-specific settings while maintaining traceability by preserving the connection between modeled emissions and their original reporting sources. Spatial disaggregation, the most computationally demanding step, is parallelized using MPI and optimized through domain decomposition. The produced output files are fully compatible with multiple state-of-the-art CTMs, including CMAQ, CHIMERE, MOCAGE, WRF-CHEM and MONARCH.
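The spatial disaggregation step follows the standard proxy-weighting approach; a minimal sketch of the general technique (our illustration, not HERMES_Δ's implementation):

```python
import numpy as np

def spatial_disaggregate(total_emission, proxy, cell_in_region):
    """Spread a region's annual total over its grid cells proportionally
    to a geolocated activity proxy (e.g. population, road density)."""
    weights = np.where(cell_in_region, proxy, 0.0)
    weights /= weights.sum()          # normalise within the region
    return total_emission * weights   # gridded emission field, mass-conserving

# Toy example: 4-cell domain, region covers the first three cells.
grid = spatial_disaggregate(
    100.0,
    proxy=np.array([1.0, 3.0, 1.0, 5.0]),
    cell_in_region=np.array([True, True, True, False]),
)
print(grid)  # [20. 60. 20.  0.] -- the regional total is preserved
```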

To assess the performance of HERMES_Δ, multiple benchmark experiments were performed on the MareNostrum 5 and CIRRUS Spanish HPC facilities. All tests considered a destination grid of 0.005° (~500 m) resolution covering Spain (peninsular Spain and the Balearic Islands), estimating hourly and speciated emissions for 24 time steps. Performance benchmarking, including time-to-solution and memory profiling, indicates good parallel scalability and resource efficiency. This enables the production of hourly gridded emissions for over 10 000 activity–region combinations, while maintaining reproducibility and strict Coordinated Universal Time (UTC) alignment.

In conclusion, HERMES_Δ provides a robust framework for processing official emission inventories to high spatial and temporal resolutions using geolocated activity proxies. By combining national emission inventories with efficient HPC methods, the system improves the representativeness of emissions in CTMs, strengthens collaboration between emission inventory compilers and air quality modellers, and enables more detailed and realistic simulations for policy development and operational forecasting.

HERMES_Δ is currently being implemented as the emission core of the official Spanish air quality forecasting system operated by the Spanish Meteorological Agency (AEMET).

How to cite: Tena, C., Guevara Vilardell, M., Gehlen, J., Camps Pla, P., Collado, O., Rizza, L., and Herrero, L.: HERMES_Delta: An open source, python-based, parallel software to process official emission inventories and support air quality modelling efforts in Spain, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-12783, https://doi.org/10.5194/egusphere-egu26-12783, 2026.

EGU26-14316 | ECS | Posters on site | ESSI2.2

High-Performance Computing Benchmarking for Coastal Hydrodynamic Modelling Using Delft3D Flexible Mesh 

Abdulaziz Alabduljalil, Nada Alsulaiman, Yousef Alosairi, and Tahani Hussain

High-resolution coastal hydrodynamic models are increasingly used to support environmental assessment, crisis mitigation, and forecasting. Yet these models are constrained by the computing resources available to them, which can degrade the quality of the results. High-Performance Computing (HPC) thus becomes essential to increase simulation speeds while maintaining high resolutions. In this study, we present a benchmarking of resource configurations for a coastal hydrodynamic model using Delft3D Flexible Mesh (D-Flow FM), utilizing HPC resources while focusing on parallel performance, scalability, and efficiency. Benchmarking experiments compared two MPI libraries, MPICH and Intel MPI, across multiple CPU core counts and partition combinations on both two-dimensional and three-dimensional model setups, including barotropic and baroclinic configurations. The results show how runtime performance varies with the hydrodynamic configuration, MPI implementation, and HPC parallel partition, and how the HPC hardware can affect which combination is best. The goal is to provide guidance on finding optimal HPC configurations, including resource allocation and MPI library choice, when running high-resolution coastal hydrodynamic models.

How to cite: Alabduljalil, A., Alsulaiman, N., Alosairi, Y., and Hussain, T.: High-Performance Computing Benchmarking for Coastal Hydrodynamic Modelling Using Delft3D Flexible Mesh, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-14316, https://doi.org/10.5194/egusphere-egu26-14316, 2026.

EGU26-14912 | ECS | Posters on site | ESSI2.2

Integration of Historical and Modern Gravimetric Data to Model the Temporal Variation of the Gravity Field over Italy 

Gabriele Esposito, Roberta Ravanelli, and Mattia Crespi

This study is part of a broader effort to modernize the Italian gravimetric database and to support the computation of the new national geoid. Its primary aim is the integration of historical gravimetric measurements with modern observations, including data from ongoing airborne surveys, to establish a consistent framework for analyzing temporal variations of the gravity field across Italy. The current study addresses the initial phase of this effort, including the digitization of historical records, the transformation of legacy coordinates into the official Italian geodetic reference frame, a preliminary GIS-based visualization, and the design of a unified database for future spatial and temporal analyses.

Historical gravimetric records from major volumes edited by the former Italian Geodetic Commission (Ballarin, 1936; Cunietti & Inghilleri, 1955; Riccò, 1903; Solaini, 1939; Soler, 1930), covering the late 19th century to the 1960s, were digitized. Pages were scanned at high resolution, and image enhancement techniques, including noise reduction, contrast adjustment, and edge sharpening, were applied to improve legibility and data extraction.

Digitization employed AI-based optical character recognition (OCR) using DeepSeek OCR (Wei et al., 2025), supported by ChatGPT-4 and ChatGPT-5 (OpenAI, 2023, 2025) for table-structure interpretation. This workflow enabled accurate recognition of degraded or complex tables, merged cells, and inconsistent delimiters. Data were initially stored in editable Excel spreadsheets as an intermediate validation step to verify, correct, and standardize key parameters, including geographic coordinates, orthometric height, absolute gravity measurements, year of observation, and survey campaign information. Historical coordinates referring to old Italian datums (Roma 1940, ED50, or other local datums) were converted to WGS84 (EPSG:4326) to ensure compatibility with modern measurements. A key challenge stemmed from the heterogeneity of the legacy reference frames, which required accurate datum transformations for reliable integration with contemporary datasets.
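The datum-conversion step can be sketched with pyproj. The EPSG codes below are our assumption (4265 = Monte Mario / Roma 1940, 4230 = ED50, 4326 = WGS84); production work would additionally use grid-based transformations where available.

```python
from pyproj import Transformer

# Legacy Italian datums to WGS84; always_xy=True enforces (lon, lat) order.
roma40_to_wgs84 = Transformer.from_crs("EPSG:4265", "EPSG:4326", always_xy=True)
ed50_to_wgs84 = Transformer.from_crs("EPSG:4230", "EPSG:4326", always_xy=True)

# Illustrative Rome-area station coordinates (placeholder values).
lon, lat = roma40_to_wgs84.transform(12.4828, 41.8931)
print(round(lon, 6), round(lat, 6))
```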

Following digitization and coordinate conversion, historical data are being prepared for integration with modern gravimetric measurements from the national network and ongoing airborne surveys. Initial GIS-based visualization provides an early assessment of spatial coverage and potential inconsistencies. The unified database is designed to manage spatial variability and temporal evolution of gravity and is scalable to accommodate future datasets.

Once fully established, the dataset will undergo quality control and validation using statistical and geospatial methods. While temporal gravity modeling lies beyond the scope of this contribution, the proposed workflow lays a solid foundation for subsequent analyses.


References

Ballarin, S., 1936: Trentadue determinazioni di gravità relativa. Commissione geodetica italiana.

Cunietti, M., Inghilleri, G., 1955: Rete Gravimetrica Fondamentale Italiana. Commissione geodetica italiana.

OpenAI, 2023: GPT‑4 Technical Report. https://cdn.openai.com/papers/gpt-4.pdf.

OpenAI, 2025: GPT‑5 System Card (Technical Overview). https://cdn.openai.com/gpt-5-system-card.pd

Riccò, A., 1903: Determinazione della Gravità Relativa in 43 Luoghi della Sicilia Orientale delle Calabrie. Memorie della Società Degli Spettroscopisti Italiani.

Soler, E., 1930: Due Campagne Gravimetriche sul Carso. Università di Padova.

Solaini, L., 1939: Determinazione di gravità relativa eseguite a Castelnuovo Scrivia, Tortona, Alessandria, Valmadonna, S. Salvatore Monferrato e Sannazzaro De' Burgondi nell'anno 1939. Commissione geodetica italiana.

Wei, H., Sun, Y., Li, Y., 2025: DeepSeek-OCR: Contexts Optical Compression.



How to cite: Esposito, G., Ravanelli, R., and Crespi, M.: Integration of Historical and Modern Gravimetric Data to Model the Temporal Variation of the Gravity Field over Italy, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-14912, https://doi.org/10.5194/egusphere-egu26-14912, 2026.

EGU26-15196 | Posters on site | ESSI2.2

Zarr at scale: virtualization, sharding, and performance optimizations for Earth science data 

Max Jones, Joe Hamman, Davis Bennett, Kyle Barron, and Justus Magin

As geoscientific datasets continue to grow in size and complexity, the Zarr community has developed a modern, open-source solution for storage and I/O of multi-dimensional arrays and metadata. Zarr offers a high-performance, highly scalable, cloud-native container for scientific data, which allows scientists to transcend the constraints of individual files and think in terms of coherent datasets. Zarr’s potential has led to widespread adoption across government, industry, and academia. In this presentation, we offer practical guidance for how to leverage the latest and greatest features in the Zarr ecosystem, including:

  • Sharding to reduce the number of files, benefiting HPC users in particular (see the sketch after this list)
  • Virtualization via VirtualiZarr and Icechunk to enable high-performance access to data spread across NetCDF4/HDF5, GRIB, or GeoTIFF files
  • Custom data types, compression schemes, and variable chunk grids
  • Client-side (i.e., in-browser) rendering of large multidimensional geospatial datasets

Through concrete examples and best practices, we demonstrate how the Zarr ecosystem enables researchers to work with multi-terabyte datasets as seamlessly as small files.
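As a concrete taste of the first item above, the sketch below creates a sharded array with zarr-python (assuming version 3 or later, where Zarr v3 sharding is available; the store path and array sizes are arbitrary):

```python
import numpy as np
import zarr  # assumes zarr-python >= 3 (Zarr v3 sharding support)

# Shards group many small chunks into one storage object: reads keep fine
# granularity, while the file/object count drops by 64x in this example.
arr = zarr.create_array(
    store="example.zarr",
    shape=(8192, 8192),
    chunks=(256, 256),    # unit of compression and of partial reads
    shards=(2048, 2048),  # 8 x 8 = 64 chunks per stored object
    dtype="float32",
)
arr[:] = np.random.default_rng(0).random((8192, 8192), dtype="float32")
print(arr.info)
```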

How to cite: Jones, M., Hamman, J., Bennett, D., Barron, K., and Magin, J.: Zarr at scale: virtualization, sharding, and performance optimizations for Earth science data, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-15196, https://doi.org/10.5194/egusphere-egu26-15196, 2026.

EGU26-15706 | Orals | ESSI2.2

Histogram compression of large ensemble forecasts 

Fenwick Cooper, Shruti Nath, Antje Weisheimer, and Tim Palmer

1000-member ensemble forecasts of rainfall are compressed from ~230 MB to ~400 KB using lossy histogram compression. This level of compression allows fast download, analysis, and responsive display on a website, even on obsolete laptop computers or basic smartphones. The information discarded to achieve this level of compression is needed only in the most specialist of applications, and the algorithm scales to much larger ensembles with negligible additional storage. The method is currently in daily operation with national meteorological centres in East Africa.

 

Physics-based weather models are routinely used to produce ensemble forecasts with up to 100 members. These ensembles are an advance on single deterministic forecasts in that they indicate uncertainty, with larger ensembles providing more accurate distributions of forecast variables. The downside of large ensembles is their storage, transmission, and processing cost. Furthermore, machine learning models are now being used operationally to generate very large forecast ensembles. For example, rainfall forecasts by ICPAC and national meteorology centres in East Africa are now routinely produced with 1000 ensemble members. Analysis and transmission of these forecasts using traditional methods is completely impractical given currently available hardware. Compression is necessary and can be achieved by storing the ensemble as a series of histograms, sacrificing spatial correlation information.
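The idea can be sketched in a few lines (our schematic of the general approach, not the operational code): per grid point, the member dimension is replaced by fixed-bin counts, so storage no longer grows with ensemble size.

```python
import numpy as np

rng = np.random.default_rng(0)
ensemble = rng.gamma(2.0, 5.0, size=(1000, 100, 100))  # members x lat x lon

# 33 bin edges -> 32 uint16 counts per grid point instead of 1000 floats.
edges = np.concatenate([[0.0], np.geomspace(0.1, 300.0, 32)])  # rainfall, mm
counts = np.apply_along_axis(
    lambda members: np.histogram(members, bins=edges)[0], 0, ensemble
).astype(np.uint16)

print(counts.shape)                                    # (32, 100, 100)
print(ensemble.nbytes // counts.nbytes, "x smaller before entropy coding")
```

What is lost is exactly the joint spatial correlation between members, which is the information the abstract names as sacrificed.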

How to cite: Cooper, F., Nath, S., Weisheimer, A., and Palmer, T.: Histogram compression of large ensemble forecasts, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-15706, https://doi.org/10.5194/egusphere-egu26-15706, 2026.

EGU26-16260 | ECS | Posters on site | ESSI2.2

Breaking Computational Bottlenecks in Land Surface Modelling with Shifted-Window Transformers 

Siddik Barbhuiya and Vivek Gupta

The development of hyper-resolution land surface modelling poses significant computational challenges. Detailed water balance assessments, ensemble-based uncertainty quantification, and climate scenario exploration all require running physics-based models like VIC, Noah-MP, and CLM at continental scales with high spatial resolution, long temporal spans, and multiple parameter configurations. The computational cost becomes prohibitive. Machine learning surrogates have recently emerged as potential solutions; however, existing LSTM and CNN approaches have fundamental architectural problems. Sequential processing prevents parallel computation, limited receptive fields miss long-range dependencies, and most approaches only predict single variables, which restricts comprehensive hydrological analysis.

We present a shifted-window transformer framework that simultaneously predicts multiple land surface fluxes (runoff, evapotranspiration, and soil moisture) while maintaining computational efficiency at continental scales. The hierarchical attention mechanism captures both local temporal patterns through windowed self-attention and global temporal context through shifted-window operations. This eliminates recurrent bottlenecks. We adapt vision transformers for hydrological regression by tokenizing meteorological sequences temporally, using relative position biases to encode lag-dependent hydrological relationships, and designing multi-task regression heads that preserve both nonlinear interactions and direct physical drivers.
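The windowing mechanism itself is compact; the sketch below (a 1D toy illustration of the general shifted-window scheme, not the authors' model code) shows how alternating regular and shifted partitions let information cross window boundaries:

```python
import numpy as np

def window_partition(x, w):
    """Split a (time, feature) sequence into non-overlapping windows of
    length w; self-attention runs independently inside each window."""
    t, c = x.shape
    return x.reshape(t // w, w, c)

def shifted_windows(x, w):
    """Cyclically shift by half a window before partitioning, so successive
    blocks exchange information across window boundaries (the Swin trick)."""
    return window_partition(np.roll(x, -w // 2, axis=0), w)

x = np.arange(16, dtype=float).reshape(16, 1)  # 16 timesteps, 1 feature
print(window_partition(x, 4)[:, :, 0])  # regular windows: [0..3], [4..7], ...
print(shifted_windows(x, 4)[:, :, 0])   # shifted windows straddle boundaries
```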

We demonstrate the approach by emulating the VIC model across India's 76,390 land grid cells at 6 km resolution, spanning diverse climate regimes. Training uses sparse spatial sampling with only a small fraction of available locations. This allows us to evaluate how well the surrogate generalizes VIC's process behaviors to unseen regions and parameter configurations. We test multiple variants, including autoregressive formulations that incorporate previous-timestep outputs, and benchmark everything against LSTM baselines to isolate the contributions of the architecture.

How to cite: Barbhuiya, S. and Gupta, V.: Breaking Computational Bottlenecks in Land Surface Modelling with Shifted-Window Transformers, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-16260, https://doi.org/10.5194/egusphere-egu26-16260, 2026.

EGU26-16512 | Orals | ESSI2.2

Good model performance? 

Edwin Sutanudjaja, Saeb Faraji Gargari, and Oliver Schmitz

For environmental scientists like hydrologists or ecologists, the performance of a model mostly refers to how well a simulation run mimics the modelled phenomenon, often evaluated by a broad range of measures comparing the simulated output to observed data. Increasing the model performance is then an ongoing process of incorporating new environmental processes or refining the implementation of existing ones, possibly combined with using improved datasets at higher spatial and temporal resolutions. This, however, increases the computational burden of the simulations. Improving the computational performance of a model to efficiently support platforms ranging from stand-alone computers to HPC systems is typically not in the scope of an environmental scientist, while a reduced runtime would benefit the entire modelling cycle.


The LUE (https://zenodo.org/records/16792016) environmental modelling framework is a software package for building HPC-ready simulation models. Its Python bindings provide domain scientists with a large set of spatial operations for model building. All LUE operations are implemented in C++ using HPX (https://doi.org/10.5281/zenodo.598202), a library and runtime environment providing optimal asynchronous execution of interdependent tasks on both shared-memory and distributed computing systems. Models constructed with LUE can therefore run on HPC systems without further modification of the Python code and without explicit knowledge of HPC programming. In addition, the lue.pcraster Python sub-package provides an almost effortless transformation of existing PCRaster-based Python models to LUE. In our presentation we showcase PCR-GLOBWB (https://doi.org/10.5194/gmd-11-2429-2018), a model simulating hydrology and water resources at a global scale, as an example of transforming an existing large scientific code base to LUE. We also demonstrate how efficiently the model now uses hardware ranging from one to thousands of CPUs, and is therefore prepared for global modelling studies at resolutions finer than 1 km.
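Based on our reading of the abstract, the migration can be as small as swapping an import. The sketch below shows the intended pattern; the operations are standard PCRaster calls, and whether each one is available in lue.pcraster is our assumption, not verified code.

```python
# import pcraster as pcr     # original, single-machine PCRaster model
import lue.pcraster as pcr   # drop-in replacement, per the abstract

dem = pcr.readmap("dem.map")              # hypothetical input raster
gradient = pcr.slope(dem)                 # local (cell-wise) operation
smoothed = pcr.windowaverage(dem, 500.0)  # focal operation, 500 m window
pcr.report(gradient, "gradient.map")      # same model code, HPX-backed run
```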

How to cite: Sutanudjaja, E., Faraji Gargari, S., and Schmitz, O.: Good model performance?, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-16512, https://doi.org/10.5194/egusphere-egu26-16512, 2026.

EGU26-17230 | ECS | Posters on site | ESSI2.2

Multi-GPU acceleration of high-resolution and large-scale urban flood modelling using MPI–OpenACC 

Bomi Kim, Hyungon Ryu, Seungsoo Lee, Jun-Hak Lee, and Seong Jin Noh

High-resolution urban flood modelling is increasingly critical for disaster mitigation, but simulations remain computationally expensive, particularly when applying meter-scale grids over large spatial domains. Such computational constraints often restrict the practical use of high-resolution simulations in operational forecasting and scenario-based analyses. To address this challenge, this study investigates the use of multi-GPU acceleration to improve computational efficiency in large-scale urban flood simulations. We present a multi-GPU implementation of the H12 2D urban flood model based on an MPI–OpenACC framework. The H12 2D model is a physics-based two-dimensional urban flood model that supports CPU-based parallel execution and is extended here to GPU architectures. The proposed approach employs directive-based parallelization, which allows a single code base to be executed on both CPU and GPU systems without extensive modification. Domain decomposition is managed using MPI, while computationally intensive kernels are offloaded to GPUs through OpenACC directives. This hybrid design ensures portability across heterogeneous high-performance computing environments and enables efficient use of multiple GPUs. We evaluate performance using spatial resolutions ranging from 1 to 20 m over two contrasting domains: an urban catchment in downtown Portland, Oregon (USA), and a downstream reach of the Han River basin (Republic of Korea). We will discuss how computational performance varies with model resolution, domain size, and the distribution of computational workload across multiple GPUs, with a focus on scalability and parallel efficiency. The improved computational efficiency achieved in this study can support pseudo real-time urban flood prediction for early warning applications. In addition, the proposed framework facilitates large-scale, high-resolution simulations that can be used to generate ground-truth datasets for the development and validation of physics-informed or data-driven flood prediction models.

How to cite: Kim, B., Ryu, H., Lee, S., Lee, J.-H., and Noh, S. J.: Multi-GPU acceleration of high-resolution and large-scale urban flood modelling using MPI–OpenACC, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-17230, https://doi.org/10.5194/egusphere-egu26-17230, 2026.

EGU26-17426 | Posters on site | ESSI2.2

Enabling numerical Models as a Service (MaaS) 

Stefan Verhoeven, Bart Schilperoort, Peter Kalverla, and Rolf Hut

Running numerical models you are unfamiliar with is not always straightforward. The models have different kinds of interfaces, different program languages, and different names for the same concepts. To standardize this, the Basic Model Interface (Hutton, 2020) was developed by the Community Surface Dynamics Modeling System (CSDMS). With the Basic Model Interface (BMI), users are presented with a standard set of functions to query and control numerical models. This standard interface also allows users to couple models together, allowing for the creation of standard components that can be coupled to create a full model (Peckham, 2013). 

However, coupling these models or components, whether they are written in C, C++, Fortran or Python, requires them to all share the same interpreter or (Python) environment. This is not always possible or viable and can require compilation on the end-user's side. This also prevents containerization of models. 

For cross-language and cross-container communication we developed grpc4bmi in 2018, making it possible to use the BMI over an HTTP connection. However, while highly performant, gRPC is not supported in many languages. To this end, we developed the new RemoteBMI protocol. RemoteBMI communicates with models using the Basic Model Interface through a RESTful API, making it easier to support any language; only an HTTP server and a JSON parser implementation are required.
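Conceptually, a RESTful BMI maps each interface function onto an HTTP endpoint. The sketch below illustrates the idea from the client side; the endpoint paths and port are our illustrative assumptions, not RemoteBMI's published schema.

```python
import requests

# Hypothetical RemoteBMI-style server running a containerized model.
base = "http://localhost:50051"

requests.post(f"{base}/initialize", json={"config_file": "config.yaml"})
requests.post(f"{base}/update")  # advance the model by one timestep
t = requests.get(f"{base}/get_current_time").json()
q = requests.get(f"{base}/get_value/discharge").json()
print(t, len(q))
```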

With grpc4bmi and RemoteBMI it is possible to package a model or model component inside a software container (e.g., Docker) and communicate with these models over an HTTP connection. This makes models more interoperable and reproducible, as container images can easily be archived and used by other people. It also enables running models on different machines than your own, and then directly communicating with them or coupling them to other models. 

With these technologies, you can now, for example, host models that require specific and difficult-to-share input data and provide them to anyone interested as a web-based service. This model-as-a-service (MaaS) architecture could also make it easier for end-users to try out your model in the browser before committing to installing it locally. 

Currently, the grpc4bmi and RemoteBMI protocols are used by the eWaterCycle platform (Hut, 2022), allowing hydrologists and students easy access to containerized hydrological models through a common interface, accelerating both research and teaching. 

 --- 

Hutton, E.W.H., Piper, M.D., and Tucker, G.E., 2020. The Basic Model Interface 2.0: A standard interface for coupling numerical models in the geosciences. Journal of Open Source Software, 5(51), 2317, https://doi.org/10.21105/joss.02317. 

Peckham, S.D., Hutton, E.W., and Norris, B., 2013. A component-based approach to integrated modeling in the geosciences: The design of CSDMS. Computers & Geosciences, 53, pp.3-12, http://dx.doi.org/10.1016/j.cageo.2012.04.002. 

Hut, R., et al., 2022: The eWaterCycle platform for open and FAIR hydrological collaboration. Geoscientific Model Development, 15(13), 5371–5390, https://doi.org/10.5194/gmd-15-5371-2022. 

How to cite: Verhoeven, S., Schilperoort, B., Kalverla, P., and Hut, R.: Enabling numerical Models as a Service (MaaS), EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-17426, https://doi.org/10.5194/egusphere-egu26-17426, 2026.

EGU26-19114 | Orals | ESSI2.2

Enhancing Earth system models efficiency: Leveraging the Automatic Performance Profiling framework 

Roc Salvador Andreazini, Xavier Yepes Arbós, Oriol Tintó Prims, Stella Paronuzzi Ticco, and Mario Acosta Cobos

The continuous increase in spatial and temporal resolution of Earth System Models (ESMs) is essential to better represent physical processes and extreme events. However, these advances come at a rapidly growing computational cost, pushing simulations towards unprecedented levels of parallelism on modern High Performance Computing (HPC) architectures. As a result, inefficiencies in load balance, communication, I/O, and memory usage increasingly limit scalability and scientific throughput.

Identifying and addressing parallel performance bottlenecks in large, multi-component climate models remains a complex and time-consuming task, often requiring specialized HPC expertise and manual profiling workflows. This represents a significant barrier for model developers aiming to efficiently exploit current and future exascale systems.

We present the Automatic Performance Profiling (APP) framework, an automated and extensible workflow designed to provide performance analysis of high-resolution ESMs. APP runs end-to-end profiling experiments and generates a comprehensive, multi-level performance report that combines high-level metrics (e.g., simulated years per day (SYPD) and scalability curves) with detailed insights into MPI communication patterns, cache behavior, and function profiling. This approach enables systematic identification of bottlenecks arising from extreme concurrency and fine spatial/temporal resolution demands.
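For reference, the headline throughput metric is straightforward to compute; a minimal sketch of the SYPD formula (our illustration, not APP code):

```python
def sypd(simulated_years: float, wallclock_seconds: float) -> float:
    """Simulated Years Per Day: model years completed per day of wallclock."""
    return simulated_years / (wallclock_seconds / 86400.0)

# Example: 3 simulated months (0.25 years) in 2 hours of wallclock.
print(f"{sypd(0.25, 2 * 3600):.1f} SYPD")  # -> 3.0 SYPD
```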

Integrated with the Autosubmit workflow manager, APP facilitates reproducible performance studies and comparisons across platforms and across model configurations and resolutions. Its modular design supports multiple climate models (NEMO and ECE4) and HPC systems (BSC’s MN5 and ECMWF’s HPC2020) and allows straightforward extension to new HPC platforms and models.

By lowering the barrier to parallel performance analysis, APP empowers the climate modelling community to improve scalability and resource efficiency, supporting the sustainable development of next-generation high-resolution ESMs.

How to cite: Salvador Andreazini, R., Yepes Arbós, X., Tintó Prims, O., Paronuzzi Ticco, S., and Acosta Cobos, M.: Enhancing Earth system models efficiency: Leveraging the Automatic Performance Profiling framework, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-19114, https://doi.org/10.5194/egusphere-egu26-19114, 2026.

EGU26-20436 | Orals | ESSI2.2

Minimising I/O, maximising throughput: earthkit-workflows, a task-graph engine for heterogeneous systems  

Jenny Wong, Vojtech Tuma, Harrison Cook, Corentin Carton de Wiart, Olivier Iffrig, James Hawkes, and Tiago Quintino

In-memory HPC workflows promise significant performance gains by reducing I/O, but achieving these gains requires precise scheduling of data-dependent task graphs on heterogeneous computing platforms. While existing Python frameworks such as Dask provide abstractions for parallel execution, they are not designed to fully exploit advanced topology-aware scheduling, natively support tightly coupled CPU-GPU task graphs in complex HPC environments, or utilise captured profiling information during scheduling. 

Earthkit-workflows is a Python library with a declarative API for constructing task graphs, together with the capability to schedule and execute them on local or remote resources. It targets heterogeneous environments, enabling task-based parallelism across CPUs, GPUs, and distributed HPC or cloud systems. Expensive I/O operations and intermediate storage are minimised via shared memory and high-speed interconnects, allowing intermediate results to be exchanged efficiently during task-graph execution. Streaming outputs from tasks, such as stepwise forecasting, are given first-class support, allowing downstream tasks to start without delay. The library also offers an extensible graph-building interface with a plugin mechanism, allowing users to define custom operations, and interoperates seamlessly with the wider earthkit ecosystem. 

The task-graph construction and execution capabilities of earthkit-workflows are being applied in ECMWF’s next generation of data processing frameworks. Individual data processing functions are published as modular and reusable graphs, enriched with profiling measurements, and then combined together to form operational workflows. Two operational workflows which happen to have a subgraph in common, for example two subgraphs retrieving the same data as input, can be automatically merged for efficient resource utilisation. For operational robustness, checkpointing capability is also provided. 

Earthkit-workflows additionally serves as the core of Forecast-in-a-Box, ECMWF’s offering that combines data-driven weather forecasting models with meteorological product generation, in a manner portable to a personal workstation, a high-powered local device, or cloud computing, and aimed at non-technical users. GPU support is particularly critical, enabling efficient inference for data-driven weather forecasting models beyond HPC environments. 

How to cite: Wong, J., Tuma, V., Cook, H., Carton de Wiart, C., Iffrig, O., Hawkes, J., and Quintino, T.: Minimising I/O, maximising throughput: earthkit-workflows, a task-graph engine for heterogeneous systems , EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-20436, https://doi.org/10.5194/egusphere-egu26-20436, 2026.

EGU26-20630 | ECS | Orals | ESSI2.2

Compression and Reconstruction of High-Dimensional Weather Simulation Data Using Tensor Decompositions 

Clara Hartmann, Rafael Ballester-Ripoll, Julian A. Croci, Jorge Gacitua Gutierrez, Juan Jose Ruiz, Paola Salio, Alexandra Diehl, and Renato Pajarola

High-resolution numerical weather and climate simulations increasingly produce very large data volumes with high dimensionality. Such datasets usually span three spatial dimensions, time, multiple physical variables, and ensemble members, leading to six-dimensional (6D) hypervolume datasets. Being grid-based, these datasets can be interpreted as 6D data tensors. The storage, processing, visualization, and analysis of such large data poses significant computational and storage challenges. Tensor decomposition and approximation methods have proven to be an efficient tool for compression and reconstruction of such large, high-dimensional scientific datasets. Built on rigorous mathematical principles, tensor decompositions exploit the multi-linear structure and redundancy inherent in scientific data, leading to effective compression while providing visually accurate results.

In this work, we investigate the applicability of tensor decompositions for the compression and efficient representation of 6D weather simulation data. We focus on two of the state-of-the-art low-rank tensor formats, tensor-train (TT) and Tucker decompositions. These methods generalize the singular value decomposition (SVD) to higher-order tensors, enabling compression of spatial, temporal, and physical modes through rank reduction. Therefore, the large high-dimensional tensor is factorized into multiple smaller, rank-reduced tensors with lower dimensionality, reducing the size of the original data significantly while preserving essential features. Such a reduced representation is also called a tensor approximation (TA).

We apply the tensor decompositions to a real-world weather simulation dataset from the Alpine region of Switzerland (COSMO-1E), organized along longitude, latitude, vertical level, time, physical variables (such as temperature), and 11 ensemble members. We evaluate the performance of the compression in terms of storage reduction, relative reconstruction error, peak signal-to-noise ratio (PSNR), structural similarity index measure (SSIM), computational cost, and visual comparison to the original data. Our results demonstrate significant compression ratios while preserving high visual accuracy. For example, a TT-based compression with a compression ratio of 1:900 yields a relative error of only 0.0005, reducing the original 4 GB dataset to 4.6 MB. Lower compression ratios lead to even higher accuracy.
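The TT workflow can be reproduced at toy scale with the textbook TT-SVD algorithm (a plain-NumPy sketch of the standard method, not the authors' code; a smooth 4D stand-in field replaces the 6D COSMO-1E data):

```python
import numpy as np

def tt_svd(a, eps=1e-5):
    """Textbook TT-SVD: sweep over modes, truncating each SVD at tolerance eps."""
    cores, r, c = [], 1, a
    for n in a.shape[:-1]:
        c = c.reshape(r * n, -1)
        u, s, vt = np.linalg.svd(c, full_matrices=False)
        keep = max(1, int(np.sum(s > eps * s[0])))  # rank after truncation
        cores.append(u[:, :keep].reshape(r, n, keep))
        c, r = s[:keep, None] * vt[:keep], keep
    cores.append(c.reshape(r, a.shape[-1], 1))
    return cores

def tt_reconstruct(cores):
    out = cores[0]
    for core in cores[1:]:
        out = np.tensordot(out, core, axes=([-1], [0]))
    return np.squeeze(out, axis=(0, -1))

# Smooth 4D stand-in field: low TT ranks, hence high compressibility.
g = np.meshgrid(*[np.linspace(0, 1, 20)] * 4, indexing="ij")
field = np.sin(4 * g[0] + g[1]) * np.exp(-g[2]) * np.cos(2 * g[3])

cores = tt_svd(field)
recon = tt_reconstruct(cores)
ratio = field.size / sum(core.size for core in cores)
err = np.linalg.norm(recon - field) / np.linalg.norm(field)
print(f"compression {ratio:.0f}:1 at relative error {err:.1e}")
```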

Beyond efficient data compression, the linear structure of the tensor decompositions allows for efficient application of filters in the tensor domain. The computation of the mean, standard deviation or similar linear operations along user-defined dimensions can directly be performed on the decomposed tensors, without ever having to reconstruct the large 6D dataset. Furthermore, the structure of the tensors allows for efficient partial reconstruction and visualization of slices or subsets of the dataset without reconstructing the complete dataset.

Overall, this work highlights tensor decompositions as a powerful tool for managing the growing size and complexity of high-dimensional weather simulation data. Their linear structure, which allows efficient filter application in the compressed domain, makes them especially suitable for scientific analysis of complex datasets. Their integration into geoscientific data pipelines offers a promising pathway towards scalable and accurate data compression and analysis in numerical weather prediction and climate science. 

How to cite: Hartmann, C., Ballester-Ripoll, R., Croci, J. A., Gacitua Gutierrez, J., Ruiz, J. J., Salio, P., Diehl, A., and Pajarola, R.: Compression and Reconstruction of High-Dimensional Weather Simulation Data Using Tensor Decompositions, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-20630, https://doi.org/10.5194/egusphere-egu26-20630, 2026.

EGU26-20880 | ECS | Orals | ESSI2.2

Evaluating Tensor Decomposition and Approximation as Lossy Compression for Weather Data Visualization Tasks 

Julian A. Croci, Marc Rautenhaus, Clara Hartmann, Jorge Gacitua Gutierrez, Juan Jose Ruiz, Paola Salio, Alexandra Diehl, and Renato Pajarola

Major challenges with modern weather and climate simulations are the resources required to store, analyze, and visualize the generated data. This storage problem forces scientists to compromise on data dimensionality, for example by discarding physical variables or reducing the number of temporal timestamps.

Tensor decomposition and approximation (TA) methods have recently seen a revival in the context of neural networks, where they are used to reduce the number of network parameters. However, TA methods also exhibit interesting properties favorable for the lossy compression of volumetric data. For example, for turbulence volumes created by simulations, compression ratios higher than 300 can be reached while preserving high precision. This allows for more efficient storage of large multi-dimensional data grids. Furthermore, tensor decompositions allow for partial reconstruction as well as the application of linear functions in the compressed domain, making these representations especially suitable for a variety of downstream tasks such as statistical analysis. However, one open question, as for all lossy compression techniques, is how the loss influences the quality of these tasks.

For the operationalization of TA methods, another challenge is their parametrization. Various decomposition techniques exist, and selecting the most appropriate one is non-trivial. Further, data likely needs to be divided into smaller pieces, e.g. chunks, to achieve the best results, meaning high compression ratios while introducing as little error as possible. Dividing the data in this context can mean both omitting dimensions (and hence reducing the dimensionality of the tensor) and splitting the data within dimensions. Finally, different tensor decomposition methods allow for different setups, further widening the compression parameter space to explore.

In this work we present an experimental setup that assesses compression performance in terms of error metrics computed directly on the data as well as the impact of compression losses on downstream visualization tasks. We use an offline TA-based compression scheme in which the data is reconstructed, i.e. decompressed, before being saved again in a standard format, so that it can easily be fed into downstream visualization applications such as Met.3D. Using this setup, we discuss how numerical error metrics, such as the relative error or the RMSE, are not always representative of errors in the visualization of the data in downstream tasks, especially for variables derived from the data. Further, we present different strategies for partitioning the data into chunks and motivate the effectiveness of tensor decomposition methods in the domain of numerical weather forecast data.

How to cite: Croci, J. A., Rautenhaus, M., Hartmann, C., Gacitua Gutierrez, J., Ruiz, J. J., Salio, P., Diehl, A., and Pajarola, R.: Evaluating Tensor Decomposition and Approximation as Lossy Compression for Weather Data Visualization Tasks, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-20880, https://doi.org/10.5194/egusphere-egu26-20880, 2026.

EGU26-3863 | ESSI2.3

Hydrological simulations with seamless scaling between Cloud and High Performance Computing environments on DestinE from the comfort of your own browser using eWaterCycle 

M. Melotto, R. Hut, and C. Vitolo

The goal of the eWaterCycle project is to facilitate hydrological modelling that is Findable, Accessible, Interoperable & Reproducible (FAIR).

High (hyper-)resolution and/or large-sample hydrological modelling, including runs driven by Destination Earth (DestinE) Digital Twin (DT) inputs, often requires HPC infrastructure. Designing such studies, however, benefits from users working on interactive Cloud Infrastructures. Migrating workflows from Cloud Infrastructure to HPC infrastructure requires deep knowledge of the systems in place, which typical users (hydrological experts) don't have. A core design philosophy of the eWaterCycle platform is that domain (hydrology) users should not need to become computer science experts to carry out their hydrological research.

 

To address this, we have developed a workflow that seamlessly upscales any hydrological workflow designed on Cloud Infrastructure to a SLURM high-performance compute cluster, with only small changes compared to working in the cloud environment: setting up paths (e.g. scratch folders) and supplying key parameters (e.g. the specified region). Users are not required to have any prior knowledge of HPC systems.

 

This 'seamless' workflow can be run from any JupyterHub environment. We are using, and are working on integrating with, the services that are part of DestinE.

As a ‘large sample hydrology’ example, we run a climate-change impact analysis on flood frequency for each of 6830 catchments. We facilitate using ERA5, ERA-Interim, and CMIP6 data, as well as the data provided by the Digital Twin (DT), as input to these model runs. In this presentation we will share our results obtained using eWaterCycle with the DT data. This workflow serves as an example of our seamless scaling between Cloud Infrastructure and HPC systems and provides lessons learned for others setting up similar services.

How to cite: Melotto, M., Hut, R., and Vitolo, C.: Hydrological simulations with seamless scaling between Cloud and High Performance Computing environments on DestinE from the comfort of your own browser using eWaterCycle., EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-3863, https://doi.org/10.5194/egusphere-egu26-3863, 2026.

EGU26-7413 | Orals | ESSI2.3

Provisioning a Cloud-Native Training Infrastructure for the MediTwin Summer School 2025  

Federico Fornari, Claudio Pisa, Marica Antonacci, Vasileios Baousis, Tolga Kaprol, and Mohanad Albughdadi

MediTwin is a European research initiative aimed at developing digital twin technologies for the Mediterranean region, integrating Earth observation data, numerical modelling and artificial intelligence to support environmental monitoring and decision-making. In this context, the MediTwin Summer School 2025 was organised to provide hands-on training on data-driven workflows, cloud-native tools and AI/ML techniques for Earth system applications. The school targeted early-career researchers, PhD students and technical staff from research institutions, with a total of 20 participants. 

The School required a scalable, secure and reproducible cloud infrastructure capable of supporting hands-on training activities in Earth system digital twins, data analysis and AI/ML workflows. This contribution presents the design and provisioning of the cloud-native infrastructure deployed to support the school, with a focus on Infrastructure as Code (IaC), Kubernetes-based orchestration and hybrid GPU-enabled workloads. 

The infrastructure was deployed on the ECMWF on-premises cloud, based on OpenStack and backed by Ceph software-defined storage, providing elastic compute, networking and persistent storage services. The Kubernetes cluster was provisioned in a high-availability configuration using Terraform and Rancher Cluster Manager, following established GitOps best practices. The cluster architecture comprised dedicated control-plane, worker, ingress and GPU nodes, enabling both standard cloud-native services and accelerated AI/ML workloads. Cluster lifecycle management, configuration drift prevention and application delivery were handled through a GitOps approach using Rancher Fleet. 

GitLab acted as the central orchestration platform for source control, CI/CD pipelines and IaC automation, hosting Terraform modules, Helm charts, Rancher cluster definitions and configuration templates. This ensured full traceability, auditability and reproducibility of both infrastructure and application deployments. Sensitive credentials and API keys were securely managed using HashiCorp Vault and dynamically injected into workloads. 

To support interactive training activities, a JupyterHub service was deployed on Kubernetes using the official Helm chart, customised for resource management, authentication and storage integration. GPU acceleration was enabled via the NVIDIA GPU Operator, which automated driver installation, device discovery and scheduler integration. In addition, outside the Kubernetes environment, 20 GPU-enabled virtual machines were provisioned directly on OpenStack using an Ansible role executed through AWX, itself deployed on the Kubernetes cluster, to accommodate specific student exercises requiring isolated VM-based access. 

This experience demonstrates how modern cloud-native and DevSecOps practices can be effectively applied to provision short-lived yet production-grade scientific training infrastructures, ensuring scalability, security and reproducibility for future Earth observation and digital twin education initiatives. 

How to cite: Fornari, F., Pisa, C., Antonacci, M., Baousis, V., Kaprol, T., and Albughdadi, M.: Provisioning a Cloud-Native Training Infrastructure for the MediTwin Summer School 2025 , EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-7413, https://doi.org/10.5194/egusphere-egu26-7413, 2026.

EGU26-7440 | Posters on site | ESSI2.3

A Cloud-Based Infrastructure for EO-Driven Climate Resilience Services  

Federico Fornari, Marica Antonacci, Claudio Pisa, Vasileios Baousis, Tolga Kaprol, and Mohanad Albughdadi

CLIMRES (LEADERSHIP FOR CLIMATE RESILIENT BUILDINGS) is a European project addressing the growing vulnerability of buildings and urban environments to climate change impacts. The project combines climate data, Earth Observation (EO) products, and impact assessment methodologies to identify climate-driven hazards, assess building and urban-scale vulnerabilities, and support decision-making through dedicated tools and measures. These solutions are validated through large-scale pilot demonstrations across several European countries. 

A key enabling outcome of CLIMRES is the design and deployment of a scalable, cloud-native infrastructure hosting the project’s Federated Data Exchange Platform (FDXP) and digital services. The infrastructure is hosted by ECMWF on a dedicated Kubernetes cluster within the Common Cloud Infrastructure (CCI), part of the European Weather Cloud co-managed by ECMWF and EUMETSAT. The underlying cloud environment is based on OpenStack with Ceph storage, providing elastic compute and scalable object storage capabilities for data-intensive workloads. This infrastructure provides the technical backbone for integrating heterogeneous datasets, executing data-processing workflows, and delivering operational services that underpin climate resilience assessments and decision-support applications. 

The CLIMRES platform follows cloud-native design principles and adopts containerization and microservice-based architectures to ensure modularity, scalability, and operational robustness. Kubernetes is used as the core orchestration layer, while Rancher provides centralized cluster management, monitoring, and operational visibility. All services, including the FDXP and supporting applications, are deployed consistently across environments using GitOps principles, ensuring reproducibility, traceability, and elimination of configuration drift. 

Continuous Integration and Continuous Delivery (CI/CD) pipelines automate the full software lifecycle, from source code changes to container image building and deployment. Docker images are built through automated pipelines and deployed via Git-driven workflows, enabling transparent, auditable, and predictable releases. Semantic versioning and changelog generation are fully automated, ensuring consistent release management across services. 

This contribution presents the CLIMRES cloud infrastructure as a production-ready case study for EO- and climate-driven applications. It demonstrates how cloud-native technologies can effectively support scalable data management platforms and operational services for climate resilience. 

How to cite: Fornari, F., Antonacci, M., Pisa, C., Baousis, V., Kaprol, T., and Albughdadi, M.: A Cloud-Based Infrastructure for EO-Driven Climate Resilience Services, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-7440, https://doi.org/10.5194/egusphere-egu26-7440, 2026.

EGU26-8000 | ECS | Orals | ESSI2.3

TACO: Operationalizing AI-Ready EO datasets 

Oscar J. Pellicer-Valero, Cesar Aybar, Mikolaj Czerkawski, Carmen Oliver, Kevin Monsálvez, Julio Contreras, and Gustau Camps-Valls

The field of Artificial Intelligence for Earth Observation (AI4EO) currently suffers from significant data friction, especially when moving Petabyte-scale archives from cloud object storage to High Performance Computing (HPC) nodes. We present TACO (Transparent Access to Cloud-Optimized datasets), a production-grade standard designed to replace file-centric legacy workflows with a high-throughput streaming paradigm.

We showcase the practical implications of this architecture for the deployment of geospatial Foundation Models (FMs) by running pretrained FMs on downstream inference tasks (such as semantic segmentation or land-cover classification) directly on arbitrary samples of arbitrary cloud-hosted datasets, quickly, and without the need for local staging or any specific preprocessing. TACO bridges the gap between static cloud archives and dynamic HPC processing, allowing seamless, scalable AI4EO workflows and fulfilling the as-yet-unrealised promise of FMs to "train once, apply everywhere".

References: 

  • Cesar Aybar, et al. (2025). The Missing Piece: Standardising for AI-ready Earth Observation Datasets. Poster at TerraBytes-ICML 2025 Workshop. Vancouver, Canada
  • TACO Foundation. (2025, November 21). The TACO specification (Version 2.0.0). https://tacofoundation.github.io/specification 

How to cite: Pellicer-Valero, O. J., Aybar, C., Czerkawski, M., Oliver, C., Monsálvez, K., Contreras, J., and Camps-Valls, G.: TACO: Operationalizing AI-Ready EO datasets, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-8000, https://doi.org/10.5194/egusphere-egu26-8000, 2026.

EGU26-9485 | Orals | ESSI2.3

The Kilometer-Scale Cloud by DKRZ 

Fabian Wachsmann and Chaturika Wickramage

We present the Kilometer-Scale Cloud data server (https://km-scale-cloud.dkrz.de) and its underlying software stack Cloudify (https://gitlab.dkrz.de/data-infrastructure-services/cloudify) developed and deployed at the German Climate Computing Center (DKRZ). The km-scale cloud provides open and analysis-ready access to prominent climate datasets from projects such as the European Eddy-Rich Earth System Models (EERIE) stored across heterogeneous storage tiers through standardized cloud-native interfaces, without requiring physical data reformatting or migration.

Within the EERIE project, kilometer-scale Earth System Models (~10 km atmosphere and ~5 km ocean) generate petabyte-scale output that exceeds the capabilities of traditional file-based access patterns. Cloudify addresses this challenge by emulating Zarr stores for data residing on file systems or tape, enabling efficient HTTP access to large datasets. This approach allows users to interact with HPC-resident data using established cloud-native tools and workflows.
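
From the user side, this access pattern looks roughly as follows (a minimal sketch with a hypothetical dataset path and variable name, not an official endpoint; actual collections are listed in the server's catalogue):

```python
import xarray as xr

# Hypothetical dataset path on the km-scale cloud
url = "https://km-scale-cloud.dkrz.de/datasets/eerie-example.zarr"

# Lazy, chunked access: only metadata is read at open time
ds = xr.open_dataset(url, engine="zarr", chunks={})

# A computation like this transfers only the chunks it touches over HTTP,
# not the petabyte-scale archive behind them ("sst" is illustrative).
sst_mean = ds["sst"].sel(time="2020-01").mean()
```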

The km-scale cloud offers several key advantages: (i) seamless bridging of HPC and cloud ecosystems, enabling interactive and scalable analysis without data duplication; (ii) analysis-ready data access, supporting chunk-based and parallel I/O patterns suited for modern data analytics and machine-learning workflows; (iii) improved data discoverability and reuse, facilitated by standardized interfaces and metadata services such as STAC catalogs; and (iv) lower entry barriers for external users, who can access large climate datasets without requiring direct HPC accounts or specialized system knowledge.

By deploying Cloudify as a data service, the km-scale cloud demonstrates a scalable pathway towards interoperable, cloud-enabled access to next-generation climate model output.

How to cite: Wachsmann, F. and Wickramage, C.: The Kilometer-Scale Cloud by DKRZ, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-9485, https://doi.org/10.5194/egusphere-egu26-9485, 2026.

EGU26-9640 | Posters on site | ESSI2.3

A Service-Oriented Distributed Zarr Solution for Climate Data Access Across Heterogeneous HPC and Storage Infrastructures 

Mostafa Hadizadeh, Martin Bergemann, Etor Lucio Eceiza, Andrej Fast, and Christopher Kadow

Modern climate archives are increasingly distributed across heterogeneous storage systems, while analysis workflows are becoming more interactive, distributed, and cloud-native. Moreover, many high-performance computing (HPC) centres host large climate datasets on traditional file-based storage infrastructures, whereas computational resources are often located at different sites. This separation between data location and compute resources creates significant barriers to efficient, interactive, and scalable data access.

This situation calls for climate data access services that are scalable, flexible, and independent of specific client-side environments, while supporting common climate data formats such as NetCDF, GeoTIFF, Zarr, HDF5, and GRIB. Nevertheless, efficient remote access to large and heterogeneous climate archives remains a major bottleneck for modern scientific workflows.

We present a service, the Freva Data Loader, which implements the logic required to open datasets from diverse storage backends and expose them as Zarr chunks through a lightweight, web-friendly REST interface with modern authentication mechanisms.

The Freva Data Loader is implemented as a stateless worker service exposing a REST interface for dataset access and Zarr endpoint generation. Upon receiving an authenticated request, the service resolves dataset metadata, opens the underlying data from the appropriate storage backend (e.g. POSIX file systems or object storage), and exposes the data as a Zarr-compatible, chunked stream. Authentication and authorisation are handled centrally using OAuth2, ensuring secure and controlled access across institutional boundaries. Requests are coordinated by a Loader component and distributed to worker instances via a message broker (Redis), enabling asynchronous execution and horizontal scalability.

The service decouples data access from client-side tooling and enables users and applications to access data stored on traditional POSIX HPC file systems, tape archives, and cloud-based object storage through a unified Zarr interface. Instead of transferring complete files between data centres or downloading them in full, clients retrieve only the required data chunks on demand. Users and client applications can request chunked array access over the network and process data incrementally, supporting interactive exploration and scalable downstream computation using cloud-native, chunked storage semantics, while remaining compatible with existing analysis stacks based on Zarr, xarray, and Dask.
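
A minimal client-side sketch of this pattern, assuming a hypothetical endpoint URL and an OAuth2 bearer token obtained out of band (names are illustrative, not the actual Freva API):

```python
import xarray as xr

# Hypothetical Zarr endpoint generated by the Freva Data Loader
url = "https://freva.example.org/api/data/zarr/cmip6-tas.zarr"
token = "..."  # OAuth2 access token, obtained separately

# fsspec forwards the Authorization header with every chunk request
ds = xr.open_dataset(
    url,
    engine="zarr",
    chunks={},
    storage_options={"client_kwargs": {"headers": {"Authorization": f"Bearer {token}"}}},
)

# Only the chunks intersected by this selection cross the network
subset = ds["tas"].sel(time="2000-01").mean()
```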

How to cite: Hadizadeh, M., Bergemann, M., Lucio Eceiza, E., Fast, A., and Kadow, C.: A Service-Oriented Distributed Zarr Solution for Climate Data Access Across Heterogeneous HPC and Storage Infrastructures, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-9640, https://doi.org/10.5194/egusphere-egu26-9640, 2026.

EGU26-9862 | Posters on site | ESSI2.3

GridLook: A browser-based ESM data-viewer 

Andrej Fast, Tobias Kölling, and Fabian Wachsmann

Earth System Models (ESMs) produce output on a wide range of structured and unstructured grids. However, exploring these heterogeneous datasets remains challenging, often requiring specialized software, access to high-performance computing resources, or time-consuming regridding to regular latitude-longitude grids that can introduce interpolation artifacts.

We present GridLook, an open-source, browser-based WebGL visualization tool that enables interactive exploration of cloud-hosted Zarr datasets directly on their native grids without any software installation. GridLook leverages the Pangeo ecosystem by consuming Zarr stores from any CORS-enabled cloud storage (including S3, Swift, and Google Cloud), making it immediately compatible with FAIR data principles and cloud-native workflows.

Key features include: (1) client-side GPU rendering; (2) automatic grid type detection from CF-compliant metadata, supporting a wide variety of grids; and (3) shareable URLs that encode the complete visualization state including dataset location, variable selection, and view parameters.

The architecture follows a serverless design in which all rendering occurs in the user’s browser, removing the need for backend infrastructure and enabling real-time interaction with large datasets through Zarr's chunked access patterns. By combining cloud-native data formats (Zarr), standardized metadata conventions (CF), and modern web technologies (WebGL), GridLook reduces time-to-plot and supports lightweight, shareable visualization workflows. This facilitates rapid visual inspection of model output by both data users and model developers, enabling quicker communication of spatial features and identification of potential bugs during model development.

The tool is freely available at https://gridlook.pages.dev with source code on GitHub, and we invite community contributions for additional grid types and features.

How to cite: Fast, A., Kölling, T., and Wachsmann, F.: GridLook: A browser-based ESM data-viewer, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-9862, https://doi.org/10.5194/egusphere-egu26-9862, 2026.

EGU26-10033 | ECS | Orals | ESSI2.3

Creating and visualising DGGS native data cubes with DGGS.jl 

Daniel Loos, Gregory Duveiller, and Fabian Gans

Discrete Global Grid Systems (DGGS) have emerged as a transformative approach to minimizing spatial distortions in geospatial data processing. They are not only used for geocoding, but also offer a highly efficient data structure due to the lack of tile overlap, as used in Sentinel-2 imagery and elsewhere. The performance of lookup operations on DGGS native data cubes is intrinsically linked to the cell index, which plays a crucial role in data management and retrieval. Most DGGS implementations utilize a hierarchical one-dimensional index to name and sort cells, optimizing them for parent-child queries like up and downsampling. However, many real-world applications, such as visualization or convolutions, require efficient handling of distant neighbour queries based on spatial distances.

Here, we present the tools DGGS.jl and DGGSexplorer to create and visualise DGGS native data cubes. They use a three-dimensional index based on the Icosahedral Snyder Equal Area (ISEA) projection, enabling compact and efficient data cube arrays stored in the cloud-optimized Zarr format. Furthermore, we developed an XYZ Tile Map Server that generates maps on the fly, allowing users to view DGGS data in QGIS, in the web browser, and elsewhere. This is especially helpful for integrating multi-sensor data at different spatial resolutions while minimising spatial distortions and computational resources in all subsequent processing steps.

How to cite: Loos, D., Duveiller, G., and Gans, F.: Creating and visualising DGGS native data cubes with DGGS.jl, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-10033, https://doi.org/10.5194/egusphere-egu26-10033, 2026.

EGU26-10862 | Orals | ESSI2.3

KuHOP: Kubernetes-Based Orchestration for Hybrid Earth Science Computing 

Layla Loffredo, Tim Kok, George Bampilis, and Marco Kerstens

Numerical Weather Prediction (NWP), Earth Observation (EO), and Earth System Modeling workflows commonly span high-throughput computing (HTC) and high-performance computing (HPC): EO ingestion and pre-processing on HTC, data assimilation and model execution on HPC, and verification back on HTC. Most infrastructures lack integrated mechanisms to coordinate these heterogeneous environments, leading to manual workflow orchestration and ad-hoc data transfers.

We present KuHOP (Kubernetes-orchestrated Hybrid Operations Platform), an architecture that applies cloud-native orchestration concepts to hybrid HTC/HPC workflows. KuHOP uses Kubernetes as a unified control plane to describe, schedule, and monitor workflows across heterogeneous environments while preserving existing SLURM schedulers through native job submission. Containerized services and Kubernetes operators translate declarative workflow specifications into scheduler-specific jobs and manage data movement between clusters, enabling consistent observability without replacing established resource managers.

By treating HTC and HPC systems as backends within a single orchestration framework, KuHOP aims to improve portability and reproducibility of hybrid workflows through version-controlled, declarative definitions. The modular design supports GPU-accelerated AI/ML components and envisions multi-tiered resource federation. The Kubernetes control plane also allows institutions to deploy complementary services such as data streaming pipelines, AI inference endpoints, and custom dashboards as their needs evolve.

Developed at SURF, the Dutch national digital infrastructure provider, KuHOP targets both operational and non-operational Earth science workflows where hybrid computing is essential. It offers institutions a practical path to automate HTC/HPC coordination without abandoning existing infrastructure or losing the specialized capabilities of each environment.

How to cite: Loffredo, L., Kok, T., Bampilis, G., and Kerstens, M.: KuHOP: Kubernetes-Based Orchestration for Hybrid Earth Science Computing, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-10862, https://doi.org/10.5194/egusphere-egu26-10862, 2026.

EGU26-12863 | Posters on site | ESSI2.3

A data lakehouse solution for geoscience workflows 

Robert Griffioen, Layla Loffredo, Robert-Jan Bood, Raymond Oonk, Els Kuipers, Oliver Schmitz, and Derek Karssenberg

Geoscience research faces enormous data growth: larger, more versatile datasets from satellites, IoT devices, and measurement instruments. To make full use of these data opportunities and to meet the demand for integrated analysis, new IT solutions are needed. SURF, the Dutch national digital infrastructure provider for research and education, is investigating a data lakehouse architecture in the context of an innovation project and the SAGE European Green Deal Data Space project (https://www.greendealdata.eu/). In SAGE we collaborate with geoscientists from the Department of Geography at Utrecht University in processing heterogeneous environmental monitoring datasets into data products for further research.

The data lakehouse architecture combines the flexibility of a data lake for handling heterogeneous data and ML workflows with the properties of a database (ACID transactions) and the governance of data warehouses. We explore this architecture using SURF services, like the object store, and open-source software from existing geoscience ecosystems like Pangeo and Earthmover. The exact properties of the data lakehouse depend on the software packages used. We present the lakehouse solution for the UU use case of serving and publishing exposome data products. Currently, processing of the data products is handled by a batch service. We will discuss how the lakehouse architecture could be extended to both serve the resulting data products and cover the processing stage and subsequent analysis workflows.

How to cite: Griffioen, R., Loffredo, L., Bood, R.-J., Oonk, R., Kuipers, E., Schmitz, O., and Karssenberg, D.: A data lakehouse solution for geoscience workflows, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-12863, https://doi.org/10.5194/egusphere-egu26-12863, 2026.

EGU26-14414 | Orals | ESSI2.3

Advancing the Research-to-Industry Continuum for Earth Observation AI in Europe via the AI-on-Demand Platform 

Antonis Troumpoukis, Mohanad Albughdadi, Ioannis Kakogeorgiou, Giorgos Petsangourakis, Theodoros Aivalis, Vasileios Vatellis, and Vasileios Baousis

Earth Observation (EO) and environmental research increasingly relies on AI methods that require access to large datasets, scalable cloud infrastructures, and high-performance computing (HPC) resources. At the same time, the transition of research outcomes into operational, industry-ready services remains challenging, often demanding substantial re-engineering of data pipelines, execution environments, and deployment models. This separation between research-oriented and industry-oriented infrastructures continues to limit the reuse, scalability, and real-world impact of EO innovations.

Addressing this gap, the European AI-on-Demand Platform (AIoD) [1] was recently expanded to support both research and industry within a unified digital infrastructure. The platform brings together research-driven AI assets (such as models, workflows, and datasets) with industry-grade tools and services for the development, training, and operationalisation of AI applications across cloud and HPC infrastructures, in an efficient and responsible manner. As a unified gateway, the AIoD connects previously fragmented resources across the European AI ecosystem, making them accessible, reusable, and adaptable to diverse user needs. In parallel, efforts are underway to explore interoperability with emerging European AI Factory initiatives, including PHAROS (the Greek AI Factory) [2], aiming to support future federated access to specialised AI computing resources.

We illustrate this approach through Earth Observation and environmental services and use cases that are jointly accessible to researchers and practitioners, including the mapping of sea surface features and marine pollutants, satellite image enhancement through super-resolution, and AI-based prediction and analysis of extreme weather events, enabling a seamless transition from experimentation and validation to scalable, operational deployment. These developments extend earlier work on European AI and Earth Observation convergence [3].

[1] http://aiodp.eu
[2] https://www.pharos-aifactory.eu
[3] A. Troumpoukis et al., European AI and EO convergence via a novel community-driven framework for data-intensive innovation. Future Gener. Comput. Syst. 160: 505-521 (2024) https://doi.org/10.1016/j.future.2024.06.013

This work has received funding from the European Union’s Digital Europe Programme (DIGITAL) under grant agreement No 101146490.

How to cite: Troumpoukis, A., Albughdadi, M., Kakogeorgiou, I., Petsangourakis, G., Aivalis, T., Vatellis, V., and Baousis, V.: Advancing the Research-to-Industry Continuum for Earth Observation AI in Europe via the AI-on-Demand Platform, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-14414, https://doi.org/10.5194/egusphere-egu26-14414, 2026.

EGU26-14554 | Orals | ESSI2.3 | Highlight

GRID4EARTH: Toward an Ellipsoidal HEALPix Grid for Analysis-Ready Earth Observation and Climate Data 

Jean-Marc Delouis, Benoît Bovy, Anne Fouilloux, Alexander Kmoch, Justus Magin, Pablo Richard, Vincent Dumoulin, and Tina Odaka

The increasing volume and diversity of Earth Observation (EO) and climate data produced by Copernicus missions and the Destination Earth (DestinE) initiative pose a major challenge for interoperability and large-scale analysis. Today, global datasets are distributed on heterogeneous spatial grids, forcing users to repeatedly perform ad-hoc regridding steps, which are costly, error-prone, and difficult to reproduce.

The GRID4EARTH project addresses this issue by promoting a common Discrete Global Grid System (DGGS) as a foundation for analysis-ready Earth data. In this context, HEALPix emerges as a strong candidate due to its equal-area property, hierarchical structure, and long-standing adoption in global modelling and large-scale data analysis. These properties enable efficient multi-resolution workflows, scalable data access, and natural integration with modern cloud-native formats such as Zarr.

However, a key limitation remains: HEALPix is formally defined on the sphere, whereas EO data are naturally referenced to the WGS84 ellipsoid. While often ignored at coarse resolutions, this mismatch introduces non-negligible area distortions at Copernicus resolutions and may bias zonal or regional analyses.

To overcome this limitation, GRID4EARTH explores an extension of HEALPix to the WGS84 ellipsoid using the authalic sphere associated with the ellipsoid. By preserving equal-area properties on the ellipsoid, this approach provides a consistent spatial framework bridging spherical climate models and ellipsoidal EO data. It enables a unified representation for DestinE model outputs and Copernicus satellite data, while remaining compatible with existing HEALPix-based tools and workflows.
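
As a small numerical illustration of the authalic-sphere construction (our own sketch, not project code), the authalic radius follows directly from the WGS84 defining constants:

```python
import math

a = 6378137.0               # WGS84 semi-major axis [m]
f = 1 / 298.257223563       # WGS84 flattening
e = math.sqrt(f * (2 - f))  # first eccentricity

# Radius of the sphere with the same surface area as the ellipsoid;
# mapping HEALPix onto it carries its equal-area property to the ellipsoid.
R_authalic = a * math.sqrt(0.5 * (1 + (1 - e**2) / e * math.atanh(e)))
print(f"{R_authalic:.1f} m")  # ~6371007.2 m

# Using a sphere of radius `a` instead overstates the ellipsoid's surface
# area by about 0.2%, the kind of systematic bias the authalic mapping removes.
```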

This contribution presents the motivation, principles, and expected benefits of ellipsoidal HEALPix within GRID4EARTH, and discusses its role as a practical and scalable DGGS for next-generation Earth system data infrastructures.

How to cite: Delouis, J.-M., Bovy, B., Fouilloux, A., Kmoch, A., Magin, J., Richard, P., Dumoulin, V., and Odaka, T.: GRID4EARTH: Toward an Ellipsoidal HEALPix Grid for Analysis-Ready Earth Observation and Climate Data, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-14554, https://doi.org/10.5194/egusphere-egu26-14554, 2026.

EGU26-17943 | ECS | Posters on site | ESSI2.3

Modeling orthorectification and PSF distortions on a HEALPix grid 

Pablo Richard, Jean-Marc Delouis, Justus Magin, and Tina Odaka

The quality of analysis-ready Earth Observation (EO) products strongly depends on the ability of the processing chain to accurately model the mapping from the Earth’s surface to the detector geometry. This mapping involves a convolution with the instrumental Point Spread Function (PSF) and an orthorectification step that corrects for terrain-induced geometric distortions using a Digital Elevation Model (DEM).

These distortions are challenging, as they can lead to spatially varying PSF deformations. Even worse, some areas with strong topographic gradients lead to an effective PSF with multiple modes and may cause the failure of standard operational orthorectification algorithms.

To anticipate these failures, we introduce critical incidence maps. For a given point on a DEM, such a map provides the maximum sensor incidence angle at which this point remains visible, i.e. not occluded by surrounding terrain in any azimuthal direction. We show that, for moderate incidence angles (typically below ~20°), the rugged areas responsible for orthorectification failures cover only a very small fraction of Earth's surface, therefore leaving ample room for more robust algorithms to tackle these thorny points.

For smooth regions, we introduce and compare various semi-analytical orthorectification schemes that achieve appealing trade-offs between computational cost and geometric precision. We then combine two distinct orthorectification strategies, tailored respectively to smooth and rugged terrain, and express the overall ground-to-detector mapping as a sparse linear operator.

This formulation yields an efficient forward model that accurately captures terrain-induced PSF distortions, including multimodality. Finally, we apply this model to invert the data projection, from the detector to the Earth’s surface, in the context of a HEALPix discrete grid.
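
A toy sketch of the sparse-operator formulation (our illustration under simplified random footprints, not the authors' implementation):

```python
import numpy as np
from scipy import sparse
from scipy.sparse.linalg import lsqr

rng = np.random.default_rng(0)
n_ground, n_det = 10_000, 4_000   # ground cells (e.g. HEALPix) and detector samples

# Each detector sample integrates a handful of ground cells, weighted by the
# local (possibly multimodal) PSF; here footprints and weights are random.
rows = np.repeat(np.arange(n_det), 5)
cols = rng.integers(0, n_ground, size=rows.size)
weights = rng.random(rows.size)
A = sparse.csr_matrix((weights, (rows, cols)), shape=(n_det, n_ground))

x_true = rng.random(n_ground)     # scene on the ground grid
y = A @ x_true                    # forward model: ground -> detector

# Inversion back to the ground grid as a sparse least-squares problem
x_hat = lsqr(A, y, atol=1e-10, btol=1e-10)[0]
```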

How to cite: Richard, P., Delouis, J.-M., Magin, J., and Odaka, T.: Modeling orthorectification and PSF distortions on a HEALPix grid, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-17943, https://doi.org/10.5194/egusphere-egu26-17943, 2026.

EGU26-18064 | ECS | Orals | ESSI2.3

Virtual Zarr for Ensemble Prediction Systems: VirtualiZarr Custom Parsers for Cloud-Native GRIB Access  

Hillary Koros, Nishadh Kalladath, Max Jones, Sean Harkins, Jason Kinyua, Mark Lelaono, Ezra Limo, Masilin Gudoshava, and Ahmed Amdihun

Global Ensemble Prediction Systems (EPS) from ECMWF and NOAA, such as IFS and GEFS, generate petabyte-scale datasets essential for early warning systems, probabilistic forecasting, and AI/ML weather applications. However, the GRIB format, designed for efficient archival storage, resists cloud-native random-access patterns. Converting archives to Analysis Ready Cloud Optimized (ARCO) formats would require prohibitive storage duplication. Virtual Zarr datasets, enabled by the VirtualiZarr library, offer a transformative alternative: lightweight reference layers exposing original GRIB files through cloud-native interfaces without data conversion.

This approach creates a win-win-win solution. Data producers maintain GRIB files without additional processing. Cloud providers serve data efficiently through byte-range requests. End users access ensemble forecasts via familiar tools (xarray, Dask) as if the data were in Zarr format. Previous work on the Grib-Index-Kerchunk (GIK) method (https://github.com/icpac-igad/grib-index-kerchunk) demonstrated this paradigm by exploiting a critical insight: GRIB index files (.idx text for GEFS, .index JSON for ECMWF) contain all the byte-offset information needed for virtual reference creation. Rather than scanning the entire corpus of GRIB files (computationally expensive at ~2,400 files per GEFS run, or ~85 files of ~5 GB each for ECMWF), the GIK method reads only the lightweight index files (a few KB to a few MB each) plus 1-2 sample GRIB files to extract the metadata structure. This achieves regional data access while reading less than 5% of the original GRIB data.
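
The core of the index-based trick can be sketched in a few lines (URLs and field positions are illustrative of the GEFS-style text index; the real parsers handle many more cases):

```python
import requests

# Hypothetical locations of one GRIB2 file and its text index
grib_url = "https://example-bucket/gefs.t00z.pgrb2a.0p50.f024"
idx_url = grib_url + ".idx"

# Each index line: "msg_no:byte_offset:d=YYYYMMDDHH:VAR:level:forecast:..."
records = [ln.split(":") for ln in requests.get(idx_url).text.strip().splitlines()]

for i, rec in enumerate(records):
    if rec[3] == "TMP" and rec[4] == "2 m above ground":
        start = int(rec[1])
        # this message ends where the next one starts (open-ended for the last)
        end = int(records[i + 1][1]) - 1 if i + 1 < len(records) else ""
        msg = requests.get(grib_url, headers={"Range": f"bytes={start}-{end}"}).content
        break

# `msg` now holds a single GRIB2 message, fetched without downloading the full
# file; a decoder such as gribberish or cfgrib can turn it into an array.
```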

Building on this foundation, we develop GEFS and ECMWF custom parsers following the VirtualiZarr Parser protocol, with a native Zarr v3 ArrayBytesCodec using gribberish, a Rust-based decoder delivering order-of-magnitude performance improvements. Following the patterns of the HRRR parser (https://github.com/virtual-zarr/hrrr-parser), our parsers construct a chunk manifest store. Virtual references persist to Icechunk transactional storage following the Zarr specification, enabling version-controlled datasets whose chunks reference the original GRIB bytes. The resulting stores integrate with xarray and Dask for parallel ensemble processing across 30-51 members and 85+ forecast timesteps.

For regional climate centers, this replaces custom pipelines with community-extensible parsers. By contributing GEFS- and IFS-specific custom parsers to VirtualiZarr, we transform an operational necessity into reusable infrastructure, enabling cloud-native ensemble access as simple as `xr.open_zarr("icechunk://gefs")`.

How to cite: Koros, H., Kalladath, N., Jones, M., Harkins, S., Kinyua, J., Lelaono, M., Limo, E., Gudoshava, M., and Amdihun, A.: Virtual Zarr for Ensemble Prediction Systems: VirtualiZarr Custom Parsers for Cloud-Native GRIB Access, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-18064, https://doi.org/10.5194/egusphere-egu26-18064, 2026.

EGU26-18208 | ECS | Posters on site | ESSI2.3

Ellipsoidal HEALPix for the Earth sciences: healpix-geo and xdggs integration into the Pangeo ecosystem 

Justus Magin, Benoît Bovy, Pablo Richard, Jean-Marc Delouis, and Tina Odaka

Discrete Global Grid Systems (DGGS), and in particular HEALPix, have become increasingly popular in the Earth sciences over the past few years, mainly due to HEALPix's equal-area nature and readily available Python libraries. However, this adoption and the ever-increasing amounts of data come with their own set of challenges. In particular, the existing Python libraries were written for use in astronomy and thus only work on a sphere, resulting in small but often non-negligible variations in cell areas when applied to the surface of the Earth.

Additionally, large spatial coverage at high resolutions requires a large amount of memory just to represent the spatial information (e.g. roughly 100 GB for full Earth coverage at 100 m resolution).

Finally, the storage format of HEALPix was not standardized until very recently, with the release of version 1.13 of the CF conventions and the upcoming Zarr DGGS convention.

We present healpix-geo, a HEALPix implementation library for Python built on top of the cdshealpix, moc, and geodesy Rust crates with minimal Python dependencies (numpy and, optionally, shapely). It supports the most common HEALPix indexing schemes (nested, ring, zuniq), allows the conversion of cell indices to and from ellipsoidal coordinates, and contains a range-based data structure suitable for indexing a large number of cells with a small memory footprint.

We further show how healpix-geo integrates with xdggs, an xarray extension that enables high-level interaction with DGGS datasets, including efficient subsetting, analysis-ready representations, and visualization within Pangeo workflows. 

xdggs also provides an extensible mechanism to easily import/export DGGS data from/to a variety of models or conventions, with built-in support for the CF HEALPix conventions and the Zarr DGGS convention. Together, healpix-geo and xdggs provide an end-to-end, standards-aligned pathway for scalable HEALPix-based geospatial analysis on the ellipsoid.
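
For readers unfamiliar with xdggs, the intended interaction is roughly the following (a sketch assuming a dataset whose cell-ids coordinate already carries CF HEALPix metadata; names follow the xdggs documentation and should be treated as indicative):

```python
import xarray as xr
import xdggs

# dataset with a "cell_ids" coordinate annotated per the CF HEALPix conventions
ds = xr.open_dataset("healpix_example.zarr", engine="zarr")

ds = ds.pipe(xdggs.decode)        # build the DGGS-aware index

centers = ds.dggs.cell_centers()  # cell centres as lon/lat coordinates
```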

How to cite: Magin, J., Bovy, B., Richard, P., Delouis, J.-M., and Odaka, T.: Ellipsoidal HEALPix for the Earth sciences: healpix-geo and xdggs integration into the Pangeo ecosystem, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-18208, https://doi.org/10.5194/egusphere-egu26-18208, 2026.

EGU26-18770 | ECS | Posters on site | ESSI2.3

Energy Efficiency in Cloud-Based Earth Observation Data Processing: Gap Analysis and Research Directions 

Adhitya Bhawiyuga, Serkan Girgin, Rolf A. de By, and Raul Zurita-Milla

The Earth observation (EO) community has increasingly adopted cloud platforms for processing large datasets, while EO data archives grow by approximately 100 PB annually. However, the energy costs and environmental footprint of this processing remain largely invisible. This oversight is particularly paradoxical for a community focused on environmental monitoring and climate mitigation. In this study, we present a gap analysis of energy awareness and energy efficiency in cloud-based EO data processing, using Pangeo's Kubernetes-based architecture as a case study. Through literature review and architectural analysis, we identify five interconnected problems that prevent energy-efficient cloud operations in the EO domain.

According to our analysis, the most critical gap is the absence of granular energy observability. While Pangeo deployments on self-managed Kubernetes can access resource metrics, e.g. through Prometheus, they lack energy attribution at the task level. Tools like Kepler provide pod-level power estimates on bare-metal infrastructure but face limitations in virtualized cloud environments, where hypervisors restrict hardware sensor access. On fully managed cloud platforms, limited provider transparency worsens the problem, as providers offer only monthly service-level carbon footprints. Without this visibility, researchers can optimize workflows only on the basis of execution time and cost, leaving energy efficiency as an invisible dimension. Furthermore, the EO community lacks standardized benchmarking frameworks for evaluating energy-performance trade-offs in realistic workflows. Researchers reporting energy improvements for specific algorithms cannot provide reproducible comparisons, as different studies use varying datasets, baseline systems, and measurement methodologies.

From a system-level perspective, current Kubernetes orchestration policies optimize for resource availability and load balancing but ignore hardware-specific energy profiles. Pangeo deployments consequently distribute workloads across multiple underutilized nodes rather than consolidating them to enable node shutdown. Similarly, Dask schedulers prioritize data locality and workload balance but cannot incorporate energy awareness when assigning tasks. When processing continent-scale mosaicking operations, schedulers may mismatch task characteristics with hardware capabilities, assigning compute-intensive operations to high-power nodes when energy-efficient alternatives could handle the workload.

To address these interconnected gaps, we propose a multi-phase research roadmap. The first phase should focus on developing energy monitoring toolkits that synthesize hardware sensors with application profiling and modeling frameworks to account for hidden energy consumption in unmeasured components such as disk I/O and network peripherals. This phase should also establish standardized benchmarking frameworks comprising representative EO big data (EOBD) workflows to enable reproducible energy-performance evaluation across different platforms and algorithms. Building on this measurement infrastructure, subsequent phases should develop predictive models that estimate task-level energy consumption from workflow characteristics and hardware specifications before execution takes place. These models would enable proactive decisions about algorithm selection, hardware provisioning, and resource allocation. The final phase focuses on system-level optimization by designing energy-aware Kubernetes orchestration through workload consolidation and heterogeneous hardware selection. This phase also includes developing multi-objective task schedulers for distributed frameworks like Dask that co-optimize energy consumption, execution time, and cost when assigning tasks to worker nodes. These directions aim to make energy consumption a measurable, optimizable metric in cloud-based EO processing, aligning computational practices with environmental sustainability goals.

How to cite: Bhawiyuga, A., Girgin, S., de By, R. A., and Zurita-Milla, R.: Energy Efficiency in Cloud-Based Earth Observation Data Processing: Gap Analysis and Research Directions, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-18770, https://doi.org/10.5194/egusphere-egu26-18770, 2026.

EGU26-18919 | Posters on site | ESSI2.3

JupyterGIS: A Flexible, Open-Source Platform for Geospatial Analysis 

Sylvain Corlay, Matthias Meschede, Martin Renou, Gregory Mooney, and Arjun Verma

JupyterGIS is an open-source web-GIS (Geographic Information System) designed to bring the iterative and interactive workflows of Jupyter to geospatial data analysis. By leveraging the Jupyter ecosystem, it seamlessly interleaves code and visualization, providing access to the vast range of existing geospatial libraries and interfaces.

The architecture of JupyterGIS is based on a single, serializable JSON project document that encapsulates all project information. This document is implemented as a collaborative Conflict-free Replicated Data Type (CRDT), a "ydoc", ensuring real-time synchronization when edited by multiple instances or components simultaneously. This design enables teams to collaboratively work on geospatial data in real time, a feature particularly valuable for organizations. Additionally, it opens possibilities for co-editing with LLM-based AI agents, greatly expanding the potential for automation and advanced analysis.

JupyterGIS offers very flexible deployment options. It can run on high-performance backend servers, including scalable Kubernetes clusters, to handle large-scale datasets and computationally intensive tasks, such as those commonly encountered in Earth Observation applications. Alternatively, it can be deployed as a static website via WebAssembly and JupyterLite, executing computations directly in the user's browser. The latter eliminates the need for any backend infrastructure, making JupyterGIS suitable for creating embeddable, highly scalable, and accessible applications, such as lightweight embedded maps.

Initiated in 2024, JupyterGIS is a young but rapidly growing project that has garnered significant attention, community contributions, and organizational support, bundled within the Pangeo and GeoJupyter initiatives. As a fully open-source and sovereign solution, it provides a self-hostable alternative to proprietary platforms. This is particularly advantageous for handling sensitive data, as all components are auditable and under the user's control. Its modular and extensible architecture also ensures easy integration into existing systems and adaptability to new use cases.

JupyterGIS thus serves multiple roles for working with geospatial data: as a local or remote Integrated Development Environment (IDE), as an interface integrated into large-scale organizational portals, and as an embedded solution for small maps and web applications.

Our overview of JupyterGIS will cover its underlying architecture, showcase the UI and features with examples, and compare its strengths and weaknesses with other platforms. The goal is to provide a comprehensive understanding of this novel tool, enabling listeners to assess its applicability to their use cases and to guide them on how to get started.

How to cite: Corlay, S., Meschede, M., Renou, M., Mooney, G., and Verma, A.: JupyterGIS: A Flexible, Open-Source Platform for Geospatial Analysis, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-18919, https://doi.org/10.5194/egusphere-egu26-18919, 2026.

EGU26-19899 | Orals | ESSI2.3

Cloud-based orchestration of the biogeochemical forecasting system for Italian seas within the MER project 

Jacopo Nespolo, Matteo Poggi, Cecilia Zagni, Alberto Pastorutti, Stefano Querin, Giorgio Bolzon, Stefano Piani, Fabio Di Sante, Gian Franco Marras, Gabriella Scipione, Antonello Bruschi, and Francesca Catini

We present the biogeochemical forecasting system developed within the MER (Marine Ecosystem Restoration) project (Actions B32-B35). This case study leverages cloud-based workflow orchestration and traditional HPC systems to deliver daily operational marine biogeochemistry forecasts for Italian seas, as a downscaling of the Copernicus Marine Service (CMS).

The basins are divided into 7 regional high-resolution domains at ~500 m resolution and a further 10 selected very high-resolution nested sites at ~100 m resolution. The downscaling pipelines we implemented are responsible for retrieving heterogeneous input data from multiple third-party sources (CMS, EFAS, ItaliaMeteo, ECMWF), preprocessing them to feed the MITgcm-BFM coupled physical-biogeochemical model, postprocessing the outputs, and publishing the final products. The implementation further provides observability, failsafes and fallbacks in case of missing data, and notifications regarding the status of operations.

Such a complex operational oceanographic system faces competing requirements: on one hand, computationally intensive numerical simulations demand HPC resources; on the other, the orchestration of several interdependent extract-transform-load workflows, whilst guaranteeing monitoring and observability, requires capable management systems. These are often incompatible with HPC cluster policies (e.g., length of standing processes, security, …) and better suited to a cloud environment. On top of this, care must be taken to manage large volumes of data between the orchestrator and the HPC cluster.

We address these competing requirements through a hybrid architecture that combines cloud computing with HPC infrastructures for workflow orchestration and compute-intensive simulations, respectively. Our system, rewritten following software engineering best practices (modular architecture, separation of concerns, CI testing, …), employs Apache Airflow as the workflow manager, deployed in a fully containerised fashion on CINECA's OpenStack-based cloud infrastructure. A custom integration layer allows interfacing with the Slurm workload manager, offloading computationally intensive tasks onto CINECA's Leonardo HPC cluster. Parallel computing and distributed filesystems are efficiently exploited through modern technologies, particularly the cloud-native Zarr data format in conjunction with xarray and dask as Python-based numerical computing libraries.
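
The cloud-to-HPC hand-off can be pictured with a deliberately simplified DAG (our sketch with hypothetical connection ids and script names, not the MER code base, which uses a custom Slurm integration layer):

```python
import pendulum
from airflow.decorators import dag, task
from airflow.providers.ssh.operators.ssh import SSHOperator

@dag(schedule="0 3 * * *", start_date=pendulum.datetime(2026, 1, 1, tz="UTC"))
def daily_biogeochemical_forecast():
    @task
    def fetch_inputs():
        ...  # retrieve CMS/EFAS/ECMWF forcings and stage them as Zarr

    # Offload the coupled MITgcm-BFM run to the HPC cluster via Slurm
    run_model = SSHOperator(
        task_id="run_mitgcm_bfm",
        ssh_conn_id="leonardo_hpc",                  # hypothetical connection
        command="sbatch --wait run_forecast.slurm",  # blocks until job completion
    )

    @task
    def publish_products():
        ...  # postprocess outputs and publish the final products

    fetch_inputs() >> run_model >> publish_products()

daily_biogeochemical_forecast()
```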

Our setup demonstrates the viability of hybrid cloud-HPC architectures for operational Earth system modelling. It meets efficiency and scalability goals that would be challenging with either infrastructure alone. The software is planned to be open-sourced in the second half of 2026.

This work is developed by eXact lab Srl in partnership with OGS and CINECA within the MER project, led by ISPRA, funded by the NextGenerationEU program (Italian National Recovery and Resilience Plan, investment M2C4 ‐ I3.5).

How to cite: Nespolo, J., Poggi, M., Zagni, C., Pastorutti, A., Querin, S., Bolzon, G., Piani, S., Di Sante, F., Marras, G. F., Scipione, G., Bruschi, A., and Catini, F.: Cloud-based orchestration of the biogeochemical forecasting system for Italian seas within the MER project, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-19899, https://doi.org/10.5194/egusphere-egu26-19899, 2026.

EGU26-21049 | Posters on site | ESSI2.3

Advancing cloud-native data access and processing for Natural and Engineering Sciences: CLOUD-NES 

Serkan Girgin, Francesco Nattino, Martin Brandt, and Maarten Plieger

Data accessibility is crucial in modern research across the Natural and Engineering Sciences (NES), including the Geosciences, and is central to the push toward Open Science. Yet accessing and efficiently processing rapidly growing datasets, such as Earth-related spatiotemporal data, remains challenging as sources diversify and collection frequency increases. Most of these datasets are hosted in the cloud, and cloud-native data access and processing are becoming core digital competences. Cloud-based processing is especially beneficial because bringing computation close to the data boosts efficiency and reduces analysis time. Despite this, many researchers still rely on the inefficient approach of downloading data for local analysis. Sometimes this is unavoidable because the data are not provided in cloud-friendly formats, but often it also reflects a lack of skills for cloud-based access and processing. A similar problem occurs in data publishing, where research datasets are frequently shared in formats that impede efficient cloud access and interoperability, even though cloud-optimized formats could be used at no additional cost.

The CLOUD-NES project, funded by the Dutch Research Council (NWO) via the Thematic Digital Competence Centre NES (TDCC-NES), aims to advance cloud-native tools and workflows for publishing, accessing, and processing research data in the Netherlands. The project demonstrates the benefits of cloud-native approaches through reproducible performance benchmarks and equips researchers with practical training to strengthen digital competencies. A public cloud-native data repository with co-located analysis capabilities is being developed, featuring object-based scalable storage and a STAC-compliant data catalog, and hosting selected datasets from large-scale geospatial data providers in the Netherlands, such as PDOK and KNMI, transformed into cloud-optimized formats. Through iterative benchmarking, we are assessing the performance of cloud-native storage formats, access patterns, and analysis workflows, generating reproducible evidence of efficiency gains to support community adoption. All infrastructure, ingestion pipelines, and benchmarking code will be open source, accompanied by detailed guidelines and documentation. To further accelerate adoption, domain-specific open training materials will be developed and hands-on workshops for researchers and data providers will be organized. Training covers cloud-native data access, workflow design, dataset publishing, and infrastructure deployment, using common domain-specific workflows as case studies. Community events and mini-symposia will foster community building and knowledge exchange, while lessons learned and best practices will be disseminated nationally and internationally.

By combining demonstrable benchmarks, practical training, and clear guidance for data providers, CLOUD-NES aims to accelerate the adoption of cloud-native research practices across the Dutch research community and beyond, improving efficiency, reproducibility, and accessibility of large, complex datasets. This presentation provides an overview of the CLOUD-NES project, covering the design and operation of its reproducible cloud-native benchmarking framework and the structure of its open training materials. Planned project activities, including community-building events and mini-symposia on effective cloud-native practices, will also be highlighted.

How to cite: Girgin, S., Nattino, F., Brandt, M., and Plieger, M.: Advancing cloud-native data access and processing for Natural and Engineering Sciences: CLOUD-NES, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-21049, https://doi.org/10.5194/egusphere-egu26-21049, 2026.

EGU26-21395 | Orals | ESSI2.3

From Disparate Datasets to Analysis-Ready Data Cubes with Pangeo on EarthCODE 

Krasen Samardzhiev, Deyan Samardzhiev, Anca Anghelea, and Ewelina Dobrowolska

The EarthCODE Open Science Catalog (https://opensciencedata.esa.int/catalog) currently contains over 300 data products, most of them the result of peer-reviewed scientific research. These exist as disparate individual datasets, mostly grouped under themes or variables. This fragmentation creates a barrier to interoperability: a scientist has to manually combine the datasets, for example by reprojecting, regridding, or temporally resampling heterogeneous data.

EarthCODE is creating a new category of products, combined data cubes for each of the Open Science Catalog's themes, to streamline access for researchers and ensure the data is truly analysis-ready (ARD). Combining the data products onto a single grid and a single projection will drastically reduce the overhead researchers need to harmonize the appropriate datasets. The workflow focuses on combining different datasets and on collaborating with scientists to curate the appropriate data and to minimise disruption during the transformation process, since any reprojection or regridding introduces uncertainties.

We demonstrate the efficacy of this Pangeo-aligned workflow through the Antarctica InSync project (https://discourse-earthcode.eox.at/t/antartica-insync-data-cubes/107). This was a multi-stage pipeline that included close collaboration with the scientific community. The first step was aggregating the relevant Antarctic datasets. This step by itself is important, since it centralizes domain knowledge and ensures the Open Science Catalog contains the latest datasets relevant to the research community.

The second step involved processing the data using cloud-native tools to convert it to the same projection, a common grid, and in some cases the same resolution (creating coherent STAC Collections). The third step involved generating detailed metadata at the variable level for all datasets to ensure high findability and reusability. Furthermore, we provide visualisation tools to explore the data cube via cloud-optimized formats, without downloading it, alongside a discussion forum. To foster open science and reproducibility, our accompanying library will contain all the generalizable functions used to generate these data, allowing the community to reuse the workflows for other domains.
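
A minimal sketch of the harmonisation idea in the second step (our illustration with synthetic products, not the project pipeline):

```python
import numpy as np
import xarray as xr

# Two synthetic products on different grids
product_a = xr.Dataset(
    {"sst": (("lat", "lon"), np.random.rand(90, 180))},
    coords={"lat": np.linspace(-89, 89, 90), "lon": np.linspace(-179, 179, 180)},
)
product_b = xr.Dataset(
    {"sea_ice": (("lat", "lon"), np.random.rand(45, 90))},
    coords={"lat": np.linspace(-88, 88, 45), "lon": np.linspace(-178, 178, 90)},
)

# Common target grid for the theme cube
lat = np.arange(-90.0, 90.5, 0.5)
lon = np.arange(-180.0, 180.0, 0.5)

def harmonise(ds):
    # Resample onto the shared grid; a production pipeline would also
    # reproject and track the uncertainty this step introduces.
    return ds.interp(lat=lat, lon=lon)

cube = xr.merge([harmonise(product_a), harmonise(product_b)])
```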

How to cite: Samardzhiev, K., Samardzhiev, D., Anghelea, A., and Dobrowolska, E.: From Disparate Datasets to Analysis-Ready Data Cubes with Pangeo on EarthCODE, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-21395, https://doi.org/10.5194/egusphere-egu26-21395, 2026.

EGU26-21900 | Posters on site | ESSI2.3

Supporting remote access to HDF5 datasets 

David Hassell, Valeriu Predoi, Bryan Lawrence, Ezequiel Cimadevilla, and Kai Mühlbauer

Programmatic access to remote high-volume multi-dimensional geophysical data was nearly impossible before the advent of high-speed networks and public cloud storage. Even then, data often had to be made "analysis-ready" before such access was possible. However, once analysis-ready data are available, remote access becomes possible, with only the bytes needed by the client transferred across the network. In many cases such access will be faster and more energy efficient than downloading the entire dataset that contains the relevant variables (or parts of variables). Additionally, even when it is not more efficient than downloading data on a case-by-case basis, it may not be possible to cache the data locally, and remote access may be the only option. Hence, the notion of analysis-ready data has become very popular, and it has often been understood to mean "made available on an object store in Zarr format". However, the key aspects of analysis-ready data can be delivered via other interfaces and formats, provided the right software stack is available.

Here we present such a stack in the context of how we expect to enable remote access to NetCDF4 data from the upcoming CMIP7 Assessment Fast Track (and other data to be held in the newly upgraded Earth System Grid Federation, ESGF). The new ESGF will expose data via HTTP servers supporting range-get requests for portions of files, which provides essentially the same remote-access capabilities as an object store. The requirements for using such a stack, for the new ESGF and for object stores alike, are: (1) the data to be accessed must be appropriately chunked (partitioned into suitably dimensioned hyperslabs); (2) the chunk indices must be efficiently stored; and (3) the reading software, using tools such as Dask, must be fully parallelisable. If either of the first two criteria is not met, data access can be impossibly slow even for relatively small problems, and if the third is not met, large problems cannot be addressed efficiently.

To address the first two issues, we present: `cmip7-repack`, a tool to ensure that key aspects of the CMIP7 data are chunked appropriately; `pyfive`, a pure-Python, thread-safe library for reading HDF data performantly in both serial and parallel applications; and a `pyfive`-enabled version of the `h5netcdf` library for facilitating remote and/or parallel data access using the NetCDF4 API. With these tools we are able to show that reformatting data from the NetCDF4 format preferred by modellers into additional formats such as Zarr, and/or maintaining duplicate copies of chunk indices made by tools such as kerchunk, will no longer be necessary for most workloads.
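
The user-facing side of this stack can be sketched as follows (hypothetical URL; with the `pyfive`-enabled backend, the call pattern is intended to stay the same as with the standard `h5netcdf`):

```python
import fsspec
import xarray as xr

url = "https://esgf.example.org/cmip7/tas_day_example.nc"  # hypothetical file

# fsspec exposes the remote file through HTTP range-get requests
f = fsspec.open(url, mode="rb").open()
ds = xr.open_dataset(f, engine="h5netcdf", chunks={})

# If criteria (1) and (2) hold (good chunking, cheap chunk index), this
# selection transfers only the chunks it intersects, in parallel via Dask.
month = ds["tas"].isel(time=slice(0, 31)).mean()
```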

How to cite: Hassell, D., Predoi, V., Lawrence, B., Cimadevilla, E., and Mühlbauer, K.: Supporting remote access to HDF5 datasets, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-21900, https://doi.org/10.5194/egusphere-egu26-21900, 2026.

EGU26-1703 | Orals | ESSI2.5

Innovations and Future Directions in the AuScope Virtual Research Environment 

Jens Klump, Alex Hunt, Vincent Fazio, and Pavel Golodoniuc

The AuScope Virtual Research Environment (AVRE) is a platform advancing geoscience research through integrated data access, analysis, and interoperability. Having evolved from the original AuScope GRID, AVRE now underpins the Data Lens of the Downward-Looking Telescope (DLT), which describes AuScope as a national research infrastructure for the geosciences. AVRE gives researchers unified access to geoscience data from other AuScope programmes and from the government geological survey organisations.

Looking forward, AVRE is enhancing the findability, accessibility, and interoperability of its dataset catalogue through a new Python package and QGIS plugin. Significant additions to the AVRE services portfolio are the AuScope Data Repository, Sample Repository, Instrument Register, and Research Activity Identifier (RAiD) Register. The findability of resources will be improved by implementing natural language search powered by large language models. These innovations, together with continued integration of new catalogues and repositories, alongside robust user engagement and analytics, will ensure AVRE remains a cornerstone for collaborative, data-driven geoscience in Australia.

How to cite: Klump, J., Hunt, A., Fazio, V., and Golodoniuc, P.: Innovations and Future Directions in the AuScope Virtual Research Environment, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-1703, https://doi.org/10.5194/egusphere-egu26-1703, 2026.

EGU26-3749 | ECS | Posters on site | ESSI2.5

Galaxy for Earth System Sciences: An Open Platform for Analysis, Sharing, and Training 

Marie Jossé, Pauline Seguineau, Yvan Le Bras, Jérôme Detoc, Erwan Bodéré, Eric Lecaude, Sylvain Grellet, and Karim Ramage

The Earth System is a complex and dynamic system that encompasses the interactions between the atmosphere, oceans, land, and biosphere. Thus, Earth system science relies on heterogeneous data. It addresses key scientific questions such as land-atmosphere interactions but also studies the impacts of climate change. These studies require the integration of multiple datasets and methods, often across disciplinary boundaries. In practice, however, workflows are frequently implemented using locally installed tools, obsolete scripts and isolated computing environments, making analyses difficult to reproduce, share and reuse.

Galaxy addresses these needs by providing a Virtual Research Environment: an open, comprehensive, and sustainable web platform for understanding and analyzing data. This platform has been tailored to Earth science studies as Galaxy for Earth System Sciences (GESS, https://earth-system.usegalaxy.eu/ or https://earth-system.usegalaxy.fr/). Galaxy enables users to access data, tools and computing resources, allowing them to construct, execute and share analysis workflows without requiring programming skills. GESS extends the Galaxy framework by integrating tools, data formats and workflows commonly used in Earth system sciences, covering various scientific domains related to the study of climate, atmosphere, oceans, land surfaces and biosphere processes.

A main advantage of Galaxy lies in its workflow-based approach. Scientific analyses are processed as workflows that capture all steps, parameters and software versions, ensuring reproducibility and transparency. These workflows can be reused, adapted and shared, enabling collaboration. GESS supports the execution of workflows analysing large datasets on distributed computing infrastructures, removing technical difficulties for the end user.

To facilitate the use and understanding of Galaxy, a structured collection of training materials has been developed to help users adopt the platform and good practices in Earth system data analysis. These tutorials range from introductions to Galaxy concepts (data management, workflow construction, reproducibility) to domain-specific examples based on Earth science use cases. By combining hands-on tutorials with executable workflows, GESS provides a practical learning environment that supports both individual skill development and community-wide efforts.

This presentation provides an overview of Galaxy for Earth System Sciences. We will present the platform, its representative tools and workflows, and the associated training ecosystem. Finally, we will share lessons learned from deploying GESS and perspectives for further development to support Earth system science.

How to cite: Jossé, M., Seguineau, P., Le Bras, Y., Detoc, J., Bodéré, E., Lecaude, E., Grellet, S., and Ramage, K.: Galaxy for Earth System Sciences: An Open Platform for Analysis, Sharing, and Training, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-3749, https://doi.org/10.5194/egusphere-egu26-3749, 2026.

EGU26-3916 | ECS | Orals | ESSI2.5

ISOTOPE STUDIO: A Virtual Research Environment for Standardized Isotope Data Management and Modelling 

Paolo Di Giuseppe, Simona Gennaro, Erico Perrone, Eugenio Trumpy, Samuele Agostini, Marco Procaccini, and Antonello Provenzale

ISOTOPE STUDIO is a web application provided by the Isotope Virtual Research Environment (ISOTOPE VRE) and developed in the frame of the Italian Integrated Environmental Research Infrastructures System (ITINERIS) project. The ISOTOPE VRE, powered by the D4Science Digital Research Infrastructure, adheres to Open Science and FAIR principles, supporting transparency, collaboration, and inclusivity throughout the entire research data lifecycle, and offering analytical tools and harmonized data practices.

A key feature of ISOTOPE STUDIO is the dynamic data harmonization system, designed to address the heterogeneity of geochemical and isotopic datasets. User-submitted data, which can differ widely in format and structure, are assimilated into ISOTOPE STUDIO in a standardized, searchable, and freely downloadable format, while the originals are stored without any alteration. Harmonization is applied during data presentation, ensuring consistent metadata annotation and enhancing the reliability of workflows and modelling. The key tool of this process is the dedicated data submission template, which represents a first attempt at proposing an international standard for isotope data. Although no global standard currently exists, ISOTOPE STUDIO proposes this model as a starting point that can be updated and improved over time. The template is easy to fill in and specifically designed to accommodate a wide variety of large and complex geochemical and isotopic datasets. For heterogeneous datasets, the harmonization system dynamically re-organizes them according to the template structure, ensuring consistency and interoperability. By guiding users through harmonized data entry, the template promotes transparency, reusability, and inclusivity across different research domains.
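As a purely illustrative sketch of such template-driven harmonization, the snippet below maps incoming column names onto standardized template fields; all field names are hypothetical and are not the actual ISOTOPE STUDIO template.

```python
# Illustrative sketch of template-driven harmonization: incoming columns
# are renamed onto standardized template fields while the raw submission
# stays untouched. All field names here are hypothetical.
import pandas as pd

COLUMN_MAP = {
    "Sample ID": "sample_id",
    "87Sr/86Sr": "sr87_sr86",
    "d18O (permil)": "delta_o18_vsmow",
    "T (degC)": "temperature_c",
}

def harmonize(raw: pd.DataFrame) -> pd.DataFrame:
    """Rename known columns to template fields; keep the raw frame intact."""
    known = {c: COLUMN_MAP[c] for c in raw.columns if c in COLUMN_MAP}
    out = raw[list(known)].rename(columns=known)
    # Template fields absent from the submission stay explicit (and empty),
    # so gaps are visible rather than silent.
    for field in COLUMN_MAP.values():
        if field not in out.columns:
            out[field] = pd.NA
    return out
```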

Built on a three-layer architecture (relational database, REST APIs, and web interface), the ISOTOPE VRE also integrates detailed metadata describing sample characteristics and analytical quality.

Beyond harmonization, ISOTOPE STUDIO provides versatile modelling tools for the analysis of diverse geochemical and isotopic datasets, including binary and ternary plots, normalized spider diagrams, and mixing models that would not be possible without the harmonization process. ISOTOPE STUDIO accommodates a wide range of geochemical data, including major and trace elements, intensive parameters (e.g., pressure and temperature), and isotopic compositions of diverse types of matrices.

How to cite: Di Giuseppe, P., Gennaro, S., Perrone, E., Trumpy, E., Agostini, S., Procaccini, M., and Provenzale, A.: ISOTOPE STUDIO: A Virtual Research Environment for Standardized Isotope Data Management and Modelling, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-3916, https://doi.org/10.5194/egusphere-egu26-3916, 2026.

Contemporary Earth and space science relies on complex infrastructures that connect observations, data transmission, processing systems and digital services. A critical but often under-recognized component of this system is access to radio-frequency spectrum, which enables both the acquisition of observations and the real-time exchange of data from meteorological satellites, weather radars, radiosondes, space weather sensors and other Earth observation systems. The radio-frequency spectrum is a physically limited and increasingly contested resource.

The World Meteorological Organization (WMO), a specialized agency of the United Nations with 193 Members, provides the global coordination framework for observing and data systems through the WMO Integrated Global Observing System (WIGOS), the WMO Information System (WIS) and the WMO Integrated Processing and Prediction System (WIPPS). These systems support standardized observations, global data exchange and the delivery of operational services that underpin numerical weather prediction, climate monitoring, hydrology and environmental applications worldwide.

Within this framework, the WMO Space Programme coordinates international activities related to the availability and use of satellite data and products, capacity development, space weather coordination, and cooperation on radio-frequency spectrum use. This includes engagement with scientific and regulatory communities through the Expert Team on Radio-Frequency Coordination, contributions to international technical studies, development of joint guidance (e.g., WMO–International Telecommunication Union (ITU) handbooks), and coordinated preparations for the World Radiocommunication Conference 2027 (WRC-27).

This presentation frames spectrum coordination as a core element of Earth observation data systems and highlights its role in maximizing the economic, social and environmental value of global meteorological infrastructures, including societally critical initiatives such as Early Warnings for All (EW4All).

How to cite: Donoho, N.: Radio-Frequency Spectrum for Earth Observation Data Systems: International Coordination toward WRC-27, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-5273, https://doi.org/10.5194/egusphere-egu26-5273, 2026.

EGU26-7865 | Orals | ESSI2.5

The ENVRI-Hub: A Platform for Advancing Environmental and Earth Sciences through Integrated Virtual Research Environments 

Ulrich Bundke, Daniele Bailo, Claudio Dema, Dario De Nart, Delphine Dobler, Federico Drago, Marta Gutierrez David, Anca Hienola, Andreas Petzold, Alex Vermeulen, and Zhiming Zhao

Modern Environmental and Earth sciences demand seamless integration across atmospheric, marine, terrestrial, and biodiversity data, a process often hindered by disciplinary silos. The ENVRI-Hub addresses this challenge directly by serving as the unified Virtual Research Environment (VRE) for Europe's Environmental Research Infrastructures. It moves beyond a simple data portal to function as an integrated platform where discovery, access, and analysis converge.

The hub provides researchers with a centralised gateway to discover and access FAIRified research assets enabled for cross-disciplinary work. Crucially, these assets can be leveraged in situ from any VRE through a unified, machine-actionable API and toolset that supports data analytics in scientific workflows. VREs will allow users to compose, execute, and share reproducible analytical pipelines, from accessing Essential Climate Variables (ECVs) to running complex AI analytics. This architecture not only streamlines the scientific process but also underpins applications such as environmental Virtual Laboratories and foundational work for future applications like Digital Twins.

This presentation will detail the ENVRI-Hub's technical architecture for enabling VRE support. We will demonstrate, through specific scientific use cases, how its Catalogue of Services and AI-powered Knowledge Base work synergistically to reduce data friction. The contribution will highlight how this integrated environment supports workflow builders in creating robust, cross-domain analyses, thereby accelerating scientific results and advancing collaborative, data-driven Environmental and Earth science.

How to cite: Bundke, U., Bailo, D., Dema, C., De Nart, D., Dobler, D., Drago, F., Gutierrez David, M., Hienola, A., Petzold, A., Vermeulen, A., and Zhao, Z.: The ENVRI-Hub: A Platform for Advancing Environmental and Earth Sciences through Integrated Virtual Research Environments, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-7865, https://doi.org/10.5194/egusphere-egu26-7865, 2026.

EGU26-9690 | Posters on site | ESSI2.5

Land and Marine earth science applications within a Downstream Virtual Research Environment 

Rachele Franceschini, Nydia Catalina Reyes Suarez, Alessandro Altenburger, Giuliana Rossi, and Alessandra Giorgetti

Within the framework of the ITINERIS project, funded by the NextGenerationEU Programme (2022–2025), activities focus on the downstream effects of climate and environmental change. The Downstream Virtual Research Environment (VRE) supports the use of Research Infrastructures by providing tools for data visualization, analysis, and sharing, and is hosted on the D4Science infrastructure, where dedicated marine and land domain toolboxes have been developed (Assante et al., 2019, 2021).

The marine domain toolbox leverages available datasets to create an integrated dataset of temperature, salinity, pH, and CO₂ for the Gulf of Trieste (Italy), with particular emphasis on data from the National Institute of Oceanography and Applied Geophysics (OGS) over the last ten years, used as a representative use case. After data harvesting, validation, quality control, and merging are carried out using ERDDAP Navigator, a web application that enables data visualization, quality flag assignment, analysis, and integration. The resulting integrated dataset is then employed to compute climate change indicators, including ocean acidification and ocean carbon cycle budgets.

The land domain toolbox is designed to analyse areas affected by hydrogeological hazards, with a specific focus on landslide processes. In this context, a GeoServer and a GeoNetwork have been implemented to host regional-scale maps. At the local scale, several monitoring systems—including interferometric radar, GPS, extensometers, inclinometers, a video camera, and a data coordinator—have been installed to identify potential ground instabilities. The monitoring instruments provide geospatial data from which time-series datasets are derived. All data products are documented and downloadable, and a dedicated web application supports time-series visualization and analysis.

Acknowledgements

The work has been funded by EU - Next Generation EU Mission 4 “Education and Research” - Component 2: “From research to business” - Investment 3.1: “Fund for the realisation of an integrated system of research and innovation infrastructures” - Project IR0000032 – ITINERIS - Italian Integrated Environmental Research Infrastructures System - CUP B53C22002150006.

The authors acknowledge the Research Infrastructures participating in the ITINERIS project with their Italian nodes: ACTRIS, ANAEE, ATLaS, CeTRA, DANUBIUS, DISSCO, e-LTER, ECORD, EMPHASIS, EMSO, EUFAR, Euro-Argo, EuroFleets, Geoscience, IBISBA, ICOS, JERICO, LIFEWATCH, LNS, N/R Laura Bassi, SIOS, SMINO.

 

References

Assante, M., Candela, L., Castelli, D., Cirillo, R., Coro, G., Frosini, L., Lelii, L., Mangiacrapa, F., Pagano, P., Panichi, G., & Sinibaldi, F. (2019). Enacting open science by D4Science. Future Generation Computer Systems, 101, 555–563. https://doi.org/10.1016/j.future.2019.05.063

Assante, M., Candela, L., & Pagano, P. (2021). Blue-Cloud D4.4 Blue Cloud VRE Common Facilities (Release 2). https://doi.org/10.5281/ZENODO.10070443

How to cite: Franceschini, R., Reyes Suarez, N. C., Altenburger, A., Rossi, G., and Giorgetti, A.: Land and Marine earth science applications within a Downstream Virtual Research Environment, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-9690, https://doi.org/10.5194/egusphere-egu26-9690, 2026.

EGU26-10490 | Orals | ESSI2.5

Blue-Cloud2026 project - Deploying Beacon data lakes for harmonizing ocean data access for Virtual Research Environments 

Tjerk Krijger, Peter Thijsse, Dick Schaap, Robin Kooyman, and Paul Weerheim

In order to provide users with fast and easy access to multidisciplinary data originating from large collections, MARIS has developed a software system called Beacon that can extract specific data on the fly, with high performance, based on the user's request. This software has been customised and deployed in the Blue-Cloud2026 project and several other European projects, and is designed to return one single harmonised file as output, regardless of whether the input contains different data types. Beacon is fully open-source (AGPLv3), allowing everyone to set up their own Beacon ‘node’ to enhance access to their data, or to use existing Beacon nodes of well-known data infrastructures such as Euro-Argo, ERA5, or the World Ocean Database for fast and easy access to harmonized data subsets. More technical details, example applications, and general information on Beacon can be found on the website https://beacon.maris.nl/.
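To convey the interaction pattern only, a hedged sketch follows; the endpoint path and JSON query fields are placeholders invented for illustration, not the documented Beacon request syntax (see https://beacon.maris.nl/ for the actual specification).

```python
# Interaction pattern only: request a harmonised subset from a Beacon node
# and save the single output file. Endpoint path and JSON body are
# placeholders, NOT the real Beacon API.
import requests

query = {
    "variables": ["TEMPERATURE", "PRACTICAL_SALINITY"],
    "bbox": [-10.0, 35.0, 5.0, 45.0],        # lon/lat bounding box
    "time": ["2010-01-01", "2020-12-31"],
    "output": "netcdf",
}
resp = requests.post("https://example-beacon-node/api/query",
                     json=query, timeout=300)
resp.raise_for_status()
with open("subset.nc", "wb") as f:
    f.write(resp.content)   # one harmonised file, regardless of sources
```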

Within the context of Blue-Cloud2026, Beacon is deployed to provide access to harmonised subsets from Blue Data Infrastructures for the WorkBenches (WBs) that aim to generate harmonised and validated data collections of Essential Ocean Variables (EOVs). To this end, a set of monolithic Beacon nodes has been set up for relevant data collections such as the WOD, CMEMS CORA, Euro-Argo, and more. These are made available on the D4Science e-infrastructure as part of the Blue-Cloud VRE, giving access to all registered Blue-Cloud users.

Going one step further, the outputs from multiple monolithic Beacon instances are combined into one merged Beacon node for each WB. This merged node includes a structural mapping from each monolithic Beacon to the target Common Metadata Profile as defined by the WB teams. These mappings are used in Beacon queries to retrieve and load contents ‘as-is’ from the monolithic Beacon instances into the merged instances, giving a common structure for variables, units, values, quality flags, and common metadata profile fields. The structured metadata and data are supplemented by additional metadata as available for each of the monolithic Beacon instances.

This presentation will give an overview of the Blue-Cloud 2026 project and the development of the merged Beacon nodes, explaining how they can practically serve as data lakes for many VRE applications and how the approach is extendable to other domains. Using examples from the WBs, the reduction in the time and effort researchers spend collecting data is highlighted.

How to cite: Krijger, T., Thijsse, P., Schaap, D., Kooyman, R., and Weerheim, P.: Blue-Cloud2026 project - Deploying Beacon data lakes for harmonizing ocean data access for Virtual Research Environments, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-10490, https://doi.org/10.5194/egusphere-egu26-10490, 2026.

EGU26-10574 | Posters on site | ESSI2.5

Beacon data lakes for federated, high-performance access to marine data in the Blue-Cloud2026 ecosystem 

Paul Weerheim, Peter Thijsse, Tjerk Krijger, Robin Kooyman, and Dick Schaap

The Horizon Europe Blue-Cloud 2026 project evolved the pilot Blue-Cloud infrastructure into an ecosystem supporting FAIR and open data and analytical services. This ecosystem is envisioned as a data and analytical component for EDITO and can serve as a blueprint for thematic EOSC instances and Research Infrastructures. Within this context, a concrete plan was developed for high-performance data subsetting capabilities across the Blue-Cloud Virtual Research Environment (VRE), enabling researchers and WorkBench developers to access harmonised and validated Essential Ocean Variables (EOVs) from heterogeneous sources.

To implement this, the project adopted the fully open-source (AGPLv3) Beacon technology developed by MARIS as the core software for deploying data lakes across the VRE. Beacon provides very fast and easy access to data subsets from large multidisciplinary collections, returning a single harmonised output file regardless of the source formats. Eight monolithic Beacon instances were deployed for major Blue Data Infrastructure (BDI) collections including the World Ocean Database, ERA5, Copernicus Marine CORA, Euro-Argo, and SeaDataNet. All instances were integrated with the D4Science federated AAI and complemented by dedicated Jupyter notebooks to support reproducible workflows.

Based on extensive testing with the WorkBench teams, two integrated Beacon instances have been developed, combining data from multiple monolithic nodes through Beacon's federation capabilities. A common metadata profile was set up in collaboration with the WorkBenches to support semantic harmonisation across different data sources, using the NERC Vocabulary Service, semantic tools, and unit conversions. These merged nodes demonstrate cross-infrastructure data integration, representing a significant step toward a European-scale federated data ecosystem.

This presentation will demonstrate how Beacon enables integrated workflows across infrastructures, significantly reducing effort for both data providers and researchers. While widely used in Blue-Cloud, Beacon’s design is domain-agnostic, with ongoing applications in other European and national initiatives, illustrating its potential as an innovative data lake tool for federating infrastructures.

How to cite: Weerheim, P., Thijsse, P., Krijger, T., Kooyman, R., and Schaap, D.: Beacon data lakes for federated, high-performance access to marine data in the Blue-Cloud2026 ecosystem, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-10574, https://doi.org/10.5194/egusphere-egu26-10574, 2026.

Research Infrastructures (RIs) are at the core of data-intensive and computation-driven science, yet they face growing challenges in managing complexity, scalability, interoperability, and the effective integration of Artificial Intelligence and Digital Twin technologies. This contribution presents two complementary EU projects, both led by the EGI Foundation, that address these challenges from the perspective of RI needs: interTwin, which has delivered a prototype of a Digital Twin Engine (DTE) for science, and RI-SCALE, which is developing the next generation of scalable data exploitation capabilities for RIs.

As a first example, the recently completed interTwin project demonstrated how RIs can collaborate to co-design a common blueprint architecture and an open-source Digital Twin Engine supporting the integration of models, simulations, data streams, and AI components. The project worked closely with scientific communities and infrastructures to co-design interoperable components for orchestration, provenance, quality assessment, and federated access to compute and data resources. Through multiple scientific use cases, interTwin showed how RIs can improve reproducibility, automation, and cross-domain reuse of methods and services.

As a second, forward-looking example, the recently launched RI-SCALE project focuses on empowering Research Infrastructures with scalable, AI-driven Data Exploitation Platforms (DEPs). RI-SCALE aims to support RIs in transforming vast and heterogeneous data holdings into actionable scientific knowledge by combining advanced AI frameworks, federated computing, and trusted data lifecycle management. The project places strong emphasis on co-design with RI operators and user communities, ensuring that DEPs respond to concrete operational and scientific requirements. Planned developments include mechanisms for data transfer and caching, AI model hub integration, data spaces integration, and the establishment of a competence center to support adoption, training, and long-term sustainability within the RIs.

The experiences and plans discussed in this contribution highlight key success factors for the digital transformation of RIs.

How to cite: Manzi, A. and Tenhunen, V.: Advanced Platforms for Research Infrastructures: Lessons from interTwin and perspectives from RI-SCALE, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-11024, https://doi.org/10.5194/egusphere-egu26-11024, 2026.

EGU26-11078 | Posters on site | ESSI2.5

A Maturity Model for Facilitating Virtual Lab's Co-Development 

Koen Greuell, Geerten Hengeveld, Spiros Koulouzis, Gabriel Pelouze, Quan Pan, and Zhiming Zhao

Modern research increasingly relies on complex data workflows, digital twins, and AI-driven models. While use-case-specific virtual labs within virtual research environments (VREs) facilitate making these computing-centric techniques FAIR (Findable, Accessible, Interoperable, and Reusable), the transition from a technical demonstrator to a sustainable, community-wide service remains challenging. Development often stalls due to misaligned incentives for tool maintenance and coordination gaps between domain scientists and software engineers.

To overcome these coordination challenges, we propose a Virtual Lab Maturity Model designed to guide the co-development process. This model provides a structured framework to assess and evolve virtual labs through defined technical and functional milestones. By identifying gaps in research asset integration early, the model ensures that scientific workflows remain technically sustainable and reproducible.

We demonstrate the application of this framework within the Notebook-as-a-Virtual-Research-Environment (NaaVRE). The framework is currently deployed across ecology-focused virtual labs, co-developed by domain specialists, scientific software engineers, and the DevOps engineers at LifeWatch ERIC. One application is the LTER-LIFE project, where the maturity model steers the development of digital twins for Dutch aquatic and terrestrial ecosystems. These virtual labs facilitate collaborative research; for example, a dedicated lab integrating the LANDIS-II forest landscape model enables researchers to configure and adapt simulations for site-specific scenarios.

The Virtual Lab Maturity Model facilitates a common language across disciplines and ensures alignment with FAIR principles. This systematic approach allows for the evolution of virtual labs from initial prototypes into collaborative platforms capable of supporting large-scale research. By formalizing the path to maturity, we provide a scalable roadmap for building digital infrastructure in the environmental sciences.

How to cite: Greuell, K., Hengeveld, G., Koulouzis, S., Pelouze, G., Pan, Q., and Zhao, Z.: A Maturity Model for Facilitating Virtual Lab's Co-Development, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-11078, https://doi.org/10.5194/egusphere-egu26-11078, 2026.

EGU26-11246 | ECS | Orals | ESSI2.5

ACTRIS Virtual Research Environment – Examples of use and collaboration within ACTRIS-Norway and the ACTRIS Data Centre  

Lise Eder Murberg, Cathrine Lund Myhre, Markus Fiebig, Nikolaos Evangeliou, Michael Schulz, Camilla Weum Stjern, Claudio Dema, and Simo Tukiainen

The Aerosol, Clouds and Trace Gases Research Infrastructure (ACTRIS) is the European Research Infrastructure Consortium (ERIC) dedicated to short-lived atmospheric constituents and clouds, supporting fundamental research and excellence in Earth system observation. ACTRIS produces high-quality, integrated long-term datasets in the field of atmospheric sciences and provides services tailored for scientific and technological use, including access to instrumented observational platforms. To enhance the availability, usability, and scientific exploitation of these datasets across disciplines and user communities, the ACTRIS Data Centre (DC) develops a range of user-oriented services, among which the ACTRIS Virtual Research Environment (VRE) plays a central role. 

The ACTRIS VRE enables efficient discovery, access, and scientific analysis of long-term observational data from ACTRIS National Facilities as well as other ground-based observational networks such as EMEP, EARLINET, Cloudnet, and GAW. It facilitates analyses such as the calculation of climatologies, long-term trend assessments, and the combination of datasets within the ACTRIS domain. The VRE is developed in collaboration between the ACTRIS DC and the ACTRIS-Norway community and is designed to serve both data producers and data users, ranging from infrastructure operators to researchers and students, across a wide range of atmospheric research applications.

This presentation demonstrates the use of the ACTRIS VRE through selected notebook-based examples of higher-level data analysis and highlights the collaborative scientific efforts underlying its development. Data access within the VRE is based on the ACTRIS metadata REST API. ACTRIS datasets are provided in CF-compliant NetCDF format and are accessible through both streaming services (OPeNDAP) and direct HTTPS download. This approach enables flexible, reproducible, and programmatic data use, supporting interoperability with commonly used analysis tools and workflows. 
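As a minimal sketch of this access pattern, the snippet below opens a dataset over OPeNDAP with xarray; the dataset URL and variable name are placeholders, obtained in practice from the ACTRIS metadata REST API.

```python
# Streaming access to a CF-compliant dataset via OPeNDAP with xarray.
# URL and variable name are placeholders; real values come from the
# ACTRIS metadata REST API.
import xarray as xr

url = "https://thredds.example.org/dods/actris/ebc_station_2020.nc"  # placeholder
ds = xr.open_dataset(url)   # no full download; data are streamed on demand

# Subset in time and compute a monthly climatology of one variable.
ebc = ds["equivalent_black_carbon"].sel(time=slice("2015", "2020"))
monthly = ebc.groupby("time.month").mean()
print(monthly.values)
```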

In collaboration with the ACTRIS-Norway community, the VRE includes several examples combining datasets for long time series analysis, the exploration of climatologies, and the investigation of trends. Selected examples are presented and discussed, with particular focus on the combination of FLEXPART footprint products and black carbon source apportionment data, developed within the EU project ATMO-ACCESS, together with observed equivalent black carbon measurements at several ACTRIS National Facilities. Additional higher-level analysis examples include single scattering albedo (SSA), ultrafine particle number concentrations (UFPs), and PM₁ source-related metrics from wood burning and traffic. These examples highlight how ACTRIS data can be applied to both climate-relevant and air-quality-focused research questions. 

Beyond scientific analysis, the ACTRIS VRE also serves as a platform for education and capacity building. Introductory notebooks demonstrate programmatic access to data and metadata and illustrate best practices for scientific analysis. The VRE has been used in ACTRIS training courses, ACTRIS Week, ITINERIS training workshops, and dedicated events at NILU, including collaborations with EUMETSAT, highlighting its role as a reusable training and demonstration environment. Community contributions to the example library are encouraged through an open GitHub repository, fostering collaborative development and reuse. The ACTRIS Virtual Research Environment is openly accessible at https://data.actris.eu/vre.

How to cite: Murberg, L. E., Myhre, C. L., Fiebig, M., Evangeliou, N., Schulz, M., Stjern, C. W., Dema, C., and Tukiainen, S.: ACTRIS Virtual Research Environment – Examples of use and collaboration within ACTRIS-Norway and the ACTRIS Data Centre , EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-11246, https://doi.org/10.5194/egusphere-egu26-11246, 2026.

EGU26-11751 | Posters on site | ESSI2.5

ESA CCI Permafrost time series maps as Essential Climate Variable (ECV) products primarily derived from satellite measurements and visualized within the AWI O2A (Observation to Analysis and Archive) framework 

Antonie Haas, Birgit Heim, Annett Bartsch, Andreas Walter, Roland Koppe, Peter Konopatzky, Mareike Wieczorek, Guido Grosse, Tazio Strozzi, Sebastian Westermann, Frank Martin Seifert, and Sonja Hänzelmann

We demonstrate the latest version of the visualization of permafrost-related map products in the context of the ESA CCI+ Permafrost initiatives (Phases I and II, 2018-2021 and 2023-2026). Already in the ESA DUE GlobPermafrost project (2016-2018), a comprehensive range of remote sensing products was produced by the project consortium and visualized in a viewer within the AWI O2A (Observation to Analysis & Archive) infrastructure framework: north-south transects in the northern hemisphere with trends in Landsat multispectral indices (e.g. Tasseled Cap Brightness, Greenness and Wetness, and the Normalized Difference Vegetation Index (NDVI)), Arctic land cover (e.g. shrub height and vegetation composition), lake ice grounding, InSAR-based land surface deformation, and rock glacier velocities. The main products were the global Permafrost Essential Climate Variables (ECVs), derived from a spatially distributed permafrost model driven by Land Surface Temperature and Snow Water Equivalent products. These Permafrost ECVs include mean annual ground temperature (MAGT) and active layer thickness (ALT) at pixel level, and additionally permafrost extent and probability (PFR).

In the ESA CCI+ Permafrost project, time was incorporated into the products as a key climate-related dimension, resulting in a time series spanning more than twenty years. It comprises CCI+ Permafrost circum-Arctic model output for MAGT, from the surface down to a depth of 10 meters, as well as PFR and ALT. All data products are available at yearly resolution, together with averages of MAGT, PFR, and ALT calculated over the time series.

To make the products publicly visible, we created WebGIS projects within the O2A (Observation to Analysis and Archive) data workflow framework at AWI. This modular, scalable, and highly automated spatial data infrastructure (SDI) has been developed and operated at AWI for over a decade; it has undergone continuous improvement and provides map services for geographic information system (GIS) clients and portals. The FAIR principles were implemented to address the increasing demand for research data and metadata that are discoverable, accessible, and reusable. The ESA Permafrost WebGIS products were designed using GIS software and published as Web Map Services (WMS), an internationally standardised Open Geospatial Consortium (OGC) format, using GIS server technology. Additionally, visualisations of raster and vector data products have been developed that are specific to the projects and adapted to their spatial scales and resolutions.
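A minimal sketch of consuming such a WMS layer with the open-source OWSLib client follows; the service URL and layer name are placeholders, with real endpoints discoverable through the AWI portals.

```python
# Minimal sketch: query a permafrost WMS with OWSLib and fetch one map
# image. Service URL and layer name are placeholders.
from owslib.wms import WebMapService

wms = WebMapService("https://maps.example.awi.de/wms", version="1.3.0")
print(list(wms.contents))            # available layers, e.g. MAGT, ALT, PFR

img = wms.getmap(
    layers=["permafrost_magt_mean"],            # hypothetical layer name
    srs="EPSG:4326",
    bbox=(-180, 45, 180, 90),                   # circum-Arctic extent
    size=(1200, 400),
    format="image/png",
    transparent=True,
)
with open("magt.png", "wb") as f:
    f.write(img.read())
```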

In addition to data products derived from remote sensing, the locations of WMO GCOS ground-monitoring networks belonging to the permafrost community, managed by the International Permafrost Association (IPA) and forming part of the Global Terrestrial Network for Permafrost (GTN-P), were incorporated as a feature layer that is updated on an ongoing basis. All data products have been registered with Digital Object Identifiers (DOIs) and published in the PANGAEA or ESA CEDA data archives.

How to cite: Haas, A., Heim, B., Bartsch, A., Walter, A., Koppe, R., Konopatzky, P., Wieczorek, M., Grosse, G., Strozzi, T., Westermann, S., Seifert, F. M., and Hänzelmann, S.: ESA CCI Permafrost time series maps as Essential Climate Variable (ECV) products primarily derived from satellite measurements and visualized within the AWI O2A (Observation to Analysis and Archive) framework, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-11751, https://doi.org/10.5194/egusphere-egu26-11751, 2026.

EGU26-12357 | ECS | Orals | ESSI2.5

Using artificial intelligence to automate and expedite the harmonization of environmental data 

Tyler Karns, Cedric Hagen, Krutika Deshpande, Michael SanClements, Christine Laney, Benjamin Ruddell, Henry Loescher, and Tyson Swetnam

Data harmonization, the process of unifying disparate datasets into compatible formats and comparable units, is critical for global environmental research but remains prohibitively time-consuming and expensive. While many global environmental datasets could be assembled from existing available data, potentially offering transformative insight into pressing environmental issues, the exhaustive effort required to harmonize data is currently unfeasible within most scientific funding cycles. For example, cross-network studies (such as those between the U.S. National Ecological Observatory Network (NEON), the European Integrated Carbon Observation System (ICOS), and the Australian Terrestrial Ecosystem Research Network (TERN)) require weeks to years of manual schema mapping, unit conversion, alignment, and quality flag standardization for even a small number of data products, with further effort needed before any analyses can begin. Here, we present a large language model (LLM)-based agentic system designed to automate many of these data harmonization steps by leveraging semantic understanding of scientific metadata and documentation. This system is designed to ingest raw datasets and metadata, interpret variable semantics within scientific contexts, and generate tailored transformation pipelines. We tune this approach using a subset of previously manually harmonized environmental data from NEON, ICOS, and TERN, as well as the South African Environmental Observation Network (SAEON) and the Integrated European Long-Term Ecosystem, Critical Zone and Socio-Ecological Research Infrastructure (eLTER), as part of an effort by the Global Ecosystem Research Infrastructure (GERI) to build globally harmonized ecological drought datasets. Using these harmonized ecological drought datasets from across the globe, we test the efficacy of this LLM-based agentic system, measuring accuracy, time and labor efficiency, and data integrity preservation as compared to manual data harmonization workflows. Pressing global environmental challenges require rapid synthesis of global environmental data. By reducing data harmonization time from months to hours, these artificial intelligence (AI) tools will enable scientists to focus on analysis and modeling rather than data wrangling, ultimately accelerating research in these critical areas of global environmental science.
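As a toy illustration of the kind of transformation pipeline such an agent might emit after reading two networks' metadata, the sketch below aligns variable names and converts units; all column names and conversion factors are invented for illustration and do not come from the study.

```python
# Illustrative, agent-style transformation: align source columns to a
# common schema and convert units. Names and factors are hypothetical.
import pandas as pd

# Agent-derived mapping: source column -> (target column, unit factor).
# E.g. soil moisture in percent -> volumetric fraction.
MAPPING = {
    "SWC_percent": ("soil_water_content_frac", 0.01),
    "P_mm_day":    ("precipitation_mm_day", 1.0),
}

def to_common_schema(df: pd.DataFrame) -> pd.DataFrame:
    out = pd.DataFrame(index=df.index)
    for src, (dst, factor) in MAPPING.items():
        if src in df.columns:
            out[dst] = df[src] * factor
    return out

site_a = pd.DataFrame({"SWC_percent": [23.0, 25.5], "P_mm_day": [0.0, 4.2]})
print(to_common_schema(site_a))
```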

How to cite: Karns, T., Hagen, C., Deshpande, K., SanClements, M., Laney, C., Ruddell, B., Loescher, H., and Swetnam, T.: Using artificial intelligence to automate and expedite the harmonization of environmental data, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-12357, https://doi.org/10.5194/egusphere-egu26-12357, 2026.

EGU26-12890 | Posters on site | ESSI2.5

Identifying the Essentials: Distributed Resilience for Safeguarding Scientific Data in Times of Uncertainty 

Robert Huber, Kerstin Lehnert, and Jens Klump

The long-term sustainability of data repositories in the Earth and environmental sciences is increasingly influenced by evolving institutional priorities, fluctuating funding, and shifting governance frameworks. Recent events have highlighted the vulnerabilities in the continued accessibility of major US-based climate and environmental datasets, particularly in the context of political shifts, underscoring the fragility of even well-established infrastructures. Against this backdrop, we propose a multi-level, network-oriented model for strengthening the resilience of Earth and environmental data infrastructures. This model, in addition to enhancing the self-healing capabilities of individual repositories, aims to establish a common framework for cooperative stewardship.

Until recently, frameworks such as CoreTrustSeal and the Repository Crisis Scorecards developed by the ESIP Repositories Resilience Project have focused on risk assessment, mitigation, and to some extent resilience, but less on recovery. In our approach, resilience is conceptualised as the coordinated interaction of several layers that collectively enhance the “rescuability” of essential scientific data. It includes networks of mutual support, in which repositories proactively coordinate to prepare for and respond to operational crises, sharing responsibilities to reduce the risk of isolated failures; harmonized technologies, standards, common protocols, and training, enabling efficient creation of rescue-ready data packages; a structured validation and stress-testing framework to assess vulnerabilities using transparent, scenario-based criteria; and a contingency layer providing shared resources, such as temporary storage or hosting, deliberately reserved to support other repositories and enabling distributed, peer-to-peer replication workflows that allow data to remain accessible even when local systems cannot operate fully. A further component of this approach is the prioritisation of critical or at-risk datasets, ensuring that limited rescue capacity is directed toward collections whose loss would most severely affect research continuity and societal monitoring needs.

We illustrate this approach with examples from existing Earth and environmental science repositories, and argue that even small and mid-sized infrastructures can benefit from strategies that preserve core data and metadata, even if complete restoration of complex interfaces or ingestion pipelines might be impractical. Given the heterogeneity, scale, and long-term relevance of environmental data, developing tiered, distributed resilience strategies is essential for maintaining scientific continuity in an era of increasing systemic uncertainty.

How to cite: Huber, R., Lehnert, K., and Klump, J.: Identifying the Essentials: Distributed Resilience for Safeguarding Scientific Data in Times of Uncertainty, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-12890, https://doi.org/10.5194/egusphere-egu26-12890, 2026.

EGU26-13138 | ECS | Orals | ESSI2.5

A Conversational Assistant for Geoscientists in Virtual Research Environments 

Biagio Peccerillo, Alfredo Oliviero, Marco Procaccini, Leonardo Candela, Luca Frosini, Francesco Mangiacrapa, Giancarlo Panichi, Massimiliano Assante, and Pasquale Pagano

D4Science provides web-based Virtual Research Environments (VREs) that support FAIR, open, and reproducible science across multiple research domains, including Earth science. These environments integrate data access, computation, and collaboration services, offering powerful capabilities to researchers and enabling complex, data-intensive scientific activities within a shared digital infrastructure.

This contribution introduces a conversational intelligent assistant integrated into D4Science VREs, designed to support Earth scientists in their research activity. The assistant provides a natural language interface that helps users interact with D4Science VREs' services, locate relevant datasets and research items, obtain guidance on common tasks, and support exploratory and operational activities within the VRE.

The assistant is designed with a modular approach. The user interacts with a coordinator agent that orchestrates a multi-agent system, where specialized AI agents collaborate to perform a variety of tasks. This architecture allows the assistant to handle heterogeneous requests and to support users across different phases of their research activities, while also facilitating maintenance and extensibility.

The conversational agent adopts a Retrieval-Augmented Generation (RAG) approach that leverages the knowledge already captured by the VRE through its regular use by research communities. As VREs naturally accumulate knowledge created and curated by researchers over time, the assistant's knowledge base evolves by incorporating new information. This way, the assistant can ground its responses in domain-specific and up-to-date information, effectively acting as a domain-aware expert embedded within the research environment.
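The following is a bare-bones, illustrative sketch of the retrieval step in such a RAG pipeline; the embed() function is a random stand-in for any real embedding model, and the documents are placeholders, so this shows only the mechanics of similarity-based retrieval and prompt grounding.

```python
# Skeleton of RAG retrieval: embed the question, rank curated VRE
# documents by cosine similarity, prepend the top hits to the prompt.
import numpy as np

def embed(text: str) -> np.ndarray:
    """Placeholder for a real embedding model (e.g. a sentence encoder)."""
    rng = np.random.default_rng(abs(hash(text)) % (2**32))
    return rng.standard_normal(384)

docs = ["How to publish a dataset in the VRE ...",
        "Using the workspace to share notebooks ..."]
doc_vecs = np.stack([embed(d) for d in docs])

def retrieve(question: str, k: int = 1) -> list[str]:
    q = embed(question)
    sims = doc_vecs @ q / (np.linalg.norm(doc_vecs, axis=1) * np.linalg.norm(q))
    return [docs[i] for i in np.argsort(sims)[::-1][:k]]

question = "How do I share a notebook?"
context = "\n".join(retrieve(question))
prompt = f"Answer using this context:\n{context}\n\nQuestion: {question}"
```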

By serving as an accessible entry point to the VRE, the assistant complements existing interfaces without altering established workflows. The presentation discusses the motivation, design choices, and integration strategy. It also presents various concrete use cases relevant to Earth scientists, demonstrating how the conversational assistant can be effectively employed to support their research activity.

How to cite: Peccerillo, B., Oliviero, A., Procaccini, M., Candela, L., Frosini, L., Mangiacrapa, F., Panichi, G., Assante, M., and Pagano, P.: A Conversational Assistant for Geoscientists in Virtual Research Environments, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-13138, https://doi.org/10.5194/egusphere-egu26-13138, 2026.

EGU26-14822 | Posters on site | ESSI2.5

Supporting the Needs of Climate Change Risks Assessment Using Data from Several Research Infrastructures 

Christian Pagé, Dick M.A. Schaap, Niku Kivekäs, Tjerk Krijger, Zoé Garcia, Roel Vermeulen, Zimbo Boudewijns, Harry Vereecken, Christian Poppe Terán, Pablo Serret Ituarte, Karel Klem, Lucia Mona, Päivi Haapanala, and Janne Rinne

Effective adaptation to climate change requires a comprehensive understanding of climate-related risks, including their underlying drivers—hazards, exposure, and vulnerability—and their impacts on human, economic, and natural systems. The Integrated Research Infrastructure Services for Climate Change Risks (IRISCC) project brings together a consortium of leading and complementary research infrastructures spanning natural and social sciences and covering a wide range of domains and sectors. IRISCC integrates these capabilities through Service Design Labs, which apply co-design and transdisciplinary approaches, and through Service Demonstrators that benchmark and validate cross-infrastructure services.

The IRISCC Demonstrators are pilot projects designed to showcase the added value of combining data, tools, and expertise from multiple research infrastructures to create new services that are beyond the capacity of a single infrastructure to provide. By connecting existing environmental research infrastructures with the growing demand for actionable climate-risk knowledge, IRISCC aims to accelerate the development of integrated solutions for climate change risk assessment.

This presentation will illustrate how future climate data are being incorporated across all six Demonstrators, and how these datasets are combined with other research infrastructure resources to assess climate-related risks. Finally, we will introduce the Transnational and Virtual Access opportunities offered through IRISCC access calls, highlighting how researchers and stakeholders can access Europe’s climate-risk research facilities and services to engage with the IRISCC community.

This work was supported by the IRISCC project. IRISCC is funded by the European Union (Horizon Europe) under grant agreement No 101131261.

How to cite: Pagé, C., M.A. Schaap, D., Kivekäs, N., Krijger, T., Garcia, Z., Vermeulen, R., Boudewijns, Z., Vereecken, H., Poppe Terán, C., Serret Ituarte, P., Klem, K., Mona, L., Haapanala, P., and Rinne, J.: Supporting the Needs of Climate Change Risks Assessment Using Data from Several Research Infrastructures, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-14822, https://doi.org/10.5194/egusphere-egu26-14822, 2026.

EGU26-14863 | Orals | ESSI2.5

Breaking Data Siloes: How the NSF NCAR Geoscience Data Exchange Powers Collaboration 

Douglas Schuster, Harsha Hampapura, Riley Conroy, and Brian Bockelman

Data-intensive research continues to drive innovation and discovery across Earth system science (ESS). ESS datasets maintained in discipline-specific repositories, including climate model projections, historical reanalysis products, and observational datasets, provide rich resources to support these initiatives. While significant progress has been made through the hosting of datasets by commercial cloud providers, many of these data resources, sometimes stored in non-standard formats, are primarily maintained in unconnected, domain-focused data systems designed around the legacy “download, clean, and analyze” model. This is a time-consuming process with bandwidth and storage requirements that may be prohibitive, particularly for institutions with limited resources. Together, this access model and the use of non-standard formats create a barrier to realizing the full research potential of ESS data assets.


This presentation will highlight the National Science Foundation National Center for Atmospheric Research (NSF NCAR) efforts to develop and deploy its Geoscience Data Exchange Research Data Commons (GDEX, https://gdex.ucar.edu). GDEX is designed to overcome the challenges described above by: 1) curating standards-based (FAIR), analysis-ready and AI-optimized (AR/AI) versions of global and regional atmospheric reanalysis outputs, Earth system simulation outputs, and observations produced at NSF NCAR and partner organizations; 2) providing direct access to these datasets through its integration with on-premise computational resources; and 3) providing performant distributed access through its integration with the Open Science Data Federation (OSDF, https://osg-htc.org/services/osdf). The OSDF supports streaming data access and integration with a variety of data and compute services through its system of geographically distributed data caches, including commercial-cloud-hosted open datasets. GDEX's integration with OSDF supports a wider variety of cross-domain research use cases by enabling efficient access to the spectrum of datasets hosted through OSDF's origin access points. Finally, GDEX is integrated with colocated data analytics services to support rapid development and iteration of data science (e.g., AI/ML) workflows and to facilitate open sharing of those workflows. To promote user adoption of these services, an example set of reference data analysis workflows has been seeded in public collaboration software repositories and documented in JupyterBook-style web pages. GDEX users are encouraged to submit their own workflow examples through this resource, amplifying the impact of their science by allowing others to more easily build upon their work.
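A minimal sketch of the streaming access pattern this enables follows; the Zarr store URL, variable name, and coordinate names are placeholders standing in for any analysis-ready dataset exposed over HTTPS.

```python
# Streaming access pattern: open an analysis-ready Zarr store over HTTPS
# and subset lazily instead of downloading whole files. URL and names
# are placeholders.
import fsspec
import xarray as xr

store = fsspec.get_mapper("https://data.example-osdf.org/reanalysis.zarr")
ds = xr.open_zarr(store, consolidated=True)   # lazy; chunks read on demand

# Only the requested chunks travel over the network.
t2m = ds["t2m"].sel(time="2020-07", latitude=slice(60, 30))
print(float(t2m.mean()))
```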

How to cite: Schuster, D., Hampapura, H., Conroy, R., and Bockelman, B.: Breaking Data Siloes: How the NSF NCAR Geoscience Data Exchange Powers Collaboration, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-14863, https://doi.org/10.5194/egusphere-egu26-14863, 2026.

EGU26-16239 | Orals | ESSI2.5

A collaborative international approach to data preservation and sustainability for the solid earth sciences.  

Tim Rawling, Lilli Freda, Rebecca Bendick, Elisabetta D’Anastasio, Helen Glaves, Rebecca Farrington, Federica Tanlongo, and Shelley Stall

Periods of natural disaster, political instability, and systemic disruption pose acute risks to the preservation, integrity, and accessibility of Earth science data. As research infrastructures and the datasets they curate become increasingly digital, interconnected, and critical to informed decision-making, safeguarding data against loss, politicisation, and fragmentation has emerged as a shared global responsibility. Here we will outline how Global Research Infrastructures (GRIs) can contribute to a coordinated international response to data preservation during times of crisis. We will draw on the work currently being done in an international collaboration between four national Earth science e-Infrastructures: AuScope (Australia), EPOS (European Plate Observing System), EarthScope (USA), and Earth Science New Zealand.

AuScope occupies a distinctive position within the global ecosystem of Earth science research infrastructures. As a nationally funded yet internationally connected infrastructure, AuScope combines trusted governance, mature data services, and a strong culture of open science across geophysics, geodesy, geochemistry, and geohazards. Through formal and informal partnerships with EPOS, EarthScope, and Earth Science New Zealand, AuScope is well placed to act as a node for resilient, distributed Earth and environmental science data stewardship.

We will discuss how GRIs could collaboratively support: (1) distributed and redundant preservation of high-value Earth science datasets across jurisdictions; (2) continuity of standards, metadata, and persistent identifiers to ensure long-term usability of data even when originating institutions are disrupted; and (3) trusted custodianship arrangements that protect data integrity and provenance from external interference, institutional failure, hostile cyberattacks, or natural disasters. Such a networked approach will reduce single-point-of-failure risks and strengthen the resilience of the global Earth science data ecosystem.

AuScope’s local contribution currently includes providing geographically distinct replication capacity, harmonised metadata and FAIR-aligned services, and operational expertise in federated data platforms. Working with EPOS and EarthScope’s established thematic and domain services, and with Earth Science New Zealand’s regional leadership in hazard-focused data, this partnership can enable rapid “data rescue” responses, temporary custodianship during crises, and sustained access for displaced or affected research communities.

This collaboration demonstrates how globally networked research infrastructures can move beyond coordination to active mutual support in times of crisis. By leveraging complementary capabilities, shared standards, and trusted governance, a GRI for solid earth sciences can help ensure that critical Earth science data remain preserved, accessible, and scientifically reliable—regardless of natural, global or institutional instability—thereby supporting evidence-based decision-making and long-term societal resilience.

How to cite: Rawling, T., Freda, L., Bendick, R., D’Anastasio, E., Glaves, H., Farrington, R., Tanlongo, F., and Stall, S.: A collaborative international approach to data preservation and sustainability for the solid earth sciences. , EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-16239, https://doi.org/10.5194/egusphere-egu26-16239, 2026.

EGU26-18506 | ECS | Posters on site | ESSI2.5

Cross-Domain Virtual Laboratories on Blue-Cloud 2026: Shared Technologies and Platform Lessons Learned 

Cyrielle Delvenne, João Vitorino, Vânia Lima, Alexander Barth, Abel Dechenne, Steven Pint, Francesco Palermo, and Julien Barde

Blue-Cloud 2026 delivers a transversal suite of Virtual Laboratories (VLabs) implemented on a common Virtual Research Environment (VRE) in D4Science to enable end-to-end, FAIR-aligned marine data science across disciplines: from coastal physics and extremes to biogeochemistry, indicators, and fisheries. Rather than isolated demonstrators, the VLabs share reproducible platform patterns that standardize how users discover data, subset and authenticate to external providers, run analytics, and publish reproducible output. 

Across VLabs, a common interaction model combines (i) gateway-based identity and access, (ii) curated “VRE folders” that distribute ready-to-run notebooks and resources, and (iii) interactive web dashboards for parameter selection (space/time/variables), pre-flight checks (coverage and overlap), and guided execution. A shared technical backbone supports transparent data acquisition and processing: connection to federated repositories and services (e.g., Copernicus Marine, EMODnet, thematic nodes), automated subsetting, and workflow steps that harmonize formats, apply quality control, manage gaps, and generate analysis-ready datasets. Several VLabs implement the same methodological “building blocks” in different contexts: variational mapping/interpolation (DIVAnd) for gridded fields, model–observation fusion, and standardized production of map/time-series outputs (NetCDF plus figures/HTML).

A second cross-cutting layer is scalable computation. While notebooks remain central for transparency and education, compute-heavy workflows increasingly migrate to shared cloud analytics services (e.g. the CCP Analytics Engine), which include delegation of compute-intensive routines to optimized backend implementations, in order to (a) reduce local data dependencies through remote subsetting, (b) reuse cached intermediate products, (c) support larger spatio-temporal domains, and (d) generate interactive deliverables (e.g., Plotly dashboards) alongside archival outputs. This pattern is exemplified by indicator services (MHW, OHC, TRIX, SSIv2) but is transferable to other VLabs with large datasets or reproducible executions.
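As a small illustration of this deliverable pattern, an indicator time series can be exported as a self-contained interactive HTML figure; the data below are synthetic placeholders, not an actual Blue-Cloud indicator.

```python
# Tiny illustration of the "interactive deliverable" pattern: turn a
# computed indicator time series into a standalone HTML figure.
# Data here are synthetic placeholders.
import pandas as pd
import plotly.express as px

idx = pd.date_range("2020-01-01", periods=365, freq="D")
df = pd.DataFrame({"time": idx,
                   "sst_anomaly": (idx.dayofyear / 365.0) - 0.5})

fig = px.line(df, x="time", y="sst_anomaly",
              title="Illustrative SST anomaly indicator")
fig.write_html("indicator.html")   # shareable, archival output
```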

Transversal lessons learned include: (1) interoperability hinges on early harmonization (units, grids, metadata, vocabularies) and “best-practice” preprocessing embedded in the VRE; (2) user trust improves when workflows expose logs, provenance, and configuration exports for audit and reproducibility; (3) robust operations require resilience to upstream outages, authentication variability, and evolving toolchains; and (4) modular design (shared UI patterns, reproducible processing kernels, and standardized outputs) accelerates expansion to new regions, variables, and communities. Collectively, the Blue-Cloud 2026 VLabs demonstrate how a unified VRE can operationalize cross-domain marine analytics, translating distributed infrastructures into consistent user experiences and reproducible digital workflows.

How to cite: Delvenne, C., Vitorino, J., Lima, V., Barth, A., Dechenne, A., Pint, S., Palermo, F., and Barde, J.: Cross-Domain Virtual Laboratories on Blue-Cloud 2026: Shared Technologies and Platform Lessons Learned, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-18506, https://doi.org/10.5194/egusphere-egu26-18506, 2026.

EGU26-18647 | Posters on site | ESSI2.5

Addressing the needs of Earth system field programs with a unified service catalog 

Vincent Douet, Marie Jossé, and Cécile Pertuisot

Data Terra, a French national research infrastructure for Earth system observations, brings together five disciplinary poles—FORMATER (solid earth), AERIS (atmosphere), ODATIS (ocean), THEIA (continental surfaces), and PNDB (biodiversity)—to provide access to diverse environmental data and associated services. Several of these hubs support field and observational programs, yet researchers often face uncertainty about the tools, services, and resources available to assist them before, during, and after their field program. This lack of visibility leads to fragmented practices and repetition of efforts across domains.

To address this challenge, we initiated an inter-pole collaboration to develop a catalog of services relevant to the needs of scientists for the entire life cycle of their field program. These services include software tools, applications, data management solutions, user support, technical assistance, and training resources. The goal is to provide a coherent and easily navigable overview of the resources offered by Data Terra and its collaborators, free from disciplinary boundaries.

This effort is built on a GeoNetwork-based catalog, expanded to accommodate cross-domain field program needs. A new dedicated thesaurus has been created to classify the diverse resources, ensure consistent tagging, and make resources easier to search and find. By structuring needs and services in a shared semantic framework, we aim to enhance discoverability, foster interoperability between poles, and better support the scientific communities conducting environmental field programs.

How to cite: Douet, V., Jossé, M., and Pertuisot, C.: Addressing the needs of Earth system field programs with a unified service catalog, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-18647, https://doi.org/10.5194/egusphere-egu26-18647, 2026.

EGU26-19457 | Posters on site | ESSI2.5

ICOOE, a Virtual Laboratory boosting the exploration and integration of coastal ocean observations along Europe 

João Vitorino, Vânia Lima, Juan Gabriel Fernández, Enrique Castrillo, Mélanie Juza, Nikos Zarokanellos, Jay Pearlman, René Garello, and Sigmund Kluckner

Coastal ocean areas are among the most complex and important marine regions of the world. Nowadays, a broad range of observations is collected in coastal ocean regions using many different systems, and a panoply of numerical models is used to hindcast and forecast these areas. Much of this data is available to users through data aggregators and service providers, such as EMODnet or the Copernicus Marine Service.

The full potential of free access to this vast data pool is, however, frequently missed due to difficulties users experience in handling the datasets, extracting the relevant information, and combining different datasets in an integrated analysis. The ICOOE (Integration of Coastal Ocean Observations along Europe) VLab was developed and opened to the community in the framework of the Blue-Cloud 2026 project (EU Horizon Europe) to support users in coping with these difficulties.

ICOOE offers three complementary Thematic Services, providing a number of FAIR-oriented tools and services that take full advantage of the Blue-Cloud Virtual Research Environment and of globally accepted Ocean Best Practices and standards to explore key areas of coastal ocean research and operational use. The “Transboundary Transport and Connectivity” Thematic Service focuses on the subinertial dynamics of coastal ocean areas. A dashboard environment allows users to specify the geographical domain, time period, and parameters of interest. The service identifies the datasets available for these choices, then downloads and preprocesses the datasets of interest. The user can then select a number of tools for exploration (e.g. basic statistics) or integration (e.g. pathways for transport) of the datasets. The pilot demonstrator of this Thematic Service accepts user domains located in the wider Iberian Margin area and focuses on surface currents provided by HF radars and numerical models.

The “Extreme Events” Thematic Service explores the impacts of extreme storm events on the coastal ocean environment. Based on a user interface similar to the one described above, this Thematic Service supports users in characterizing the conditions associated with three extreme storms that impacted European coastal ocean areas, particularly their effects on the bottom sedimentary cover and on structures installed offshore.

The “Ocean Glider” Thematic Service aims to demonstrate the added-value chain of glider missions, from data acquisition to advanced products and visualizations for improved coastal information, integrating ocean state and variability derived from repeated glider transects. Starting from input data provided by the user (raw Slocum glider data or an OG1.0-standard dataset), the service offers a processing toolbox (based on Python Jupyter notebooks) designed to generate interpolated profiles on a regular grid along the glider monitoring line, based on the vertical and horizontal resolution of the raw data. It includes vertical sections of key parameters such as potential temperature, practical salinity, potential density, and geostrophic velocity. Additionally, an Advanced Data Viewer is used for enhanced data exploration and visualization.
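
As a rough illustration of this gridding step, the sketch below interpolates synthetic scattered glider samples onto a regular distance–depth grid with SciPy; the resolutions, variable names, and data are assumptions for illustration, not the toolbox's actual implementation.

```python
import numpy as np
from scipy.interpolate import griddata

# Hypothetical scattered glider samples: along-track distance (km),
# depth (m), and potential temperature (degC) for each measurement.
rng = np.random.default_rng(42)
distance = rng.uniform(0.0, 50.0, 2000)
depth = rng.uniform(0.0, 500.0, 2000)
theta = 15.0 - 0.02 * depth + 0.05 * np.sin(distance / 5.0)

# Regular grid along the monitoring line; the 1 km x 5 m resolution
# is an illustrative assumption, not the service's configuration.
dist_grid, depth_grid = np.meshgrid(
    np.arange(0.0, 50.0, 1.0),
    np.arange(0.0, 500.0, 5.0),
)

# Linear interpolation of the scattered samples onto the regular grid.
theta_grid = griddata(
    (distance, depth), theta, (dist_grid, depth_grid), method="linear"
)
print(theta_grid.shape)  # (100, 50): depth levels x along-track bins
```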

This communication presents the core capabilities implemented in the three thematic services, providing use cases that illustrate how they can support coastal ocean users.

How to cite: Vitorino, J., Lima, V., Fernández, J. G., Castrillo, E., Juza, M., Zarokanellos, N., Pearlman, J., Garello, R., and Kluckner, S.: ICOOE, a Virtual Laboratory boosting the exploration and integration of coastal ocean observations along Europe, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-19457, https://doi.org/10.5194/egusphere-egu26-19457, 2026.

EGU26-19974 | Posters on site | ESSI2.5

The STAMPLATE-Schema as a unifying metadata language for FAIR and AI-ready environmental time-series data in the DataHub ecosystem 

Christof Lorenz, Nils Brinckmann, Claas Faber, Marc Hanisch, Roland Koppe, Ralf Kunkel, Ulrich Loup, Mihir Rambihia, Hylke van der Schaaf, and David Schäfer and the STAMPLATE-Team within the DataHub Initiative of the Research Field Earth and Environment

Environmental observations from sensor systems remain one of the most important sources of ground-truth data in Earth System Sciences. In particular, the rapid rise of AI-based methods, high-resolution modelling, and the growing demand for near-real-time reference data to support environmental decision-making have substantially increased the need for reliable, interoperable, and AI-ready observational data.

To enable seamless integration and effective use of such data across diverse application scenarios (especially when combining observations from multiple sources), consistent data structures, well-defined interfaces, and harmonised, machine-readable metadata are essential. These requirements represent both a technical and a community-driven challenge and form a key prerequisite for ensuring the AI-readiness of sensor data.

Within the Helmholtz Research Field Earth & Environment, the DataHub initiative addresses this challenge by developing a uniform and FAIR research data infrastructure for observational time-series data across all seven contributing German Helmholtz Centres. Central to this infrastructure is the STAMPLATE-Schema, a unified metadata schema for sensor-based observational data built on the OGC SensorThings API. The STAMPLATE-Schema serves as the semantic backbone of the DataHub ecosystem, providing a shared, machine-actionable language to describe deployments, sensors, and observations. It is built upon JSON-LD and schema.org, enabling semantic interoperability, extensibility, and direct compatibility with web technologies and AI workflows.
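
To illustrate the general idea of a JSON-LD/schema.org-based dataset description (this is not the actual STAMPLATE-Schema, whose vocabulary and structure are defined by the DataHub initiative; all names and values below are placeholders), a minimal Python sketch might look as follows:

```python
import json

# Minimal, illustrative JSON-LD description of a sensor dataset using
# schema.org terms; NOT the actual STAMPLATE-Schema.
dataset = {
    "@context": "https://schema.org/",
    "@type": "Dataset",
    "name": "Soil moisture time series, example station",  # placeholder
    "description": "10-minute soil moisture observations.",
    "variableMeasured": {
        "@type": "PropertyValue",
        "name": "soil_moisture",
        "unitText": "m3 m-3",
    },
    "temporalCoverage": "2024-01-01/2024-12-31",
    "spatialCoverage": {
        "@type": "Place",
        "geo": {"@type": "GeoCoordinates", "latitude": 50.9, "longitude": 13.5},
    },
}

# Machine-actionable metadata can be serialised and embedded in web
# pages or exchanged between services.
print(json.dumps(dataset, indent=2))
```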

The STAMPLATE-Schema connects and aligns the core ecosystem components, including the Sensor Management System (SMS) – which provides user-friendly management of sensor and deployment metadata – and the Earth Data Portal (EDP), which supports cataloguing, discovery, and visualisation of SensorThings API–based data. Additional integrations, such as the System for automated Quality Control (SaQC) and time-series handling via time.io, build on this shared metadata foundation and support typical observational data workflows including data flagging, quality assessment, and downstream processing.

The STAMPLATE-Schema and the associated federated SensorThings API–based data infrastructures are currently being implemented across several major German research centres and large-scale observational projects, including the TERENO-network with its multiple observatories. Together, they are expected to provide access to more than 20 billion observations from seven research centres spanning multiple environmental research domains, including terrestrial, atmospheric, and marine systems, by the end of the year.

The DataHub and the STAMPLATE-Schema thus provide a common metadata language and framework for FAIR and AI-ready sensor data across our research field and similar federated research data infrastructures.

How to cite: Lorenz, C., Brinckmann, N., Faber, C., Hanisch, M., Koppe, R., Kunkel, R., Loup, U., Rambihia, M., van der Schaaf, H., and Schäfer, D. and the STAMPLATE-Team within the DataHub Initiative of the Research Field Earth and Environment: The STAMPLATE-Schema as a unifying metadata language for FAIR and AI-ready environmental time-series data in the DataHub ecosystem, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-19974, https://doi.org/10.5194/egusphere-egu26-19974, 2026.

EGU26-21334 | ECS | Posters on site | ESSI2.5

Putting Science on the Map: Spatio-Temporal Metadata for Scientific Article Discovery 

Tom Niers and Daniel Nüst

To tackle the challenge of discovering scientific articles, researchers pursue several options: they conduct general web searches, explore generic academic databases like OpenAlex or Google Scholar, use a discipline-specific portal, task an AI agent, or consider personal recommendations, e.g., via social media. These options typically rely on one of the following approaches: search terms (titles, abstracts, keywords, or full text), citation chaining, editorial curation (e.g., topical journals), or known authors and affiliations. However, as the number of publications continues to rise, there is a need for additional methods that link scientific content in novel ways and help to find relevant works. An underused approach relies on the fact that almost all research has a spatial and temporal component, i.e. the “where” and “when” of a scientific article. How can one find a scientific article by exploiting its geographic context? Currently, if any spatio-temporal metadata can be found at all, it is likely to relate to the authors' affiliations or the date of publication rather than the actual content of the research. The latter information is hidden, for example as place names or coordinates, in the full text, supplementary materials, visualisations, or data, but it is not available as human- and machine-readable metadata.

In this work, we present novel tools that integrate spatio-temporal metadata into the scholarly publishing process: the geoMetadata plugin and OPTIMAP. The geoMetadata plugin (https://github.com/TIBHannover/geoMetadata) provides authors and journal managers with straightforward tools, such as an interactive map, to collect valid spatio-temporal article metadata during the submission process in the widely used scholarly publishing platform Open Journal Systems (OJS). The resulting metadata is published in a machine-readable format, and articles are made discoverable on maps after publication. Building on this, OPTIMAP (https://github.com/GeoinformationSystems/optimap) demonstrates how scientific articles from several journals can be found via a single map view and published through one open API.

To realise the potential of spatio-temporal metadata fully, a large amount of existing literature needs to be enriched with trustworthy spatio-temporal metadata. We sketch a new framework to support the enrichment of scientific articles both in the submission process and for already existing literature. First, various technologies will be evaluated: (i) Named Entity Recognition (NER) leveraging controlled gazetteers to extract place names and temporal expressions, (ii) Optical Character Recognition (OCR) to recover spatio-temporal information from maps and figures, and (iii) Large Language Models (LLMs) for full-document reasoning. In a second step, the framework will be applied in both an assistance mode (e.g., during the submission process) and a fully automatic mode (back catalogue of journals, publishers, conference series, etc.) for extracting spatio-temporal metadata. The extracted metadata could undergo different curation and validation steps and ultimately become available as part of a discipline-specific knowledge graph or generic academic databases. When such data exists on a large scale, one can explore an extension for scientific search portals, or improvements for handling spatio-temporal metadata throughout the whole research data management (RDM) cycle.
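
As a rough sketch of step (i), the following example uses spaCy's off-the-shelf English model to pull candidate place names and temporal expressions from an abstract; the model choice and sample text are illustrative assumptions, and the actual framework may combine NER with controlled gazetteers for validation.

```python
# Minimal NER sketch: extract candidate spatial (GPE/LOC/FAC) and
# temporal (DATE) entities from article text with spaCy.
import spacy

# Requires: python -m spacy download en_core_web_sm
nlp = spacy.load("en_core_web_sm")

text = (
    "Sediment cores were collected in Lake Constance between "
    "June 2019 and March 2021."
)

doc = nlp(text)
for ent in doc.ents:
    # Spatial labels hint at the "where", DATE at the "when".
    if ent.label_ in {"GPE", "LOC", "FAC", "DATE"}:
        print(ent.label_, "->", ent.text)
```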

How to cite: Niers, T. and Nüst, D.: Putting Science on the Map: Spatio-Temporal Metadata for Scientific Article Discovery, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-21334, https://doi.org/10.5194/egusphere-egu26-21334, 2026.

EGU26-22733 | Posters on site | ESSI2.5

DEEP Platform: Empowering Global Geoscientists in Data-Driven Research Era 

Linshu Hu, Yuanyuan Wang, and Zhenhong Du

Big Data and AI are reshaping Earth science discovery, yet deep-time geoscience still faces persistent barriers: scattered heterogeneous datasets, fragmented analysis tools, and limited end-to-end support for reproducible, collaborative workflows. These gaps make it difficult to harmonize data, knowledge, models, and computing across communities. We present DEEP (DDE Enabling and Empowering Platform), a one-stop online research platform under the Deep-time Digital Earth (DDE) program that provides a unified entry point (https://deep-time.org) to deep-time data, knowledge, models, and scalable computing services. DEEP aims to enable and empower collaborative innovation and discovery by global geoscientists by strengthening reproducibility across the research lifecycle under Open Science practices.

How to cite: Hu, L., Wang, Y., and Du, Z.: DEEP Platform: Empowering Global Geoscientists in Data-Driven Research Era, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-22733, https://doi.org/10.5194/egusphere-egu26-22733, 2026.

EGU26-7341 | ECS | Posters on site | ESSI2.6

ATMO-Insights: An Interoperable Platform for IAGOS, ACTRIS and ICOS Data 

Julie Patuel, Damien Boulanger, Valérie Thouret, Bruno Nicol, Hannah Clark, Lise Eder Murberg, and Alex Vermeulen

Integrating data from multiple atmospheric research infrastructures remains a significant challenge for Earth System Science. Each infrastructure typically maintains its own data formats, access protocols, and processing workflows, creating barriers for scientists seeking to conduct cross-cutting analyses. The ATMO-Insights service addresses this challenge by providing unified access to long-term observational data from three major European research infrastructures: ACTRIS (Aerosol, Clouds and Trace Gases Research Infrastructure), IAGOS (In-service Aircraft for a Global Observing System), and ICOS (Integrated Carbon Observation System).

Developed within the H2020 ATMO-ACCESS project, the ATMO-Insights service offers a web-based graphical interface for interactive exploration. The platform provides seamless access to quality-controlled Level 2 data from ACTRIS and ICOS ground-based stations, as well as Level 3 aggregated vertical profiles and regional data products from IAGOS aircraft measurements. All datasets are published under the Creative Commons Attribution 4.0 International license (CC BY 4.0), in line with FAIR data principles.

The service implements a comprehensive workflow including dataset discovery through Essential Climate Variables (ECVs), interactive data filtering, and a suite of statistical analysis methods. Users can perform exploratory analysis (means, percentiles, moving averages), trend estimation (linear regression, Mann-Kendall test, Theil-Sen slope), and multivariate analysis (2D/3D scatter plots, linear regression) with customizable parameters. Results are visualized through interactive annotated plots.
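
As an illustration of the trend estimators named above, the following sketch applies SciPy's Theil-Sen estimator and a Mann-Kendall-style test (via Kendall's tau, which is equivalent to the classical Mann-Kendall statistic for untied series) to a synthetic annual series; the platform's actual implementation and parameters may differ.

```python
import numpy as np
from scipy import stats

# Synthetic annual means with a weak positive trend plus noise.
rng = np.random.default_rng(0)
years = np.arange(2000, 2024)
values = 0.03 * (years - 2000) + rng.normal(0.0, 0.1, years.size)

# Theil-Sen slope with a 95% confidence interval.
slope, intercept, lo, hi = stats.theilslopes(values, years, 0.95)

# Kendall's tau against time as a Mann-Kendall-style trend test.
tau, p_value = stats.kendalltau(years, values)

print(f"Theil-Sen slope: {slope:.4f} [{lo:.4f}, {hi:.4f}] per year")
print(f"Kendall tau: {tau:.3f}, p = {p_value:.4f}")
```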

The platform enables researchers to combine atmospheric observations across different measurement platforms and geographical locations. This service demonstrates how research infrastructure interoperability can be achieved through unified data access layers, common processing workflows, and harmonized analysis tools, facilitating cross-disciplinary scientific applications in climate and air quality research.

The web interface is accessible at https://services.iagos-data.fr/atmo-access/timeseries.

How to cite: Patuel, J., Boulanger, D., Thouret, V., Nicol, B., Clark, H., Eder Murberg, L., and Vermeulen, A.: ATMO-Insights: An Interoperable Platform for IAGOS, ACTRIS and ICOS Data, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-7341, https://doi.org/10.5194/egusphere-egu26-7341, 2026.

EGU26-9855 | Orals | ESSI2.6

The EOSC Node Data Terra on the Earth system sciences: a way to consolidate a thematic data space 

Alessandro Rizzo, Erwan Bodéré, and Karim Ramage

The complexity of Earth, climate, environmental, and biological systems and processes, together with the significant improvement in multi-modal and multi-source data resolution and precision, implies that any scientific approach focusing on a specific area or dimension of the Earth system must increasingly integrate information and data from multiple fields of investigation. Today, it is crucial to apply multi- and interdisciplinary approaches that require easy access to qualified long-term data from other domains, as well as to data products that are easily usable by non-specialists. With this in mind, major challenges are linked to scientific knowledge of measurement data from spaceborne, airborne, and in-situ experiments, as well as from numerical models; uncertainties regarding future drivers of environmental transitions; and the effectiveness of sustainable measures in the context of evolving norms and values. These gaps and challenges particularly concern data quality and data veracity. The growing requirements for data in terms of timeliness of supply, availability across multiple spatial and temporal scales, length and stability of data records, and data product generation necessitate compliance with quality standards on the one hand. On the other hand, user support, documentation, and training materials are equally essential to ensure that data usage is truly effective, operational, and aligned with user needs. Further progress is required in terms of semantic and technical interoperability, particularly between climate, environmental, and socio-economic data. From a technical perspective, despite significant efforts already undertaken (e.g. OGC, INSPIRE), the current setup remains suboptimal in several respects.

Considering these assumptions, the recently established EOSC Node Data Terra, oriented toward Earth system scientific domains, aims to facilitate seamless access to high-quality, trusted, FAIR, and AI-ready multi-domain and multi-modal data for Earth, climate, environment, and biodiversity systems. This access is supported by rich metadata, semantic interoperability, and provenance information. The node also enables cross-domain data analysis workflows that are crucial for addressing emerging and urgent multidisciplinary research challenges related to global change, adaptation, extreme event characterisation, and societal impacts, while strengthening linkages with other data spaces and data hubs at European and global scales. Finally, through the implementation of a system-of-systems approach, the EOSC Node can support and participate in the consolidation of a thematic data space in close collaboration with other national and European environment-related infrastructures, fostering linkages with European organisations and initiatives such as Destination Earth, AI Factories, and the HPC federation.

How to cite: Rizzo, A., Bodéré, E., and Ramage, K.: The EOSC Node Data Terra on the Earth system sciences: a way to consolidate a thematic data space, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-9855, https://doi.org/10.5194/egusphere-egu26-9855, 2026.

EGU26-10237 | Orals | ESSI2.6

Blueprints for Semantic Integration and AI-Readiness (BITS 2.0) 

Andrea Lammert, Ivonne Anders, Anette Ganske, Sandra Geisler, Angelina Kraft, Claudia Martens, Hela Mehrtens, Emanuel Söding, Hannes Thiemann, Claus Weiland, and Alexander Wolodkin

While the goal for Earth System Sciences (ESS) is a seamless, machine-actionable and AI-ready data ecosystem, the reality is different. Current infrastructures are often isolated into siloed “islands” and large data “continents” (Figure 1), separated by inconsistent technical standards and metadata conventions – a “data archipelago”. Despite major investments from consortia like NFDI4Earth in Germany and Data Terra in France, these fragments are only loosely connected by semantic bridges, leaving the ESS community with a landscape of scattered repositories rather than a unified digital environment.

To address this growing fragmentation of the ESS data landscape, the BITS 2.0 project aims to establish a Semantic Fabric for ESS. Instead of merely linking individual repositories, this approach overlays heterogeneous data holdings with an intelligent, shared semantic layer. Building on the original BITS project, which successfully established a quality-controlled hub of ESS terminologies, BITS 2.0 will develop advanced, AI-powered data annotation services combined with a sustainable, community-driven governance model.

These tools will analyze and enrich diverse data assets – ranging from well-curated repositories and institutional data lakes to individual catalogues – with consistent, interoperable, and machine-actionable metadata. For researchers, this substantially lowers barriers to data discovery, integration, and reuse across sources, enabling more efficient workflows and robust cross-domain analyses. As an initial implementation, BITS 2.0 packages will be deployed on various types of data holdings. These include “data continents”, characterised by large, highly standardized data volumes serving multiple use cases (DKRZ), and “data islands”, consisting of smaller, project-specific datasets with heterogeneous or inconsistent standardization (GEOMAR). Based on these developments, BITS 2.0 will develop AI-empowered Blueprints 2.0 that provide a broadly transferable methodology for semantic integration across these scenarios (SGN, RWTH, TIB).

BITS 2.0 is envisioned as a trusted semantic enabler for the emerging hybrid ESS data space, providing the essential “semantic glue” required for meaningful interoperability. By transforming a fragmented infrastructure landscape into a coherent, searchable knowledge space, BITS 2.0 will support the combined use of larger and more diverse datasets to address complex Earth System research questions.

Figure 1: The scattered landscape of ESS data infrastructures, with varying challenges for semantic integration and AI-readiness depending on architectural design, depicted here as “data islands”, “data continents” and “data archipelagos”. 

How to cite: Lammert, A., Anders, I., Ganske, A., Geisler, S., Kraft, A., Martens, C., Mehrtens, H., Söding, E., Thiemann, H., Weiland, C., and Wolodkin, A.: Blueprints for Semantic Integration and AI-Readiness (BITS 2.0), EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-10237, https://doi.org/10.5194/egusphere-egu26-10237, 2026.

EGU26-10632 | ECS | Posters on site | ESSI2.6

FAIR Research Data Management for Complex Land-Atmosphere Observations & Modelling: The LAFI–GLAFO Approach 

Jonathan Minz, Astrid Ziemann, Thomas Schwitalla, Lisa Jach, Oliver Branch, Marcus Breil, Matthias Mauder, and Volker Wulfmeyer

The FAIR management, documentation, and publication of heterogeneous datasets remain key challenges in land–atmosphere (L–A) interaction research, particularly for data derived from complex three-dimensional observational systems and high-resolution modelling. A new generation of Global Energy and Water Exchanges (GEWEX) Land–Atmosphere Feedback Observatories (GLAFOs) is expected to routinely generate such data. The GLAFO prototype at the University of Hohenheim, Stuttgart, is already operational, producing advanced multi-sensor observations and high-resolution model outputs within the DFG Research Unit 5639, Land–Atmosphere Feedback Initiative (LAFI). Here, we present the research data management approach developed within LAFI to ensure FAIR-compliant handling of these complex datasets, enabling effective collaboration and accelerating scientific discovery.

Addressing LAFI’s scientific aim of closing key knowledge gaps in land–atmosphere (L-A) feedbacks that limit the accuracy of weather and climate simulations is critically dependent on robust research data management. In particular, effective data standardization, interoperability, and reliable data access are required to support seamless collaboration across LAFI’s highly interdisciplinary and international community, spanning atmospheric, soil and agricultural sciences, hydrology, bio-geophysics, and neuro-informatics.

To address these requirements, ongoing activities focus on the standardization of diverse datasets to enable straightforward inter-comparison. In line with FAIR principles, LAFI datasets are being converted into Climate and Forecast (CF) convention–compliant NetCDF files and stored on a secure server hosted by the University of Hohenheim. Initially, all standardized data are freely accessible to LAFI researchers, with plans for broader public access via an API service and/or web portal. Associated data conversion and processing workflows, including Python scripts and documentation, are managed through GitLab.
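
A minimal sketch of such a conversion with xarray follows; the variable, attributes, and values are illustrative placeholders under assumed conventions, not LAFI's actual workflow.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Illustrative conversion of a station time series into a CF-style
# NetCDF file; names, attributes, and values are placeholders.
time = pd.date_range("2024-06-01", periods=48, freq="30min")
lhf = np.random.default_rng(1).normal(120.0, 30.0, time.size)

ds = xr.Dataset(
    {
        "hfls": (
            "time",
            lhf,
            {
                # CF standard name and canonical units for latent heat flux.
                "standard_name": "surface_upward_latent_heat_flux",
                "units": "W m-2",
                "long_name": "Surface latent heat flux",
            },
        )
    },
    coords={"time": ("time", time)},
    attrs={"Conventions": "CF-1.10", "title": "Example L-A flux series"},
)

ds.to_netcdf("example_flux.nc")
```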

In parallel, the research data management team collaborates with international initiatives such as obs4MIPs to enable the use of LAFI observations for climate model evaluation, including the development of protocols for advanced instrumentation such as Doppler, Raman, and water vapor differential absorption lidars. Beyond documenting best practices, current efforts emphasize the development of training and tutorial materials to support knowledge transfer to the wider community. These activities are aligned with broader initiatives within the German National Research Data Infrastructure (NFDI), including the NFDI4Earth service portfolio, to support FAIR-compliant dissemination across Earth system sciences.

We will present insights from ongoing research data management activities, discuss key challenges encountered, outline potential solutions, and share ideas for leveraging the potential of large-scale AI tools and generative AI. These experiences are intended to contribute constructively to improved Earth system understanding and modelling, broader discussions on research data management and shared challenges, and the development of harmonized guidance for effective scientific data stewardship through the European Open Science Cloud.

How to cite: Minz, J., Ziemann, A., Schwitalla, T., Jach, L., Branch, O., Breil, M., Mauder, M., and Wulfmeyer, V.: FAIR Research Data Management for Complex Land-Atmosphere Observations & Modelling: The LAFI–GLAFO Approach, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-10632, https://doi.org/10.5194/egusphere-egu26-10632, 2026.

EGU26-10661 | Orals | ESSI2.6

Fostering Curiosity-Driven Research on the Solid Earth: the Geo-INQUIRE project 

Fabrice Cotton, Angelo Strollo, Helle Pedersen, Laurentiu Danciu, Florian Haslinger, Marc Urvois, Volker Rohling, Stefano Lorito, Andrey Babeyko, Daniele Bailo, Jan Michalek, Otto Lange, Javier Quinteros, Gaetano Festa, Shane Murphy, M. Majdański, Iris Christadler, Elif Türker, Stefanie Weege, and Mateus Litwin Prestes

Since 2022, researchers from 51 European institutions have been collaborating on Geo-INQUIRE, a multidisciplinary Horizon Europe project. This initiative aims to enhance, provide access to, and integrate key datasets, big data streams, and High-Performance Computing (HPC) tools critical for studying temporal variations in the solid Earth, forecasting multi-hazards, and analysing interactions between the solid Earth and its surrounding environments, including the ocean and atmosphere. 

Geo-INQUIRE seeks to overcome cross-domain barriers, particularly in the land–sea–atmosphere continuum, by leveraging cutting-edge data management techniques, advanced modelling and simulation methods, developments in AI and big data, and the extension of existing data infrastructures. The project focuses on disseminating these resources to the wider scientific community, aligning them with the European Open Science Cloud (EOSC) framework. Although many of these resources already exhibit a high level of maturity, Geo-INQUIRE has advanced them by improving availability, quality, and spatial and temporal resolution. The initiative emphasizes adherence to FAIR (Findable, Accessible, Interoperable, Reusable) principles, the adoption of open standards and licences, and the promotion of cross-disciplinary interoperability. Integration of diverse datasets, including new observables, products, and services, is optimized through targeted activities in seven test beds. These test beds also host workshops and summer schools, providing hands-on training and engagement with project resources.

We highlight key scientific achievements, including participation by over 2,300 scientists in seminars and training activities and improved access to new datasets. We also examine new collaborative frameworks designed to increase diversity and encourage interdisciplinary research, and address the challenges of developing FAIR-compliant infrastructures adapted to machine-learning-driven science.

Finally, we discuss how national programmes could support alignment of national infrastructures with European-level integration to maximise the impact and sustainability of cross-domain data sharing and joint services. Experience from Geo-INQUIRE shows that sustained coordination mechanisms, shared access frameworks (e.g. Transnational Access), and targeted support for interoperability and training are essential for effective cross-domain integration and long-term community uptake, and could therefore also be embedded in national funding and governance schemes.

How to cite: Cotton, F., Strollo, A., Pedersen, H., Danciu, L., Haslinger, F., Urvois, M., Rohling, V., Lorito, S., Babeyko, A., Bailo, D., Michalek, J., Lange, O., Quinteros, J., Festa, G., Murphy, S., Majdański, M., Christadler, I., Türker, E., Weege, S., and Litwin Prestes, M.: Fostering Curiosity-Driven Research on the Solid Earth: the Geo-INQUIRE project, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-10661, https://doi.org/10.5194/egusphere-egu26-10661, 2026.

EGU26-11348 | Orals | ESSI2.6

LifeWatch ERIC as Catalyst and Connector: Scaling FAIR, AI-Ready Data Across Biodiversity, Ecosystems, and Climate Adaptation within EOSC 

Anne Fouilloux, Joaquín López Lerida, Antonio José Sáenz Albanés, Christos Arvanitidis, Lucia Vaira, and Zhiming Zhao

Earth System Science increasingly requires seamless integration across disciplines and domains. Leveraging AI and generative AI demands data that is interoperable in machine-actionable ways. Yet making data truly AI-ready, structured for automated discovery, integration, analysis and reuse, requires coordination across research infrastructures.

LifeWatch ERIC addresses this challenge by acting not as a standalone data repository, but as a catalyst for innovation and a connector across research infrastructures. In this role, LifeWatch ERIC provides analytical, semantic, and workflow-level bridges that enable data, services, and knowledge from infrastructures such as DiSSCo, eLTER, EMBRC-ERIC, AnaEE-ERIC and DANUBIUS ERIC, as well as global aggregators like GBIF, EMODnet and OBIS, to be combined into coherent, science- and policy-relevant networks. Concretely, LifeWatch ERIC provides a computational and semantic integration layer that turns distributed datasets and services into reusable workflows aligned with the EOSC Interoperability Framework. We enable cross-RI composition through shared APIs, provenance-aware processing, and machine-readable descriptions of variables and methods, so that the same analytical logic can be executed across countries, domains, and observation systems.

This integrative role is realised through several ongoing initiatives: (a) within ENVRI-Hub NEXT, LifeWatch ERIC collaborates with Data Terra, ICOS, ACTRIS, and other environmental RIs to deliver interdisciplinary services through the emerging ENVRI EOSC thematic Node, directly addressing cross-compartment data integration for environmental research; (b) through FAIR2Adapt (coordinated by LifeWatch ERIC), we are developing a FAIRification Framework for creating FAIR Digital Objects, demonstrated through six case studies spanning coastal ecosystem modelling in the Bay of Biscay, urban climate risk assessment in Hamburg, and national climate change adaptation hub development; (c) within EOSC Beyond, where LifeWatch ERIC is one of the 10 pilot nodes, we show how we can jointly support research communities thanks to the integration and interoperability between EOSC and Data Spaces by exploiting federating capabilities, and (d) through OSTrails, LifeWatch ERIC contributes to the design and piloting of end-to-end Plan–Track–Assess pathways, linking machine-actionable DMPs, Scientific Knowledge Graphs and FAIR assessment services, and demonstrating how environmental research infrastructures can operationalise FAIR-by-design workflows within EOSC.

We present concrete approaches to AI-readiness, grounded in existing research practice: Discrete Global Grid Systems (DGGS) for providing analysis-ready, multi-resolution data structures that unify heterogeneous sources into AI-accessible formats; AI-assisted metadata population reducing manual curation burden; and semantic interoperability through I-ADOPT, structuring variables into machine-readable components that enable cross-dataset discovery regardless of naming conventions. Rather than positioning AI as an end in itself, these demonstrate how research infrastructures can jointly shape EOSC for transnational, cross-domain challenges. To support trustworthy AI applications, we capture data licensing, provenance, quality signals, and uncertainty as first-class, machine-actionable metadata, including transparent records of when generative AI has contributed to metadata enrichment and whether human validation has been applied. This ensures that automation accelerates curation without weakening scientific accountability.
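
As a minimal, hypothetical illustration of DGGS-based indexing, the sketch below uses Uber's H3 library (assuming the h3-py v4 API); H3 is only one of several DGGS implementations, and the grid system and resolution actually used may differ.

```python
# Minimal DGGS indexing sketch with Uber's H3 (h3-py, v4 API assumed).
import h3

# Two observations from heterogeneous sources at roughly the same place.
obs_a = (43.36, -8.41)  # latitude, longitude
obs_b = (43.37, -8.40)

# Index both into resolution-7 hexagonal cells (~5 km^2 per cell).
cell_a = h3.latlng_to_cell(*obs_a, 7)
cell_b = h3.latlng_to_cell(*obs_b, 7)

# Once indexed, spatial joins across datasets reduce to comparing
# cell IDs instead of running geometry operations.
print(cell_a, cell_b, cell_a == cell_b)
```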

How to cite: Fouilloux, A., López Lerida, J., Sáenz Albanés, A. J., Arvanitidis, C., Vaira, L., and Zhao, Z.: LifeWatch ERIC as Catalyst and Connector: Scaling FAIR, AI-Ready Data Across Biodiversity, Ecosystems, and Climate Adaptation within EOSC, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-11348, https://doi.org/10.5194/egusphere-egu26-11348, 2026.

EGU26-11974 | Posters on site | ESSI2.6

Lost in Translation? How Terminologies Improve Data Discovery in Earth System Science 

Aenne Loehden, Claudia Martens, and Andrea Lammert

The volume of research data has been increasing rapidly for years, driven by technological developments and the growing recognition of data as a primary research output. At the same time, many studies require substantial financial and technical investments, research data are often not reproducible due to transient environmental conditions or political constraints, and scientific questions increasingly span multiple disciplines. These challenges are particularly pronounced in Earth System Science (ESS), which combines high spatial and temporal resolution observations and simulations with strong societal relevance, given the global impact of climate change on virtually all aspects of everyday life. Together, these factors underscore the urgent need for improved interoperability and reusability of scientific information.

Despite ongoing efforts, though, discovering relevant research data in domain-specific repositories often still requires detailed knowledge of disciplinary conventions, terminology, and practices for documentation, communication, and retrieval. As one scientist aptly stated: 'You have to know what you are looking for in order to find something useful.' This poses a significant barrier to cross-disciplinary research and limits the reuse potential of existing data.

Building on the previous comics in this series, which introduced the role of ontologies and terminologies in improving (cross-disciplinary) search, the third comic focuses on concrete enhancements to data discovery at the World Data Center for Climate (WDCC). In continuation of last year’s work, additional features to facilitate search and discovery are being explored, again relying on the systematic use of terminologies provided through a terminology service.

Terminologies constitute a core element of the technical language used within scientific communities. Key characteristics include well-defined concerted technical terms with unambiguous identifiers, rich descriptions, and explicit relationships both within a single terminology and across different terminologies. Their use supports standardization while enabling interoperability between datasets originating from different domains, disciplines, and research communities.

Terminology services (TSs) sustainably provide access to such terminologies through graphical user interfaces (GUIs) and application programming interfaces (APIs). They offer centralised, up-to-date information on the terminologies and their terms, properties, and semantic relationships, and can be seamlessly integrated into data repositories and search infrastructures. By incorporating terminology-based search and exploration features, repositories such as WDCC can lower entry barriers for users, support semantic search, and ultimately improve the findability and reuse of research data.
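
To illustrate how a repository might call such a terminology service API, a hypothetical sketch follows; the endpoint, parameters, and response structure are invented for illustration and do not correspond to any specific service's actual API.

```python
import requests

# Hypothetical terminology-service lookup; the URL, query parameters,
# and JSON shape are placeholders, not a real service's API.
BASE_URL = "https://terminology.example.org/api"

def expand_search_term(term: str) -> list[str]:
    """Return the term plus labels of related concepts for semantic search."""
    resp = requests.get(
        f"{BASE_URL}/search",
        params={"q": term, "include": "synonyms,narrower"},
        timeout=10,
    )
    resp.raise_for_status()
    labels = {term}
    for concept in resp.json().get("results", []):
        labels.update(concept.get("labels", []))
    return sorted(labels)

# A repository could expand a user query such as "sea surface temperature"
# to also match datasets tagged with synonymous or narrower terms.
```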

Through the comic format, this contribution motivates and illustrates these concepts, demonstrating how terminologies and terminology services can support FAIR data principles in practice and how semantic technologies can bridge disciplinary boundaries in Earth System Science.

How to cite: Loehden, A., Martens, C., and Lammert, A.: Lost in Translation? How Terminologies Improve Data Discovery in Earth System Science, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-11974, https://doi.org/10.5194/egusphere-egu26-11974, 2026.

EGU26-16351 | Orals | ESSI2.6

Interoperable Research Infrastructures for Earth System Science: Lessons from ICOS 

Hannele Laine, Alex Vermeulen, Leonard Rivier, Dario Papale, Richard Sanders, Jonathan Thiry, Ute Karstens, and Margareta Hellström

Advancing knowledge discovery in Earth System Science requires research infrastructures to provide not only high-quality and reusable data, but also interoperable, machine-actionable services that can be combined across domains and scaled for data-intensive and AI-driven research. The Integrated Carbon Observation System (ICOS) delivers harmonised, long-term observations of greenhouse gas concentrations and fluxes across atmosphere, land ecosystems and oceans, underpinned by common standards, persistent identifiers, rich metadata and open access services aligned with FAIR principles. These characteristics position ICOS as a mature contributor to the European Open Science Cloud (EOSC) and as a practical example of how domain-oriented research infrastructures can support cross-disciplinary and AI-ready science.

In this contribution, we reflect on ICOS experience in implementing EOSC-relevant recommendations and discuss how these can foster cross–research infrastructure interoperability and support large-scale AI applications. We highlight how EOSC thematic nodes can act as coordination and integration layers that align standards, workflows and services across multiple research infrastructures, lowering barriers for cross-domain discovery and reuse. Building on concrete Earth system use cases, we outline how ICOS could contribute to such a thematic node by providing interoperable data services, domain expertise and reference implementations for AI-ready workflows. We further identify remaining gaps in semantic alignment, service orchestration and scalable access, and formulate recommendations for strengthening EOSC as a federated ecosystem capable of supporting next-generation, data- and AI-driven Earth System Science.

How to cite: Laine, H., Vermeulen, A., Rivier, L., Papale, D., Sanders, R., Thiry, J., Karstens, U., and Hellström, M.: Interoperable Research Infrastructures for Earth System Science: Lessons from ICOS, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-16351, https://doi.org/10.5194/egusphere-egu26-16351, 2026.

EGU26-17280 | Orals | ESSI2.6

The Real Bottleneck in Earth System Science Is Not Data but Federation 

Anca Hienola, Ulrich Bundke, Marta Gutierrez, Andreas Petzold, Delphine Dobler, and Federico Drago

European environmental Research Infrastructures (RIs) collectively produce some of the most valuable in-situ observations for Earth System Science. Yet, despite widespread adoption of FAIR principles, the ability to actually combine, reuse, and operationalise these data across infrastructures remains limited. Differences in mandates, standards, and access mechanisms continue to translate into practical barriers for transnational and cross-domain research, particularly for scientific questions that span atmosphere, ocean, land, biodiversity, and solid Earth processes.

This contribution argues that the bottleneck in Earth System Science is no longer data availability, but federation capability. Within the European Open Science Cloud (EOSC), the ENVRI Node positions itself as a response to this gap by acting as a thematic federation layer for in-situ environmental research infrastructures. Central to this approach is the ENVRI-hub, which provides a shared integration environment enabling coordinated discovery, access, and interoperability across multiple RIs without centralising control or diluting infrastructure mandates.

We present concrete examples where integrating services from multiple providers through the ENVRI-hub enables new federation-level products, such as cross-domain catalogues and semantic discovery services, that cannot be delivered by single infrastructures alone. These examples highlight how interoperability, rather than new data production, becomes the key enabler for scientific progress.

Using selected Earth System Science use cases, we deliberately expose where current infrastructure boundaries fail to meet research needs, including limitations in harmonising in-situ observations, aligning access policies, and supporting machine-actionable, AI-ready data. These gaps point to use cases that cannot be solved by incremental improvements within individual infrastructures, but require coordinated action across them.

The ENVRI Node is presented as a practical, and intentionally opinionated, experiment in how international research infrastructures can move beyond coexistence towards federation, raising the question of whether future Earth System Science can afford not to.

How to cite: Hienola, A., Bundke, U., Gutierrez, M., Petzold, A., Dobler, D., and Drago, F.: The Real Bottleneck in Earth System Science Is Not Data but Federation, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-17280, https://doi.org/10.5194/egusphere-egu26-17280, 2026.

EGU26-17782 | Posters on site | ESSI2.6

NFDI4Earth – building the future 

Valentina Protopopova-Kakar, Wolfgang zu Castell, Sören Lorenz, Andrea Lammert, and Hannes Thiemann

The NFDI4Earth, an Earth System Sciences (ESS) consortium within the German National Research Data Infrastructure (NFDI), aims to mature and expand its service ecosystem by strengthening service integration, governance, and technical foundations. Central to this effort is Service Portfolio Management, which coordinates the integration of existing and new services with key national and European infrastructures such as the Helmholtz Earth & Environment DataHub, the NFDI, and the European Open Science Cloud (EOSC). A Technical Review Board is intended to enable participative and evidence-based decision-making by defining technical requirements, evaluating solutions, and scouting relevant infrastructures. In parallel, NFDI4Earth aims to enhance its core services with Artificial Intelligence (AI) and High-Performance Computing (HPC) capabilities to improve functionality and direct user interaction.

To ensure long-term sustainability, NFDI4Earth aims to establish a robust technical design and operational model for its service portfolio. This includes continuous improvement of the technical backbone, maintenance, further development of core services and embedding of research outcomes. An Agile Development Team is planned to implement service integration tasks and act as Service Stewards to embed services into active research projects. This structure ensures alignment with architectural standards, interoperability, and evolving user and infrastructure requirements.

User engagement and support are addressed through the expansion of the User Support Network (USN), which provides expertise in ESS data management and serves as a bridge between users and developers. The USN is planned to integrate AI-based assistance, including a Large Language Model (LLM)-powered support bot, to improve responsiveness and usability while reducing routine support load. It also aims to play a key role in usability testing, feedback management, and collaboration with other NFDI consortia toward a federated, NFDI-wide support service.

Finally, innovation and advanced technology integration are envisioned to be conducted by the Experimental Tech Lab. This measure aims to develop AI-enabled services such as natural language-based data discovery, geospatial foundation models, and direct data interaction tools. It also seeks to integrate HPC resources and standardized workflows to support large-scale data processing and modeling. Together, these activities ensure that NFDI4Earth services remain cutting-edge, scalable, and well aligned with both national and European research data infrastructures.

How to cite: Protopopova-Kakar, V., zu Castell, W., Lorenz, S., Lammert, A., and Thiemann, H.: NFDI4Earth – building the future, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-17782, https://doi.org/10.5194/egusphere-egu26-17782, 2026.

EGU26-18391 | Orals | ESSI2.6

ACTRIS enabling transnational access to atmospheric data for European Earth System Science 

Eija Juurola and the ACTRIS RI Committee and Experts

Advancing Earth System Science (ESS) requires seamless access to interoperable, high-quality observational data across domains, scales and national boundaries. The Aerosol, Clouds and Trace Gases Research Infrastructure (ACTRIS) contributes to this goal by providing harmonised atmospheric composition and cloud observations through a distributed European network of National Facilities, supported by central calibration, quality assurance and FAIR data and services.

The ACTRIS Central Facilities aggregate ground-based in-situ and remote-sensing measurements of aerosols, clouds and reactive trace gases and deliver them through the ACTRIS Data Centre using standardised metadata, persistent identifiers and interoperable interfaces. These features enable transnational data reuse and facilitate integration with other environmental research infrastructures and modelling frameworks within the European Open Science Cloud (EOSC).

Among the important services are the ACTRIS Virtual Research Environment (VRE) and data streaming. The ACTRIS VRE enables efficient discovery, access, and scientific analysis of long-term observational data from ACTRIS National Facilities as well as from other ground-based observational networks such as EMEP, EARLINET, Cloudnet and GAW. It facilitates analyses such as the calculation of climatologies, long-term trend assessments, and the combination of datasets within the ACTRIS domain. A further pilot service, already available, provides machine-to-machine access and streaming of data.

How to cite: Juurola, E. and the ACTRIS RI Committee and Experts: ACTRIS enabling transnational access to atmospheric data for European Earth System Science, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-18391, https://doi.org/10.5194/egusphere-egu26-18391, 2026.

EGU26-18653 | Posters on site | ESSI2.6

Significance and Future of Data Infrastructures for the Geochemical Research Community   

Gerhard Wörner, Marthe Klöcking, Kerstin Lehnert, and Kirsten Elger and the DIGIS-GEOROC Team

The GEOROC and PetDB databases have provided peer-reviewed geochemical data on igneous rocks, minerals and related materials for more than 25 years, covering the full range of igneous compositions, mantle xenoliths and minerals. Combined, they provide access to more than 48.2 million individual data values from around 27,000 publications through web applications for search, filtering and download. These comprehensive datasets support large-scale regional and global geochemical data-based research spanning traditional geochemical studies to data-driven and machine-learning approaches.

GEOROC’s holdings have reached over 40.8 million data values from more than 23,000 publications, focusing on ocean islands, continents and subduction zones. The PetDB database complements this with ca. 7.4 million data values for igneous and metamorphic samples from the ocean floor, ophiolites, mantle xenoliths, tephra, and arc rocks.

The DIGIS project is modernizing the GEOROC data infrastructure in alignment with FAIR principles by introducing a new API, an improved web interface, and a unified data model. Further, topical global collections of data are extracted into individual DOI-minted data sets that are regularly updated from the GEOROC data holdings. These compilations and additional author-contributed data sets with rich metadata are accessed through GFZ Data Services. GEOROC has recently been reconnected with the updated GeoReM database on geochemical reference materials. PetDB is part of the EarthChem data services and the IEDA2 data facility. PetDB was migrated to a new architecture, and a new, simplified search interface was released in 2025 to improve usability. EarthChem also offers repository services where researchers can publish and archive their data.

Based on close collaboration between PetDB and GEOROC, the EarthChem Portal (ECP) has for nearly 20 years provided a central access point to the content of both databases, as well as several smaller databases. Today, nearly 50 million data values are accessible through the ECP.

While the EPOS data resources are strong on geophysical (and other types of) data, EPOS has lacked a systematic inclusion of geochemical data from rocks on the European continent. The data services that the geochemical research community provides on the geochemical compositions of rocks, minerals and ore deposits globally have the potential to become a strong contribution to the EPOS data platform. To this end, we offer collaboration with EPOS to provide access points for two types of geoscience data: curated geochemical data on rocks and minerals in a domain-specific database, and large compiled data sets on specific types of rocks and minerals and/or from specific geological or geographic settings in the DIGIS-GEOROC repository at GFZ Data Services.

This also requires further development: under the umbrella of OneGeochemistry and NFDI4Earth, DIGIS, EarthChem and other initiatives such as the Australian Geochemistry Network are developing authoritative vocabularies and metadata standards, as well as interoperability and integration across different global geochemical databases. Together, we are also developing tools for data quality assessment to improve data usability. These advances broaden the applicability of geochemical data beyond hard-rock-oriented research to fields such as environmental science, archaeology and geohealth, demonstrating how FAIR-aligned geochemical infrastructures enhance reproducible research in Earth System Science and interdisciplinary collaboration.

How to cite: Wörner, G., Klöcking, M., Lehnert, K., and Elger, K. and the DIGIS-GEOROC Team: Significance and Future of Data Infrastructures for the Geochemical Research Community  , EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-18653, https://doi.org/10.5194/egusphere-egu26-18653, 2026.

EGU26-19384 | Posters on site | ESSI2.6

Towards a FAIR Water Data Ecosystem: The OneWater FAIR Water Data Platform 

Anne Puissant, Sylvain Grellet, Isabelle Braud, Mario Adam, Fanny Arnaud, Hélène Bressan, Véronique Chaffard, Charly Coussot, Stéphane Debard, Jérôme Fozzani, Yvan Le Bras, Eric Lecaudé, Kenneth Maussang, Frédéric Moine, Stéphane Ollagnier, Hervé Squividiant, Joel Sudre, and Lucas Valarcher

The French national research and innovation program OneWater – Eau Bien Commun (2022–2032) addresses key scientific and societal challenges related to the protection and sustainable management of water as a common good. The program brings together a large number of interdisciplinary projects producing highly heterogeneous datasets, including in-situ sensor measurements, Earth Observation products, model outputs, samples, and social/citizen science data. These datasets are complemented by long-term observations from research infrastructures, environmental observatories, and national public monitoring services. However, a significant part of these data is not yet compliant with the FAIR principles (Findable, Accessible, Interoperable, Reusable), limiting their reuse and cross-disciplinary exploitation.

To address these challenges, the OneWater FAIR Water Data Platform, maintained by the DATA TERRA research infrastructure through its thematic Datahub on Continental Surfaces THEIA, aims to go beyond a traditional data catalogue by fostering a FAIR Water Data ecosystem based on international standards and semantic interoperability. The platform promotes the production of FAIR-compliant data by design, enabling efficient data sharing, integration, and reuse across scientific and operational communities.

On top of research data, the OneWater Data Platform also interfaces with national public policy data services and associated monitoring networks. At the international level, the initiative contributes to and benefits from the global FAIR Water community through collaborations with the OGC Hydrology Domain Working Group, WMO, UNEP, UNESCO-IGRAC, eLTER, TERENO, and the Water4All partnership, ensuring alignment with international best practices.

This contribution presents the OneWater FAIR approach, including: (i) the definition of a framework to achieve high FAIRness levels for water data by interpreting the FAIR principles in the context of existing standards and best practices (OGC, W3C, INSPIRE, RDA); (ii) the development of FAIR Implementation Profiles and FAIRness analysis templates applied to datasets from the French water community (research, public monitoring), including THEIA/OZCAR; (iii) the design of a FAIR Data Platform architecture relying on state-of-the-art interoperability standards, open-source solutions, and recent FAIR Open Science prototyping initiatives; and (iv) active support to help water observatories climb the “stairway to FAIR”.

How to cite: Puissant, A., Grellet, S., Braud, I., Adam, M., Arnaud, F., Bressan, H., Chaffard, V., Coussot, C., Debard, S., Fozzani, J., Le Bras, Y., Lecaudé, E., Maussang, K., Moine, F., Ollagnier, S., Squividiant, H., Sudre, J., and Valarcher, L.: Towards a FAIR Water Data Ecosystem: The OneWater FAIR Water Data Platform, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-19384, https://doi.org/10.5194/egusphere-egu26-19384, 2026.

EGU26-19486 | Posters on site | ESSI2.6

Creation of a Public Dataspace for Earth System Data at Jülich Supercomputing Centre 

Carsten Hinz, Sabine Grießbach, Lars Hoffmann, Enxhi Kreshpa, Kameswar Modali, Karsten Peters-von Gehlen, Konstantin Rushchanskii, Rajveer Saini, Olaf Stein, and Martin Schultz

Jülich Supercomputing Centre (JSC) is forging a public dataspace for Earth system data. Data will be made available on both storage clusters at JSC, ExaStore and the Jülich Storage Cluster (JUST), which provide petabyte-scale storage to the exascale system JUPITER and to our pre-exascale systems, respectively. We provide insights into the ongoing implementation of new data management services as well as the selected tools for data access. This also covers the creation of a metadata catalog based on the SpatioTemporal Asset Catalog (STAC) specifications.

Background:

Improvements in computational speed lead to better simulations in Earth System Modeling (ESM) by allowing models to resolve scales of a few kilometers. The volume of the resulting data increases greatly with resolution and poses challenges for data processing and storage.

A use case currently gaining popularity in ESM is the training of machine learning (ML) models for weather and climate applications. Such models require fast access to datasets, which is supported by a dedicated internal dataset structure, with anemoi-zarr being a prominent format.

Numerical and ML applications demand easy and FAIR access to datasets. Simplifying subsequent data processing and analysis requires access without the need to create individual local copies, either through shared storage or through access over the web.
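
A minimal sketch of such copy-free access, assuming xarray with the Zarr engine; the paths, URL, and variable name below are placeholders, not actual JSC stores:

```python
import xarray as xr

# Open a (hypothetical) Zarr store directly, either from a shared
# parallel file system or over HTTP, without creating a local copy.
ds_disk = xr.open_zarr("/p/data/meteocloud/example/era5.zarr")  # placeholder path
ds_http = xr.open_zarr("https://data.example.org/era5.zarr")    # placeholder URL

# Lazy, chunked access: only the requested slice is actually read.
subset = ds_disk["t2m"].sel(time="2024-01")  # "t2m" is a placeholder variable
print(subset)
```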

JSC is a multipurpose high performance computing (HPC) center with ESM being a major user group. With Europe's first exascale system JUPITER, JSC has become the host for a second HPC infrastructure including the dedicated storage cluster ExaStore. ExaStore is designed to provide the high bandwidth, low latency and scalability required to efficiently support data-intensive workloads on JUPITER.

Jülich MeteoCloud is a central data repository for meteorological data on JUST, which is accessible from our pre-exascale systems, such as JUWELS and JURECA-DC. It covers a wide range of datasets, from reanalysis data to satellite observations, currently amounting to about 4 PB in total. With the extension to ExaStore, we introduce a new branch for ML-ready datasets. The limited overall storage capacity at JSC calls for a reduction of data duplicates, in particular across project data spaces, and requires services for data movement as well as on-demand staging of ML-ready datasets.

Within the WarmWorld Easier project, JSC and the German Climate Computing Center (DKRZ) co-develop and deploy services for data access. A core aspect is the findability of data, which is ensured with STAC. Each asset provides the information necessary to open the dataset described by the corresponding catalog entry in a specific way, e.g. via a file path when accessing from disk or via a URL when accessing through a web service.
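
The following is a hedged sketch of what such a catalog entry could look like using the pystac library; the item ID, geometry, paths, URLs, and media types are placeholders, not actual MeteoCloud entries.

```python
from datetime import datetime, timezone
import pystac

# Illustrative STAC Item; ID, geometry, paths and URLs are placeholders.
item = pystac.Item(
    id="era5-t2m-2024-01",
    geometry={
        "type": "Polygon",
        "coordinates": [[[-180, -90], [180, -90], [180, 90],
                         [-180, 90], [-180, -90]]],
    },
    bbox=[-180, -90, 180, 90],
    datetime=datetime(2024, 1, 1, tzinfo=timezone.utc),
    properties={},
)

# Two assets describing alternative ways to open the same dataset:
# a file path for on-disk access from the HPC systems, and a URL for
# access through a web service.
item.add_asset(
    "disk",
    pystac.Asset(href="/p/data/meteocloud/era5/t2m_2024_01.zarr",
                 media_type="application/vnd+zarr", roles=["data"]),
)
item.add_asset(
    "https",
    pystac.Asset(href="https://data.example.org/era5/t2m_2024_01.zarr",
                 media_type="application/vnd+zarr", roles=["data"]),
)

print(list(item.to_dict()["assets"]))
```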

With a combination of these approaches we will improve the infrastructure for Earth system sciences at JSC and provide reliable, low-latency access to stored datasets. As a first use case we will include ML-ready datasets for the WeatherGenerator project in the MeteoCloud.

How to cite: Hinz, C., Grießbach, S., Hoffmann, L., Kreshpa, E., Modali, K., Peters-von Gehlen, K., Rushchanskii, K., Saini, R., Stein, O., and Schultz, M.: Creation of a Public Dataspace for Earth System Data at Jülich Supercomputing Centre, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-19486, https://doi.org/10.5194/egusphere-egu26-19486, 2026.

EGU26-19835 * | Orals | ESSI2.6 | Highlight

Strengthening Europe’s sovereignty and interoperability in Earth observation data 

Caroline Blanke, Frédéric Huynh, Wolfgang zu Castell, Jean-Philippe Malet, Sébastien Payan, and Thierry Bidot

Europe’s Earth and environmental observation landscape is increasingly structured around strong national initiatives that consolidate data assets, computing resources and services in close interaction with scientific communities. Initiatives such as Data Terra RI, ITINERIS Hub and NFDI4Earth illustrate how national investments organise complex ecosystems, combining EOSC nodes, advanced computing capacities and domain-specific services within coherent operational environments.

Building on these foundations, the focus goes beyond data interoperability towards designing infrastructures where data, computing capacities and services are articulated from the outset. Artificial intelligence, and in particular generative AI, acts as a key enabler by transforming FAIR data into machine-actionable resources that support cross-domain integration and operational workflows.

This evolution points towards a capability-oriented European ecosystem, where EOSC, Copernicus, Destination Earth, EuroHPC and AI Factories function as complementary layers enabling reuse and transnational collaboration. In this context, national initiatives serve as practical enablers of coherence, providing the conditions for trusted, scalable and sustainable services supporting science, public policy and societal needs.

The objective of this presentation is to highlight the Data Terra and NFDI4Earth vision of establishing strategic relationships among EOSC, national RIs and European RIs for the benefit of the Earth and environmental science communities.

How to cite: Blanke, C., Huynh, F., zu Castell, W., Malet, J.-P., Payan, S., and Bidot, T.: Strengthening Europe’s sovereignty and interoperability in Earth observation data, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-19835, https://doi.org/10.5194/egusphere-egu26-19835, 2026.

EGU26-20309 | Posters on site | ESSI2.6

Building an AI-Ready National Dataspace for Exploiting Massive Fiber-Optic DAS Data 

Margaux Mouchené, Gwenaël Caër, Jean-Philippe Malet, Clément Hibert, Jérôme Detoc, Karim Ramage, Erwan Bodéré, Antoine Cunin, Emmanuel Chaljub, and Erwann Quimbert

Massive Fiber-Optic Distributed Acoustic Sensing (FO-DAS) data streams pose major challenges in archiving, dissemination, and exploitation due to their extreme data rates, long acquisition durations, and high spatio-temporal resolution. Efficient storage is constrained by bandwidth, cost, and metadata standardization, while dissemination is limited by network capacity and interoperability. Scientific exploitation is further hindered by the need for scalable preprocessing, real-time analytics, and robust noise characterization to extract actionable signals from petabyte-scale, heterogeneous datasets.

This contribution presents the DATA TERRA (FormaTerre, Odatis, THEIA) approach to describing, storing, disseminating and exploiting massive FO-DAS datasets through the GAIA-Data distributed data and computing infrastructure. Key infrastructure aspects are presented that allow the construction of a national, AI-ready FO-DAS dataspace enabling easy and interactive exploitation of massive FO-DAS data for seismological source identification, event characterization and seismic parameter estimation, generalizing across volcanoes, glaciers, fault zones, landslides, and urban areas.

FO-DAS bottlenecks are addressed via AI-driven compression (e.g. variational autoencoders), selective archiving, and data augmentation to ensure scalable monitoring. Integration of the dataspace into the DATA TERRA EOSC node will ensure interoperability with other national (NFDI4Earth) and European research infrastructures (EPOS, EMSO, eLTER).

How to cite: Mouchené, M., Caër, G., Malet, J.-P., Hibert, C., Detoc, J., Ramage, K., Bodéré, E., Cunin, A., Chaljub, E., and Quimbert, E.: Building an AI-Ready National Dataspace for Exploiting Massive Fiber-Optic DAS Data, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-20309, https://doi.org/10.5194/egusphere-egu26-20309, 2026.

EGU26-20997 | Orals | ESSI2.6

Bridging the fragmentation gap in Earth System Science: ITINERIS as a blueprint for national RI consolidation and integration 

Giuseppe Gargano, Rosa Maria Petracca Altieri, Simone Gagliardi, Lucia Saganeiti, Quinzia Palazzo, Lucia Mona, Claudio Dema, Ermann Ripepi, Michele Volini, and Carmela Cornacchia

Addressing complex Earth System Science challenges is currently hindered by a pervasive fragmentation of the research ecosystem. This disconnect extends beyond data dispersion to include siloed organizational structures and isolated disciplinary communities, limiting the potential for holistic environmental analysis and cross-domain innovation.
 
In response to the session's call for successful synergy examples, we present ITINERIS (Italian Integrated Environmental Research Infrastructures System). ITINERIS serves as a strategic operational model for shaping the Research Infrastructure (RI) landscape by integrating the Italian national nodes of 22 RIs across four critical domains: atmosphere, marine, terrestrial biosphere, and geosphere. This network encompasses ESFRI Landmarks (ACTRIS, EMSO, ICOS, Euro-Argo and LifeWatch), ESFRI Projects (e.g., eLTER, DANUBIUS), EU RIs (e.g., ECORD), and key national RIs (e.g., the Laura Bassi research ship).
 
We demonstrate how ITINERIS contributes to the European Open Science Cloud (EOSC) through three core pillars:

•    The ITINERIS HUB: An integrated digital platform that transforms fragmented RI repositories into a unified discovery and analysis layer. By combining a centralized metadata catalogue (populated via automated harvesting) with thematic Virtual Research Environments and advanced access and training services, the HUB acts as a fundamental building block of the Italian EOSC Node, enabling advanced analyses, AI‑driven workflows and models across more than 500,000 environmental datasets, backed by a vast array of services and resources from diverse RIs.
•    Cross-disciplinary synergies: Innovative use cases that overcome domain boundaries, such as Nature-Based Solutions and climate mitigation. These scenarios demonstrate the power of combining atmospheric observations with marine and terrestrial ecosystem data, enabling a holistic assessment of environmental compartments that was previously unattainable due to infrastructure silos.
•    National Access Framework: A harmonized operational framework serving as a blueprint for future nationally-funded access programs. Building on the success of the ITINERIS-ACTRIS pilot call, which tested a unified governance model for physical, remote, virtual and hybrid access, this model provides a validated approach for reducing administrative barriers and expanding access opportunities for the wider user community.
 
We invite the community to explore ITINERIS as a replicable model for national aggregation strategies, discussing the governance and sustainability challenges of this multi-stakeholder initiative and sharing best practices for other national clusters. 
By aligning national strategies with European standards like FAIR and ENVRI-FAIR, ITINERIS provides a validated roadmap and a scalable template for 'joining forces' to build a unified, interoperable European environmental research landscape.

How to cite: Gargano, G., Petracca Altieri, R. M., Gagliardi, S., Saganeiti, L., Palazzo, Q., Mona, L., Dema, C., Ripepi, E., Volini, M., and Cornacchia, C.: Bridging the fragmentation gap in Earth System Science: ITINERIS as a blueprint for national RI consolidation and integration, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-20997, https://doi.org/10.5194/egusphere-egu26-20997, 2026.

EGU26-21056 | Posters on site | ESSI2.6

Co-design a FAIR data framework through an International data platform for a better tropical biodiversity forest management: the case study of the One Forest Vision initiative

Olivier Norvez, Florence Palla, Anne Puissant, Yvan Le Bras, Laurent Durieux, Camille Lacroux, and Tiphaine Degoute

The One Forest Vision Initiative (OFVi) was introduced at the One Forest Summit held in Libreville in March 2023 and formalized within the Libreville Plan. It contributes to international negotiations on tropical forest conservation and aligns with the objectives of the Paris Agreement and the Kunming–Montreal Global Biodiversity Framework (COP15), notably the target to protect 30% of the Earth’s terrestrial and marine areas by 2030. OFVi is closely linked to the Country Packages (CPs) launched at the G7 Summit in Hiroshima in May 2023, which emerged from coordination between the Positive Conservation Partnerships proposed by France at COP27 and the Forest and Climate Leaders’ Partnership. Within this framework, OFVi provides structured scientific support to the research components of several CP signatory countries.

The initiative is led by six major French research organisations—CEA, CIRAD, CNRS, INRAE, IRD and MNHN—and coordinated by INRAE, CIRAD and IRD. It mobilises higher education institutions through joint research units and national research infrastructures.

OFVi aims to strengthen scientific capacities in tropical forest countries through cooperative research partnerships grounded in internationally recognised scientific standards. A core component of the initiative is the development of data access and processing services, including spatial and in situ observations, value-added products and open knowledge compliant with FAIR principles (Findable, Accessible, Interoperable and Reusable). Supported by shared data and expertise infrastructures, this approach ensures full national sovereignty over the entire lifecycle of scientific data related to forest conservation.

The initiative promotes interdisciplinary and transdisciplinary research approaches that integrate climate regulation, biodiversity conservation, water resources, and the rights and knowledge of Indigenous Peoples and local communities. Its scientific outputs are designed to directly support national conservation strategies implemented within the CP framework.

This contribution presents this original initiative supporting tropical forest conservation by generating and integrating distributed data infrastructures based on FAIR and CARE approaches, including: i) the transfer of reference datasets, ii) the monitoring of environmental change and progress, iii) the co-production of knowledge with local communities, iv) capacity building for data production and use in partner countries, and v) the promotion of best practices in data management and openness in line with FAIR principles.

Finally, OFVi supports the development of national interdisciplinary data infrastructures for research and conservation, in close connection with the European Open Science Cloud (EOSC) ecosystem.

How to cite: Norvez, O., Palla, F., Puissant, A., Le Bras, Y., Durieux, L., Lacroux, C., and Degoute, T.: Co-design a FAIR data framework through an International data platform for a better tropical biodiversity forest management: the case study of the One Forest Vision initiative, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-21056, https://doi.org/10.5194/egusphere-egu26-21056, 2026.

EGU26-22877 | ESSI2.6

Bridging legally mandated national spatial data infrastructures with European Earth system science: the German Geoportal and transnational coordination

K.-P. Wenz

European Earth system science increasingly depends on seamless access to authoritative and interoperable spatial data across administrative and national borders. While many European research infrastructures focus on scientific data production, long-term sustainability and reproducibility also require the systematic inclusion of legally mandated public-sector geodata, which can complement initiatives such as the European Open Science Cloud (EOSC). This contribution presents how Germany’s Spatial Data Infrastructure (GDI-DE), through its national discovery and access portal Geoportal.de and its European coordination activities, supports this integration.

Geoportal.de is Germany’s central discovery and access portal for geospatial data and services provided by federal, state and local authorities. It implements a federated, legally grounded model in which hundreds of data providers publish INSPIRE-compliant metadata and services through standardized interfaces. This allows scientists to discover, evaluate and access authoritative reference data, environmental monitoring data and thematic geodata that are often not available through purely research-driven infrastructures, thereby providing a stable backbone for FAIR-aligned access to public-sector Earth system data in Germany.

Beyond national service provision, the German Coordination Office for Spatial Data Infrastructure actively contributes to European technical coordination through the MIG-T, the permanent technical subgroup of the INSPIRE Maintenance and Implementation Group (MIG). Within this framework, the GDI-DE coordination office coordinated the German stakeholder involvement in the recent INSPIRE consolidation process, which aimed at simplifying and modernizing the INSPIRE framework while ensuring continuity for operational infrastructures. This national coordination complemented parallel activities across Europe, helping to align national and European perspectives on the future of INSPIRE.

We argue that the combination of legally mandated national SDIs and European-level technical coordination is a key enabler for Earth system science, providing long-term, quality-controlled and interoperable access to public-sector geodata while allowing research infrastructures to focus on scientific value generation.

How to cite: Wenz, K.-P.: Bridging legally mandated national spatial data infrastructures with European Earth system science: the German Geoportal and transnational coordination, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-22877, https://doi.org/10.5194/egusphere-egu26-22877, 2026.

EGU26-3821 | Orals | ESSI2.7

Reproducible and Scalable cloud-native EO data analysis using openEO  

Pratichhya Sharma, Hans Vanrompay, and Jeroen Dries

Earth Observation (EO) data plays a crucial role in research and applications related to environmental monitoring, enabling informed decision-making. However, the continuously increasing volume and diversity of EO data, distributed across multiple platforms and varying formats, pose challenges for easy access and the development of scalable and reproducible workflows.

openEO addresses these challenges by providing a community-driven, open standard for unified access to EO data and cloud-native processing capabilities. It enables researchers to develop interoperable, scalable and reproducible workflows that can be executed from various programming languages (Python, R or JavaScript).

openEO has become a cornerstone technology across major initiatives in agriculture, natural capital accounting, and land-cover monitoring. In ESA’s WorldCereal project, it provides the scalable framework needed to process global Sentinel-1 and Sentinel-2 time series and integrate advanced machine-learning models, enabling dynamic 10-meter cropland and crop-type maps. It also supports the Copernicus Global Land Cover service and its tropical forestry component by delivering consistent and repeatable processing chains for annual 10-meter land-cover products, which are crucial for policy reporting and SDG monitoring. Beyond land cover, openEO supports efforts like ESA's World Ecosystem Extent Dynamics project by creating reproducible ecosystem-extent mapping and change detection maps — key elements for biodiversity and environmental management.

Building on this foundation, the openEO Federation, now integrated within the Copernicus Data Space Ecosystem (CDSE), provides seamless access to distributed Earth observation data and processing resources through a single, unified interface. By connecting multiple backends, it removes the need to juggle separate accounts or APIs and enables cross-platform workflows over datasets hosted by platforms such as Terrascope and CDSE.
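
To illustrate, a minimal openEO Python client sketch against the CDSE entry point might look as follows (the collection and band names follow CDSE conventions but should be checked against the live catalogue):

```python
import openeo

# Connect to the CDSE openEO backend and authenticate interactively.
connection = openeo.connect("openeo.dataspace.copernicus.eu").authenticate_oidc()

# Build a small NDVI workflow; it is executed server-side on download.
cube = connection.load_collection(
    "SENTINEL2_L2A",
    spatial_extent={"west": 4.0, "south": 51.0, "east": 4.1, "north": 51.1},
    temporal_extent=["2025-05-01", "2025-08-31"],
    bands=["B04", "B08"],
)
ndvi = cube.ndvi(red="B04", nir="B08").reduce_dimension(dimension="t", reducer="mean")
ndvi.download("ndvi_mean.tiff")
```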

openEO also strongly supports FAIR (Findable, Accessible, Interoperable, Reusable) principles. It exposes rich metadata, relies on standardised processes, and encourages the use of reusable workflow definitions. This promotes transparency, reproducibility, and the sharing of algorithms and data across research and operational communities. The approach has been validated in several large-scale implementations, including ESA’s WorldCereal and the JRC’s Copernicus Global Land Cover and Tropical Forestry Mapping and Monitoring Service (LCFM), demonstrating its maturity for both research and production environments.

By enabling reusable, federated, and reproducible Earth observation workflows, openEO is helping to build a more interoperable and efficient computational ecosystem, one that supports scalable innovation, collaboration, and long-term operational monitoring. Therefore, in this session, we aim to spark discussion on how openEO enables federated, FAIR-compliant, and reproducible workflow approaches for large-scale Earth observation applications.

How to cite: Sharma, P., Vanrompay, H., and Dries, J.: Reproducible and Scalable cloud-native EO data analysis using openEO , EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-3821, https://doi.org/10.5194/egusphere-egu26-3821, 2026.

EGU26-5728 | ECS | Posters on site | ESSI2.7

Parallel HPC workflow orchestration with Nextflow, supported by CI/CD and containerization tools for global high resolution evaporation modelling 

Joppe Massant, Oscar Baez-Villanueva, Kwint Delbaere, Diego Fernandez Prieto, and Diego Miralles

The Global Land Evaporation Amsterdam Model (GLEAM) estimates daily land evaporation using a wide range of Earth observation forcing datasets. In the project GLEAM-HR, funded by the European Space Agency (ESA), we aim to create a global high-resolution daily evaporation dataset at 1 km for a period of eight years (2016–2023). To produce high-resolution evaporation estimates, all forcing data must be processed at 1 km resolution, requiring substantial computational resources. As the complete high-resolution forcing data no longer fits within the memory capacity of single HPC nodes, parallelization tools are necessary. To achieve this parallelization in a seamless way, a workflow orchestration ecosystem is designed that leverages Zarr, Apptainer and Nextflow.

The Zarr ecosystem allows a dataset to be written to in parallel with little effort. Nextflow is an orchestration tool that allows dynamic job submissions, where the configuration of jobs can depend on the outcome of earlier jobs, such as the spatial domain to be processed. Apptainer is a containerization tool developed for HPC environments, allowing a “build once, deploy anywhere” approach. Combining these tools yields a workflow orchestration environment that enables the automation of these parallel workflows while optimizing job sizes for a given HPC environment.
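
The parallel-write pattern at the heart of this design can be sketched as follows (a minimal illustration with hypothetical names and sizes, not the actual GLEAM-HR code): the orchestrator initializes the store once, and each task then writes only its own region.

```python
import numpy as np
import xarray as xr

# Step 1 (run once by the orchestrator): lazily initialize the full store,
# writing only metadata, so workers can later fill disjoint regions.
template = xr.Dataset(
    {"E": (("lat", "lon"), np.zeros((400, 400), dtype="float32"))}
).chunk({"lat": 100, "lon": 100})
template.to_zarr("gleam_hr.zarr", mode="w", compute=False)

# Step 2 (run per task, in parallel): compute one tile and write only its
# region; tiles aligned with chunk boundaries do not contend with each other.
tile = xr.Dataset({"E": (("lat", "lon"), np.random.rand(100, 100).astype("float32"))})
tile.to_zarr("gleam_hr.zarr", mode="r+", region={"lat": slice(0, 100), "lon": slice(0, 100)})
```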

The use of containers allows this workflow to be ported to different hardware without the need to set up all the environments again, making the designed workflow fully reproducible independent of the computing environment. By combining this with Continuous Integration and Continuous Delivery (CI/CD) practices to automate container building and deployment, code development and workflow execution can be cleanly separated.

In a first test case, this processing workflow is used to produce global datasets of leaf area index (LAI), fraction of absorbed photosynthetically active radiation (FPAR) and vegetation cover fractions at 1 km resolution. Future work focuses on the extension of this workflow to the other forcing datasets and to the execution of the entire pipeline.

How to cite: Massant, J., Baez-Villanueva, O., Delbaere, K., Fernandez Prieto, D., and Miralles, D.: Parallel HPC workflow orchestration with Nextflow, supported by CI/CD and containerization tools for global high resolution evaporation modelling, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-5728, https://doi.org/10.5194/egusphere-egu26-5728, 2026.

EGU26-6238 | ECS | Posters on site | ESSI2.7

A prototype Open-Source data-processing pipeline to efficiently combine in-situ data with remote-sensing observations of the Earth 

Robert Reinecke, Annemarie Bäthge, David Noack, Matthias Zink, Simon Mischel, and Stephan Dietrich

In situ and remote sensing data are crucial in Earth sciences, as they provide complementary perspectives on environmental phenomena. In situ data, collected directly at the Earth’s surface, offer high accuracy and detailed insights into local conditions, enabling precise measurements of variables such as soil moisture, temperature, and pollutant levels. Remote sensing data, conversely, provide extensive spatial coverage and the ability to monitor changes over time across vast areas, capturing large-scale patterns and trends that in situ data alone cannot reveal. By combining these two data sources and automatically preprocessing them into Analysis-Ready Data, researchers can enhance scientific insights, improve the robustness of machine learning applications, and refine models used to predict environmental changes or assess the impacts of human activity on natural systems. This integrated approach promotes a more comprehensive understanding of complex Earth processes, enabling better-informed decision-making and effective management strategies for sustainable development. However, preprocessing and combining in situ data from different sources can be highly complex, especially for global datasets. Joining these data with remotely sensed products may require substantial computational resources, given the large number of observational records and high temporal resolutions. Here, we present CULTIVATE, a prototype open-source data-processing pipeline that efficiently cleans in situ records and combines them with remote sensing data to create an automatically curated database. As new in situ data records are inserted, CULTIVATE updates only those records in the final database. In this presentation, we showcase CULTIVATE for over 200,000 global groundwater well observation time series that are merged with an extensive list of other time-series products, and we show how data curators can interact with the data-processing pipeline. We further discuss how this prototype can serve as a blueprint for future architecture development for Research Data Infrastructures, how international standards can be implemented and enforced, and how global datacenters can be enabled to use automated data preparation in operational settings.
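
The incremental-update idea can be illustrated with a small sketch (column names are hypothetical; this is not the actual CULTIVATE implementation):

```python
import pandas as pd

def upsert_records(curated: pd.DataFrame, incoming: pd.DataFrame,
                   key: str = "record_id") -> pd.DataFrame:
    """Merge freshly ingested in situ records into the curated table,
    adding new rows and overwriting changed ones, so untouched records
    are never reprocessed. Illustrative sketch only."""
    curated = curated.set_index(key)
    incoming = incoming.set_index(key)
    return incoming.combine_first(curated).reset_index()

old = pd.DataFrame({"record_id": [1, 2], "level_m": [3.2, 4.1]})
new = pd.DataFrame({"record_id": [2, 3], "level_m": [4.3, 5.0]})
print(upsert_records(old, new))  # record 2 updated, record 3 appended
```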

How to cite: Reinecke, R., Bäthge, A., Noack, D., Zink, M., Mischel, S., and Dietrich, S.: A prototype Open-Source data-processing pipeline to efficiently combine in-situ data with remote-sensing observations of the Earth, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-6238, https://doi.org/10.5194/egusphere-egu26-6238, 2026.

EGU26-7115 | ECS | Posters on site | ESSI2.7

STeMP: Spatio-Temporal Modelling Protocol 

Jan Linnenbrink, Jakub Nowosad, Marvin Ludwig, and Hanna Meyer

Spatio-temporal predictive modelling is a key method in the geosciences. Often, machine learning, which can handle complex, non-linear and interacting relationships, is preferred over classical (geo)statistical models. However, machine learning models are often perceived as "black boxes", meaning that it is hard to understand their inner workings. Furthermore, there are several pitfalls associated with the application of machine learning models in general, and spatio-temporal machine learning models in particular. One example is the spatial autocorrelation inherent in spatial data, which complicates data splitting for model validation.

It is therefore key to report spatio-temporal models transparently. Transparent reporting facilitates interpreting, evaluating and reproducing spatio-temporal models, and can be used to determine their suitability for a specific research question. Standardized model protocols are particularly valuable in this context, as they document model parameters, decisions and assumptions. While such protocols exist for machine learning models in general (e.g., Model Cards, REFORMS), as well as for specific domains like species distribution modelling (ODMAP), they are lacking in the general field of spatio-temporal modelling.

Here, we present ideas for STeMP (Spatio-Temporal Modelling Protocol), a protocol for spatio-temporal models that fills this gap. The protocol is designed to be beneficial for all parties involved in the modelling process, including model developers, maintainers, reviewers, and end-users. The protocol is implemented as a web application and is structured in three sections: Overview, Model and Prediction. The Overview section contains general metadata, while the following two sections go into more detail. The Model section includes modules describing, for example, the predictors, model validation procedures, and software. The optional Prediction section contains information about the prediction domain, map evaluation, and uncertainty assessment.

To make the protocol useful during model development, warnings are raised when common pitfalls are encountered (e.g., if an unsuitable cross-validation strategy is used). These warnings can be automatically retrieved from a filled protocol, spotlighting potential issues and helping authors and reviewers. Moreover, we optionally support generating automated reports and inspection figures from user-provided inputs (e.g., from model objects as well as from training and test data sets). The protocol is hosted on GitHub (https://github.com/LOEK-RS/STeMP) and hence open to flexible incorporation of feedback from the broader community.

With our presentation, we aim to encourage the discussion of our proposed model report in the spatio-temporal modelling community.

How to cite: Linnenbrink, J., Nowosad, J., Ludwig, M., and Meyer, H.: STeMP: Spatio-Temporal Modelling Protocol, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-7115, https://doi.org/10.5194/egusphere-egu26-7115, 2026.

EGU26-9928 | Orals | ESSI2.7

Research data infrastructure evolution for handling km scale simulations of a warming world 

Kameswarrao Modali, Karsten Peters-von Gehlen, Fabian Wachsmann, Florian Ziemen, Carsten Hinz, Rajveer Saini, and Siddhant Tibrewal

With the advancement of technical capabilities, Earth System Models (ESM) are rapidly moving toward much higher spatial resolutions - down to kilometer scale - to better capture key processes and feedbacks needed for robust climate impact assessments. This growing model complexity places significant demands on data infrastructures, which must evolve to support widespread application of high-resolution simulations.

This evolution is needed across all stages of the ESM simulation data life cycle: from the choice of variables included in the simulation output and the output format, through residence periods and data transfer across the active storage tiers, to the final movement to the cold storage tier (tape) for long-term archival. Tools to handle the discoverability of these data must also be developed and implemented. The evolution of the infrastructure must furthermore take hardware constraints into account and should ideally be in line with the FAIR principles.

As part of the WarmWorld Easier project, these developments comprised the adaptation of the model output to Zarr, a cloud-native format; the development of bespoke tools like ‘zarranalyzer’ to handle the movement of data across storage tiers by creating tarballs also suitable for tape; the creation of reference files for these tarballs in Parquet format to summarize the entire dataset; and the incorporation of these into a metadata catalog following the SpatioTemporal Asset Catalog (STAC) standard. Finally, a virtual machine was set up to host the STAC catalog, with appropriate access rights for data providers and data curators within the federated structure, as well as for end users.

Applying this data handling concept to km-scale ESM data bridges the gap between infrastructures that produce flagship datasets and those that enable their efficient and reliable reuse by the community. For example, data generated at large, compute-focused HPC centers with limited storage could be transferred to partner centers that provide specialized data services for long-term access and reuse. 

Through the federated and seamless setup of the research data infrastructure, data handling matters are abstracted away from the data users. Hence, the developed setup provides an end-to-end solution, achieving the objective of delivering km-scale ESM simulation output to a broader scientific community tackling the urgent societal problems arising from a warming planet.

How to cite: Modali, K., Peters-von Gehlen, K., Wachsmann, F., Ziemen, F., Hinz, C., Saini, R., and Tibrewal, S.: Research data infrastructure evolution for handling km scale simulations of a warming world, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-9928, https://doi.org/10.5194/egusphere-egu26-9928, 2026.

EGU26-11128 | Orals | ESSI2.7

Antflow: Simplifying Workflow Sharing and Execution for Digital Twins 

Nicolas Choplain and Gaudissart Vincent

Antflow is a next-generation orchestration and publication framework designed to streamline the operational deployment of Earth Observation (EO) processing workflows, particularly within Digital Twin environments. By automating the transformation of scientific code into interoperable, shareable, and scalable services, Antflow removes the traditional barriers between algorithm development and production-grade execution.

At its core, Antflow enables scientists and developers to publish complex workflows directly from their Git repositories, using OGC Earth Observation Application Packages (EOAP) as the workflow definition mechanism. These EOAP descriptions allow Antflow to instantly expose workflows as OGC API Processes services, enriched with dynamic user interfaces and STAC-compliant cataloguing of outputs. This ensures that every workflow - no matter how experimental or mature - can be discovered, reused, and integrated across Digital Twin platforms.
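
For readers unfamiliar with the standard, executing such a service follows the generic OGC API - Processes pattern sketched below (the endpoint, process id and inputs are hypothetical placeholders; only the request shape comes from the standard):

```python
import requests

# Execute a published process asynchronously (OGC API - Processes).
# Endpoint, process id and inputs are hypothetical placeholders.
url = "https://antflow.example.org/ogcapi/processes/ndvi-composite/execution"
resp = requests.post(
    url,
    json={"inputs": {"aoi": "POLYGON((4 51, 4.1 51, 4.1 51.1, 4 51.1, 4 51))",
                     "start": "2025-06-01", "end": "2025-06-30"}},
    headers={"Prefer": "respond-async"},
    timeout=60,
)
# For asynchronous execution the server returns a job-status URL to poll.
print(resp.status_code, resp.headers.get("Location"))
```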

Antflow’s hybrid orchestration engine distributes tasks across heterogeneous computing environments, from HPC clusters to cloud-native nodes. Git-based lineage guarantees traceability and scientific integrity, while integrated multi-provider retrieval mechanisms (EODAG) simplify access to EO data sources.

A key strength of Antflow is its ability to generate interactive user interfaces automatically. These interfaces allow domain experts, integrators, and end-users to parameterize, run, and monitor workflows through clean, intuitive views.

Antflow is currently used across several projects (CNES Digital Twin Factory, OGC Open Science Persistent Demonstrator). It acts as a middleware layer that bridges algorithm design, operational integration, and stakeholder consumption. By standardizing workflow publication, ensuring reproducibility, and supporting scalable execution, it accelerates the deployment of modelling chains such as 3D environmental reconstruction, forecasting, and multi-sensor analysis workflows.

How to cite: Choplain, N. and Vincent, G.: Antflow: Simplifying Workflow Sharing and Execution for Digital Twins, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-11128, https://doi.org/10.5194/egusphere-egu26-11128, 2026.

EGU26-11759 | ECS | Posters on site | ESSI2.7

Accelerating Earth System Workflows with In Situ Workflow Task Management 

Manuel Giménez de Castro Marciani, Mario Acosta, Gladys Utrera, Miguel Castrillo, and Mohamed Wahib

Modern experimentation with Earth System Models (ESMs) is accelerated by the use of automated workflows to handle the multiple steps, such as simulation execution, post-processing, and cleanup, while remaining portable and tracking provenance. When executing on shared HPC platforms, users usually face long queue times, which increase the time to solution. The community has proposed aggregating workflow tasks into a single submission in order to save queue time, with promising results. By doing this, however, the workflow manager has to handle the remote task execution that would otherwise be done by the HPC scheduler.

Therefore, we propose to integrate two workflow managers to create a versatile and general solution for the execution of these aggregated workflows: one that orchestrates the workflow globally and another that is in charge of running tasks within an allocation, which we refer to as "in situ."

In this work, we performed a qualitative and quantitative comparison of three suitable and representative workflow and workload managers running in situ (HyperQueue, Flux, and PyCOMPSs) on three of the top 20 HPC systems: LUMI, MareNostrum 5, and Fugaku. In the qualitative part, we evaluated the portability and setup, failure tolerance, programmability, and provenance tracking of each tool. In the quantitative part, we measured total runtime, task runtime, CPU and memory usage, disk writes, and node imbalance for workflows running a memory-bound, a CPU-bound, and an IO-intensive application.

Our initial results yield recommendations to the community as to which workflow manager to use in situ. HyperQueue's easy installation and portability make it the best solution for non-x86 platforms. Flux had the easiest running setup, as it is designed to run nested within Slurm. Finally, PyCOMPSs is the only tool of the three to provide provenance tracking with RO-Crates.

How to cite: Giménez de Castro Marciani, M., Acosta, M., Utrera, G., Castrillo, M., and Wahib, M.: Accelerating Earth System Workflows with In Situ Workflow Task Management, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-11759, https://doi.org/10.5194/egusphere-egu26-11759, 2026.

EGU26-12058 | ECS | Orals | ESSI2.7

Optimizing the Destination Earth Workflow with in situ HPC Task Orchestration 

Pablo Goitia, Manuel Giménez de Castro Marciani, and Miguel Castrillo

Traditionally, climate simulations are executed on High-Performance Computing (HPC) platforms, organized in workflows that involve all the steps for the complete execution of the model, data processing, and management tasks. With the sustained increase in the computing capacity of these machines over the years, the accuracy and resolution of climate simulations have reached levels never seen before.

In this context, the European Commission launched the Destination Earth initiative, aimed at developing a digital twin of the Earth to support adaptation to climate change. This initiative seeks to operationalize the running of very high-resolution climate simulations that are coupled with applications consuming their data as they are produced. In order to address the challenge of processing the hundreds of terabytes that each simulation involves, the ClimateDT project implemented a data streaming approach. This means that any delay between the production time of the climate model data and the subsequent consumption by the post-processing applications results in a workflow misalignment, leading to unacceptable delays in the total execution time. This poses unprecedented challenges on the workflow management side.

One of the main causes of the misalignments that commonly occur lies in the long time that each of the many thousands of tasks of the workflow spends in the queues of the HPC job schedulers, such as Slurm. To address this issue, the community proposed to aggregate workflow tasks into a single submission to the HPC without altering their execution logic—a technique known as task aggregation. Previous studies have demonstrated the effectiveness of this approach for climate workflows, yielding promising results. However, the current implementation is limited, as the task execution within an allocation still relies on the workflow manager, which is not able to perform the fine-grained workflow orchestration that a dedicated tool could do in a convenient way.

To overcome this limitation, we propose in this work to integrate existing HPC software, such as the well-known Flux Framework and Parsl, into the Autosubmit Workflow Manager to enable in situ orchestration of aggregated tasks. This integration aims to abstract both developers and users from the complexity of managing supercomputing resources, providing an easy-to-use interface. The proposed approach is validated using the Destination Earth workflow to enable more complex, structured forms of task aggregation while reducing queue times in large-scale simulations.
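
As a flavour of the in situ execution model (a minimal Parsl sketch, not the actual Autosubmit integration), many small tasks can run as a pilot job inside one allocation instead of passing through the batch queue individually:

```python
import parsl
from parsl import python_app
from parsl.config import Config
from parsl.executors import HighThroughputExecutor

# One executor inside the current allocation; the tasks below never
# touch the batch queue individually.
parsl.load(Config(executors=[HighThroughputExecutor(label="in_situ")]))

@python_app
def postprocess(member: int) -> int:
    # Stand-in for a real post-processing step of one ensemble member.
    return member ** 2

futures = [postprocess(m) for m in range(32)]
print(sum(f.result() for f in futures))
```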

How to cite: Goitia, P., Giménez de Castro Marciani, M., and Castrillo, M.: Optimizing the Destination Earth Workflow with in situ HPC Task Orchestration, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-12058, https://doi.org/10.5194/egusphere-egu26-12058, 2026.

EGU26-14853 | ECS | Posters on site | ESSI2.7

AutoML: A Flexible and Scalable HPC Framework for Efficient Machine Learning in Atmospheric Modelling 

Isidre Mas Magre, Hervé Petetin, Alessio Melli, James Petticrew, Michael Orieux, Miguel Hortelano, Luiggi Tenorio, and David Mathas

The integration of Machine Learning (ML) into Earth System Sciences has revolutionized predictive modeling. However, the transition from local prototyping to large-scale deployment is often hindered by fragmented codebases and the manual overhead of managing complex hyperparameter tuning on High-Performance Computing (HPC) clusters. We present AutoML, a framework developed to automate and standardize the ML lifecycle in HPC environments by leveraging the open-source Autosubmit workflow manager.

AutoML employs a configuration-driven architecture that decouples model logic from workflow execution. By utilizing Autosubmit’s proven capability to handle complex dependencies and remote HPC environments, AutoML allows researchers to scale experiments—from initial prototyping to production-level global pipelines—through a single configuration file. This approach directly addresses the challenge of experiment reproducibility and efficiency within ML projects. The framework automates critical steps in the typical ML workflow, including hyperparameter search space optimization, multi-node distributed training, and dynamic resource allocation on heterogeneous HPC architectures.

We demonstrate the framework’s utility through Atmospheric Composition applications at the Barcelona Supercomputing Center (BSC). By providing a standardized structural template, AutoML fosters collaboration and ensures that advancements in machine learning for atmospheric science are scalable, computationally efficient, and transferable across research lines.

How to cite: Mas Magre, I., Petetin, H., Melli, A., Petticrew, J., Orieux, M., Hortelano, M., Tenorio, L., and Mathas, D.: AutoML: A Flexible and Scalable HPC Framework for Efficient Machine Learning in Atmospheric Modelling, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-14853, https://doi.org/10.5194/egusphere-egu26-14853, 2026.

EGU26-15002 | Orals | ESSI2.7

Toward Federated Agentic Workflows for Numerical Weather Prediction With Chiltepin 

Christopher Harrop and Isidora Jankov

The development of efficient, scalable, and interoperable workflow management systems is critical for supporting reproducible research to drive the scientific advancement of Earth system modeling capabilities. Many workflow systems targeted at Earth system science have been developed to meet that challenge, each having similar capabilities as well as some unique strengths. However, the Earth system modeling community now faces additional challenges that impose new requirements. The landscapes of both high performance computing (HPC) environments and numerical modeling are evolving rapidly. HPC systems are composed of a growing diversity of hardware architectures that may be hosted on-prem or by a variety of cloud vendors. Earth system model components are also increasing in diversity as research to augment or replace traditional physics-based models with machine learning models progresses. Additionally, a growing diversity of end-users with varying levels of knowledge and expertise require agentic workflows that can respond to their requests. A consequence of this rapid growth in diversity is a growing need to run workflows that span multiple systems in order to optimize data locality and access to resources that maximize the performance of specific model components. The availability of, and requirement for, diversity naturally leads to a requirement for federated workflows that effectively harness the computational power of a diverse set of resources distributed both geographically and across multiple administrative domains. In this presentation, we introduce and report our progress on the development of Chiltepin, the first known federated numerical weather prediction workflow system within the National Oceanic and Atmospheric Administration (NOAA). Chiltepin is designed to address key challenges in numerical modeling, particularly those related to sustainable progress in a changing NWP landscape characterized by increasing diversity of technologies and use of high-performance computing resources distributed across both geographical and administrative boundaries.

How to cite: Harrop, C. and Jankov, I.: Toward Federated Agentic Workflows for Numerical Weather Prediction With Chiltepin, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-15002, https://doi.org/10.5194/egusphere-egu26-15002, 2026.

EGU26-17077 | Posters on site | ESSI2.7

ARCA: A Scalable and Reproducible AI-Driven Workflow Platform for Climate Change and Natural Hazard Applications 

Maria Mirto, Marco De Carlo, Shahbaz Alvi, Shadi Danhash, Antonio Aloisio, and Paola Nassisi

Earth System Sciences (ESS) are increasingly characterized by large data volumes and high computational demands, which make complex analyses difficult to manage using ad hoc or manual solutions. This challenge is amplified when heterogeneous data sources, such as Internet of Things (IoT) infrastructures including wireless sensor networks, video cameras and drones, must be combined with high-performance computing (HPC) environments for climate modelling and advanced artificial intelligence (AI) algorithms.

The ARCA (Artificial Intelligence Platform to Prevent Climate Change and Natural Hazards) project, funded by the Interreg IPA ADRION Programme, was designed to respond to these challenges by providing a practical, workflow-based platform aimed at supporting climate change and natural hazard applications and, ultimately, reducing their impacts. The main objective of ARCA is to strengthen the cross-border operational capacity of stakeholders across the Adriatic–Ionian region, involving Italy, Croatia, Montenegro, Albania, Serbia and Greece. The platform supports the monitoring of forest ecosystems through AI-based tools, enabling continuous observation of forest areas and the prediction of multiple natural hazards, including droughts, wildfires and windstorms.

ARCA is built on a modular architecture centered on scientific workflows, which orchestrate the ingestion of multiple data types, processing, analysis and AI model execution in a consistent and reproducible manner. The platform integrates big data technologies, workflow management systems and AI components, allowing complex processing chains to be automated while ensuring full traceability of data provenance, computational steps and model configurations. This approach supports the FAIR principles and promotes the reuse of data and workflows across different applications and computing environments.

A key strength of ARCA lies in its ability to shield users from much of the underlying technical complexity, such as heterogeneous computing resources, access constraints and large data volumes, while still enabling scalable AI-driven analyses. As a result, researchers and practitioners can focus on scientific and operational questions related to climate impacts and hazard prevention rather than on low-level technical orchestration. In this contribution, we present the overall ARCA architecture together with selected use cases, illustrating how workflow-based approaches can effectively support scalable, transparent and reproducible ESS research in a multinational and federated context like the Adriatic–Ionian region.

How to cite: Mirto, M., De Carlo, M., Alvi, S., Danhash, S., Aloisio, A., and Nassisi, P.: ARCA: A Scalable and Reproducible AI-Driven Workflow Platform for Climate Change and Natural Hazard Applications, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-17077, https://doi.org/10.5194/egusphere-egu26-17077, 2026.

EGU26-17974 | Orals | ESSI2.7

Multi-target process dispatch on the European Digital Twin of the Ocean  

Stella Valentina Paronuzzi Ticco, Quentin Gaudel, Alain Arnaud, Jerome Gasperi, Mathis Bertin, and Victor Gaubin

The EDITO platform serves as the foundational framework for building the European Digital Twin of the Ocean. It seamlessly integrates oceanographic data and computational processes (non-interactive remote functions that take input and produce output) on a single platform that relies on both cloud and HPC (EuroHPC) resources. In this context, EDITO already provides many processes, such as OceanBench model evaluation and the ML-based GLONET 10-day forecast. To make scientists' work easier, we have developed a new way of generating processes on EDITO. We will use OceanBench evaluation as an example of a process that can be dispatched by the user on multiple targets, seamlessly handling the technical complexity of dealing with different hardware (cloud CPUs/GPUs, HPC, etc.). In our presentation we will explain how EDITO contributors will benefit from this new method of generating processes.   

How to cite: Paronuzzi Ticco, S. V., Gaudel, Q., Arnaud, A., Gasperi, J., Bertin, M., and Gaubin, V.: Multi-target process dispatch on the European Digital Twin of the Ocean, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-17974, https://doi.org/10.5194/egusphere-egu26-17974, 2026.

EGU26-18841 | ECS | Posters on site | ESSI2.7

Efficient large-scale data structuring to support Earth System Science analytics workflows 

Donatello Elia, Gabriele Tramonte, Cosimo Palazzo, Valentina Scardigno, and Paola Nassisi

The amount of data produced by Earth System Models (ESMs) is continuously growing, driven by their higher resolution and complexity. Approaches for efficient data access, management, and analysis are thus needed now more than ever to tackle the challenges related to these large volumes. Moreover, data generated by ESM simulations is often organized in a way that is not the most effective for data analytics, slowing down scientists’ productivity. In this context, novel data formats and proper chunking strategies can significantly speed up access to and processing of Earth system data and, in turn, the whole analysis workflow.

In the scope of ESiWACE3 - the Centre of Excellence in Simulation of Weather and Climate in Europe - we investigated the impact of different data formats and chunking configurations on high-performance data analytics operations and workflows. In particular, we evaluated the performance of the well-known NetCDF format and of the more recent cloud-native Zarr format, which is increasingly used in Earth science data analytics workflows and machine learning applications. Results show that the use of a proper data format and structure can noticeably reduce the time required for executing these analytics workflows, provided the structure is carefully tuned (e.g., chunking).
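
The kind of tuning evaluated can be sketched as follows (file names and chunk sizes are hypothetical): NetCDF output is rewritten as Zarr with chunks matched to the dominant access pattern, for example long time series at few grid points.

```python
import xarray as xr

# Open the NetCDF output lazily, then rechunk so each chunk holds the
# whole time axis for a small spatial tile: point-wise time series then
# read only a handful of chunks. Names and sizes are hypothetical.
ds = xr.open_dataset("esm_output.nc", chunks={})
ds = ds.chunk({"time": -1, "lat": 32, "lon": 32})
ds.to_zarr("esm_output.zarr", mode="w")
```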

This work presents the main outcomes of this evaluation and how we are exploiting this knowledge to enhance Earth system data management workflows. In particular, the results achieved have contributed to enabling more efficient access, delivery and analysis of large-scale data in CMCC’s tools and services, which are involved in different initiatives, including the ICSC - National Centre on High Performance Computing, Big Data and Quantum Computing.

How to cite: Elia, D., Tramonte, G., Palazzo, C., Scardigno, V., and Nassisi, P.: Efficient large-scale data structuring to support Earth System Science analytics workflows, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-18841, https://doi.org/10.5194/egusphere-egu26-18841, 2026.

EGU26-19451 | ECS | Posters on site | ESSI2.7

Making Kilometer-Scale Earth System Model (ESM) simulations usable: A workflow approach from the European Eddy RIch ESMs (EERIE) project

Chathurika Wickramage, Fabian Wachsmann, Jürgen Kröger, Rohith Ghosh, and Matthias Aengenheyster

Kilometer-scale global climate simulations are now generating petabytes of output at such a rapid pace that data production is surpassing data standardization. Central ESM infrastructures have traditionally followed a “data warehouse” approach: extensive preprocessing, quality control, and formatting are performed before users receive self-describing, FAIR-aligned files. While this delivers highly standardized and interoperable products, it also creates a growing bottleneck, computationally and organizationally, so that routine actions like checking variables, extracting a region and time slice, or comparing experiments can become slow and hard to reproduce in practice. The EERIE project (https://eerie-project.eu/about/) is a clear example: its eddy-rich Earth System Models generate detailed and valuable output, but at a scale and pace that overwhelms traditional file-by-file workflows and delays usable access.

At DKRZ, we address this with an end-to-end workflow that transforms raw EERIE model output into analysis-ready datasets (ARD) that are easy to discover, subset, and analyze without requiring users to copy or download terabytes of files. The central element of this workflow is the creation of virtual Zarr datasets of the raw model output received from the modeling groups, by extracting chunk information and storing it in the kerchunk format with VirtualiZarr (https://virtualizarr.readthedocs.io/en/stable/index.html). These native-grid virtual datasets are published through both an intake catalog (https://github.com/eerie-project/intake_catalogues) and a STAC (SpatioTemporal Asset Catalog; https://discover.dkrz.de/external/stac2.cloud.dkrz.de/fastapi/collections/eerie?.language=en) interface, enabling users to examine variables, time periods, regions, etc., and retrieve only the subset they need while the bulk remains in place. Alongside the native model grid, the data are also provided on a common ¼-degree regular grid to facilitate inter-model comparison. Finally, we employ widely used standards and publish standardized products through established climate-data services (ESGF; https://esgf-metagrid.cloud.dkrz.de/search and WDCC; https://www.wdc-climate.de/ui/project?acronym=EERIE). We also aim to publish the processing scripts used throughout the pipeline, enabling others to build on the lessons learned from the EERIE approach.
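
A condensed sketch of this virtualization step, under the assumption of the publicly documented VirtualiZarr and kerchunk APIs (file names are hypothetical):

```python
import xarray as xr
from virtualizarr import open_virtual_dataset

# Extract chunk references only; the raw bytes stay in the original files.
vds = [open_virtual_dataset(f) for f in ["eerie_000.nc", "eerie_001.nc"]]
combined = xr.concat(vds, dim="time", coords="minimal", compat="override")
combined.virtualize.to_kerchunk("eerie_refs.json", format="json")

# Users then open the combined virtual dataset without copying any data.
ds = xr.open_dataset(
    "reference://",
    engine="zarr",
    backend_kwargs={"consolidated": False,
                    "storage_options": {"fo": "eerie_refs.json"}},
)
```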

How to cite: Wickramage, C., Wachsmann, F., Kröger, J., Ghosh, R., and Aengenheyster, M.: Making Kilometer-Scale Earth System Model (ESM) simulations usable: A workflow approach from the European Eddy RIch ESMs (EERIE) project, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-19451, https://doi.org/10.5194/egusphere-egu26-19451, 2026.

EGU26-20269 | Posters on site | ESSI2.7

Federated AI-Cubes: Towards Democratizing Big Earth Datacube Analytics 

Peter Baumann, Dimitar Misev, Bang Pham Huu, and Vlad Merticariu

Datacubes are an acknowledged cornerstone for analysis-ready Big Earth Data, as they allow more intuitive, powerful services than zillions of "scenes". By abstracting from technical pains, they offer two main advantages: for users, access becomes more convenient; for servers, processing can be dynamically optimized, orchestrated, and distributed.
We propose a combination of datacube service enhancements which we consider critical for making data exploitation more open to non-experts and more powerful, summarized as "Federated AI-Cubes": 

  • Location-transparent federation allows users and tools to perceive all datacube assets as a single dataspace, making distributed data fusion a commodity. Instrumental for this is automatic data homogenization performed at import and at query time, based on the open Coverage standards.
  • High-level datacube query languages, such as SQL/MDA and ISO/OGC WCPS, simplify analysis and open up data exploitation to non-programmers (see the sketch after this list). Server-side optimization can automatically generate the individually best distributed workflow for every incoming query. At the same time, queries document workflows without low-level technical garbage, making them reproducible.
  • The seamless integration of AI into datacube analytics, plus AI-assisted query writing, opens up new opportunities for zero-coding exploitation. By not hardwiring a particular model, a platform for easy-to-use model sharing emerges. Model Fencing, a new research direction, aims at enabling the server to estimate the accuracy of ML model inference embedded in datacube queries.
  • Standards-based interoperability allows users to remain in the comfort zone of their well-known clients, from map browsing over QGIS and ArcGIS up to openEO, R, and python frontends.
  • Cloud/edge integration opens up opportunities for seamless federation of data centers with moving data sources, such as satellites, including flexible onboard processing.
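
As an illustration of the query-language point above, a hedged WCPS sketch (the coverage name and endpoint are hypothetical; the request shape follows the OGC WCS ProcessCoverages binding) that averages an NDVI expression over a Sentinel-2 datacube:

```python
import requests

# Hypothetical coverage and endpoint; WCPS syntax per the OGC standard.
# Integer bands may need an explicit cast to float in practice.
wcps = """
for $c in (S2_DATACUBE)
return avg(($c.B08 - $c.B04) / (($c.B08 + $c.B04) * 1.0))
"""
resp = requests.get(
    "https://example.org/rasdaman/ows",
    params={"service": "WCS", "version": "2.0.1",
            "request": "ProcessCoverages", "query": wcps},
    timeout=60,
)
print(resp.text)  # a single averaged value computed server-side
```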

In summary, these capabilities together have potential for empowering non-experts and making experts more productive, ultimately democratizing Big Earth Data exploitation and widening Open Science.
In our talk, we discuss these techniques based on their implementation in the rasdaman Array DBMS, the pioneering datacube engine, which is operational on multi-petabyte global assets contributed by research centers in Europe, the USA, and Asia. We present challenges and results, supported by live demos, many of which are public. Additionally, as editor of the OGC and ISO coverage standards suite, we provide an update on recent progress and future developments.
This research is being co-funded by the European Commission through EFRE projects FAIRgeo and SkyFed.

How to cite: Baumann, P., Misev, D., Pham Huu, B., and Merticariu, V.: Federated AI-Cubes: Towards Democratizing Big Earth Datacube Analytics, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-20269, https://doi.org/10.5194/egusphere-egu26-20269, 2026.

EGU26-21194 | ECS | Posters on site | ESSI2.7

PHENOMENA: a modular HPC model to facilitate automatic high-resolution greenhouse gas emission monitoring 

Carmen Piñero-Megías, Laura Herrero, Artur Viñas, Johanna Gehlen, Luca Rizza, Ivan Lombardich, Oliver Legarreta, Òscar Collado, Paula Camps, Aina Gaya-Àvila, Marc Guevara, Paula Castesana, and Carles Tena

This work presents the sPanisH EmissioN mOnitoring systeM for grEeNhouse gAses (PHENOMENA), a Python-based, open-source, multiscale emission model that computes high-resolution (up to 1 km² and daily) and low-latency greenhouse gas (GHG) emissions for Spain. The system uses a bottom-up approach based on emission factors and activity data, and consists of four modules. First, the downloading module retrieves low-latency activity data from multiple sources, including APIs, open data repositories, websites, and private providers, with error handling and automatic retries to minimize manual intervention. Next, the preprocessing module standardizes the data and applies quality-control checks. The activity data is then combined with emission factors in the calculation module, which covers 11 emission sectors. Finally, the resulting emissions are post-processed to meet the requirements of an open web platform where the results are displayed.
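
The retry behaviour of the downloading module can be illustrated with a generic sketch (function and parameter names are ours, not PHENOMENA's):

```python
import time
import requests

def download_with_retries(url: str, attempts: int = 5, backoff_s: int = 60) -> bytes:
    """Fetch one activity-data resource, retrying transient failures with
    a growing back-off. Illustrative sketch of the described behaviour."""
    for i in range(attempts):
        try:
            resp = requests.get(url, timeout=30)
            resp.raise_for_status()
            return resp.content
        except requests.RequestException:
            if i == attempts - 1:
                raise  # give up after the last attempt
            time.sleep(backoff_s * (i + 1))
```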

PHENOMENA is based on the object-oriented programming (OOP) paradigm and designed to run on High-Performance Computing (HPC) infrastructures. While each of the emission sectors can run in parallel using MPI strategies, it is still not feasible to run all of them at the same time or to download all the activity data at once, as different data providers have different temporal availability. Thanks to the modularity of the system, it can be split into different HPC jobs, using the Autosubmit workflow manager, to handle the heterogeneous data frequencies, increase robustness through automatic retries, run different instances at the same time, and automate monthly uploads to the web portal.

The resulting product is a web app which provides daily 1 km x 1 km gridded emission maps and emission totals aggregated per region and sector. The system's latency is determined by the availability of the activity data from external providers, ranging from daily updates to delays of up to four months.

PHENOMENA allows monitoring low-latency GHG emissions for Spain at high temporal and spatial resolution, providing information in an accessible way to support national to local policymakers. The system is scalable, robust against failures, and easily adaptable to new data providers, regions and emission sectors.

How to cite: Piñero-Megías, C., Herrero, L., Viñas, A., Gehlen, J., Rizza, L., Lombardich, I., Legarreta, O., Collado, Ò., Camps, P., Gaya-Àvila, A., Guevara, M., Castesana, P., and Tena, C.: PHENOMENA: a modular HPC model to facilitate automatic high-resolution greenhouse gas emission monitoring, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-21194, https://doi.org/10.5194/egusphere-egu26-21194, 2026.

EGU26-21804 | ESSI2.7

A unified framework for large-scale, operational data processing in Earth Observation 

R. Hofmeister

The current era of Earth Observation (EO) is marked by an unprecedented increase in data volume and a growing number of satellite missions, driving a transition from dedicated processing infrastructure to cloud-native, distributed, and scalable orchestration. As Earth System Science, industry, and society increasingly rely on near-real-time EO data, efficient processing and workflow management have become critical components of modern ground segments. This presentation introduces an operational framework designed to meet the challenges of large-scale EO data processing. Examples from the Copernicus Sentinel programme and ESA’s Earth Explorer missions illustrate the framework’s scalable cloud deployment and operational performance. Common challenges - such as handling geospatial data formats, managing ground-segment anomalies, ensuring cybersecurity, providing standardized service interfaces, and leveraging public-cloud infrastructure - are addressed through a unified workflow approach. Operational experience from Copernicus payload data ground segment services, including monitoring via dashboards and control procedures, serves as a model for scientific missions and initiatives adopting these proven concepts. Scalability has emerged as a key feature, enabling efficient data transfers for the Copernicus Long-Term Archive, data access for Copernicus services, and higher-level processing workflows for scientific missions like BIOMASS. These orchestration strategies optimize resource use and energy efficiency for on-demand processing. The generic processing concepts demonstrated in the Copernicus and Earth Explorer programmes offer inspiration for new applications within the Earth System Science community, including hybrid approaches that integrate observations and simulation data.

How to cite: Hofmeister, R.:  A unified framework for large-scale, operational data processing in Earth Observation, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-21804, https://doi.org/10.5194/egusphere-egu26-21804, 2026.

EGU26-21909 | ECS | Orals | ESSI2.7

Workflow Modernization for Open and Scalable Access to Operational NWP Data 

Nina Burgdorfer, Christian Kanesan, Victoria Cherkas, Noemi Nellen, Carlos Osuna, Katrin Ehlert, and Oliver Fuhrer

Operational Numerical Weather Prediction (NWP) workflows are increasingly challenged by rapidly growing data volumes, expanding product diversity, and the need for timely and scalable access to model data. At the same time, modern Earth system services are evolving toward open data policies that require not only standardized access to model output for internal and external users, but also flexible mechanisms to extract and process relevant information in a FAIR (Findable, Accessible, Interoperable, and Reusable) manner. In this context, MeteoSwiss, in collaboration with the European Centre for Medium-Range Weather Forecasts (ECMWF), is developing a modernized data workflow to improve access to NWP model data for internal and external downstream users. 

The redesigned workflow shifts from a product-centric dissemination model toward a scalable data-as-a-service approach. Rather than relying on the generation and distribution of numerous predefined products, recent ICON forecast output is organized in the Field Database (FDB) and exposed through Polytope, which provides semantic data access and feature extraction capabilities. The workflow automates the ingestion, indexing, access control, and on-demand extraction of forecast fields, and integrates these steps into existing HPC-based production workflows and downstream processing pipelines. By replacing file-based product generation with database-backed access, the workflow enables deterministic data extraction, explicit provenance tracking, and consistent versioning of datasets, so that identical data requests can be reproduced reliably across time and environments. We present recent developments in Earthkit and Polytope that, for the first time, enable such automated workflows on the icosahedral grids used by ICON. Standardized interfaces and modern processing tools from the Earthkit Python ecosystem enable downstream users and applications to retrieve and process tailored subsets of NWP data on demand. 
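
As a hedged sketch of this access pattern (the collection name, request keys, and coordinates below are placeholders, not the operational MeteoSwiss/ECMWF configuration), a downstream user could request a tailored subset in Python via earthkit-data's Polytope source:

    import earthkit.data

    # Hypothetical semantic request with Polytope feature extraction:
    # instead of downloading whole fields, only a time series at one
    # point is extracted server-side. All keys are illustrative.
    request = {
        "class": "od",
        "stream": "oper",
        "date": "20260101",
        "time": "0000",
        "step": "0/to/24/by/3",
        "param": "t2m",
        "feature": {
            "type": "timeseries",
            "points": [[46.95, 7.45]],  # example location (Bern)
        },
    }

    ds = earthkit.data.from_source("polytope", "ecmwf-mars", request)
    print(ds)  # continue with earthkit/xarray tooling as appropriate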

Our use of open-source, community-developed software (FDB, Polytope, Earthkit) as core workflow components illustrates how ECMWF technologies can be integrated into national weather service environments. Operational experience gained in this context contributes to improving the maturity and usability of these tools and supports their broader adoption by other ECMWF Member States, facilitating the transfer of FAIR, workflow-based data access concepts across the weather and climate community. 

How to cite: Burgdorfer, N., Kanesan, C., Cherkas, V., Nellen, N., Osuna, C., Ehlert, K., and Fuhrer, O.: Workflow Modernization for Open and Scalable Access to Operational NWP Data, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-21909, https://doi.org/10.5194/egusphere-egu26-21909, 2026.

EGU26-22151 | ECS | Posters on site | ESSI2.7

Earthscope Seafloor Geodesy Tools 

Franklyn Dunbar, Mike Gottlieb, Rachel Akie, and David Mencin

Earth System Science increasingly depends on scalable, reproducible computational workflows to manage complex data processing across heterogeneous environments and cloud infrastructure. In seafloor geodesy — a domain where high-resolution geodetic time series and acoustic ranging techniques are essential for understanding submarine tectonic and deformation processes — the need for robust, automated tooling is acute. We present Earthscope Seafloor Geodesy Tools, an open-source Python library developed by the Earthscope consortium that supports preprocessing and GNSS-Acoustic (GNSS-A) processing workflows for seafloor geodesy data collected via autonomous wave glider platforms.
Earthscope Seafloor Geodesy Tools provides modular utilities to translate, organize, validate, and prepare raw observational data for integration with GNSS-A positional solver inversion software (e.g., GARPOS), enabling reproducible data pipelines within research and operational contexts. By encapsulating domain-specific processing steps into composable components, the library enables workflow orchestration, large-scale data processing across environments (i.e., local vs. remote), and reproducibility of results.
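
The translate-validate-prepare pattern can be illustrated schematically (the class, field names, thresholds, and solver input schema below are hypothetical, not the library's actual API):

    from dataclasses import dataclass

    @dataclass
    class AcousticRecord:
        time: float            # ping time (s)
        travel_time: float     # two-way acoustic travel time (s)
        antenna_xyz: tuple     # GNSS-derived wave glider antenna position

    def validate(records):
        # Drop physically implausible observations before inversion.
        return [r for r in records if 0.0 < r.travel_time < 30.0]

    def to_solver_input(records):
        # Reshape validated records into a tabular form for a GNSS-A
        # positional solver such as GARPOS (the schema is assumed).
        return [(r.time, r.travel_time, *r.antenna_xyz) for r in records]

    records = [AcousticRecord(0.0, 2.31, (10.0, -4.2, 1.5)),
               AcousticRecord(1.0, -1.0, (10.1, -4.1, 1.5))]  # 2nd is invalid
    rows = to_solver_input(validate(records))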

How to cite: Dunbar, F., Gottlieb, M., Akie, R., and Mencin, D.: Earthscope Seafloor Geodesy Tools, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-22151, https://doi.org/10.5194/egusphere-egu26-22151, 2026.

EGU26-624 | ECS | PICO | ESSI2.8

Accessible Recovery of Vintage Seismic Sections: From Paper to Migrated and Depth-Converted Data 

Alejandro Pertuz, Mª Isabel Benito, Pablo Suarez-Gonzalez, Pilar Llanes, and Martín García-Martín

In many regions, scanned images of vintage seismic reflection surveys constitute the only public subsurface information available, as the original data was frequently lost due to inadequate archival practices. Despite their age, these sections remain invaluable for geological studies; however, image files alone cannot be properly used in modern interpretation or processing software. To allow their scientific reuse, it is necessary to recover the original amplitude data through digitization: converting these images into the standard SEG-Y format. Current accessibility obstacles limit the widespread application of this process, as most existing tools are proprietary, unmaintained, or lack the precision required for advanced processing. Furthermore, digitization alone is insufficient; the full recovery workflow requires velocity model reconstruction and seismic post-stack processing to transform the digitized sections into data ready for interpretation.

To address these challenges, a new open-source digitization program and two complementary tools have been developed in Python: SEGYRecover, VelRecover and InSeis. SEGYRecover digitizes images of seismic sections by converting them into standard SEG-Y files. Built on established digitization methods, the program extracts the positive and negative amplitudes of individual traces plotted in variable-area wiggle displays and implements interpolation of clipped traces and smoothing of the final waveform. It also incorporates topography muting, trace mixing and gain correction to improve the visual consistency of the results. The resulting quality and fidelity of the digitized SEG-Y files are sufficient for advanced seismic analysis, including seismic attribute extraction, pseudo-relief generation, post-stack migration and integration with modern interpretation software. VelRecover enables the interpolation and visual editing of continuous velocity models from the sparse velocity analyses typically printed as headers alongside vintage seismic sections. These velocity models are suitable for depth conversion and post-stack migration. InSeis provides a graphical interface for building post-stack processing workflows using Seismic Unix through the Windows Subsystem for Linux, allowing users to apply deconvolution, post-stack migration, frequency filtering and other enhancements without command-line operations. The recovery of vintage seismic sections using this toolkit allows for the reinterpretation and detailed study of regions in ways that were not possible with scanned sections alone.
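
The final step of the digitization stage, writing extracted amplitudes to SEG-Y, can be sketched with the open-source segyio library (a generic illustration with random placeholder data and an assumed 4 ms sampling interval, not SEGYRecover's actual code):

    import numpy as np
    import segyio

    n_traces, n_samples, dt_us = 200, 1000, 4000   # sizes are placeholders
    amplitudes = np.random.randn(n_traces, n_samples).astype(np.float32)

    spec = segyio.spec()
    spec.format = 5                 # IEEE floating point
    spec.samples = range(n_samples)
    spec.tracecount = n_traces      # unstructured 2D line

    with segyio.create("digitized.sgy", spec) as f:
        for i in range(n_traces):
            f.trace[i] = amplitudes[i]       # one digitized trace per column
        f.bin[segyio.BinField.Interval] = dt_us  # sample interval (microsec)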

Developed following FAIR principles and released under a GNU GPL v3.0 license, all three programs are freely available and fully documented via GitHub, Zenodo, and PyPI. This makes the digitization and processing of vintage seismic sections completely accessible to the global geoscience community.

How to cite: Pertuz, A., Benito, M. I., Suarez-Gonzalez, P., Llanes, P., and García-Martín, M.: Accessible Recovery of Vintage Seismic Sections: From Paper to Migrated and Depth-Converted Data, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-624, https://doi.org/10.5194/egusphere-egu26-624, 2026.

EGU26-1798 | ECS | PICO | ESSI2.8

Browzarr–Interactive Viewing and Inference of Multidimensional Datasets 

Jeran Poehls, Lazaro Alonso, and Nuno Carvalhais

The scale and complexity of multidimensional scientific data, particularly in the Earth sciences, necessitate distilling all that information into a palatable visual form. This process is most efficient when visualizing and interacting with the data in its native higher-dimensional form. Despite their inherent 3D and 4D structure, these data are frequently reduced to static 2D plots or animated sequences, obscuring critical spatial relationships, temporal dynamics, and emergent patterns.
To date, 3D or 4D visualization has been largely confined to standalone applications or niche GPU-powered libraries. These options provide powerful capabilities but require significant software installation, specialized workflows, and domain-specific expertise, creating a high barrier to entry that deters many researchers. 

We introduce Browzarr, an open-source framework designed to facilitate convenient multidimensional data exploration from any web-connected device. With native support for Zarr and NetCDF, users can immediately dive into their data with no additional configuration, installation, or dependencies. A modular architecture and open-source design ensure adaptability to evolving research needs, enabling seamless integration with emerging data formats, analytical workflows, and user-driven extensions.
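
For context, a cube that such a viewer can stream is simply an analysis-ready Zarr store; producing one takes a few lines of Python (synthetic data and arbitrary names below; this is generic tooling, unrelated to Browzarr's own code):

    import numpy as np
    import xarray as xr

    ds = xr.Dataset(
        {"t2m": (("time", "lat", "lon"),
                 np.random.rand(12, 90, 180).astype("float32"))},
        coords={"time": np.arange(12),
                "lat": np.linspace(-89, 89, 90),
                "lon": np.linspace(-179, 179, 180)},
    )
    ds.to_zarr("demo_cube.zarr", mode="w")  # chunked store servable to a browser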

How to cite: Poehls, J., Alonso, L., and Carvalhais, N.: Browzarr–Interactive Viewing and Inference of Multidimensional Datasets, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-1798, https://doi.org/10.5194/egusphere-egu26-1798, 2026.

EGU26-2544 | PICO | ESSI2.8

Integrated Workflow for Earthquake Stress Modelling and Seismicity Analysis Using Coulomb 4.0 and the ISC Earthquake Toolbox 

Kostas Leptokaropoulos, Shinji Toda, Tom Garth, Kaede Yoshizawa, Ross Stein, Ryan Gallacher, Volkan Sevilgen, and Jian Lin

We introduce an integrated workflow in MATLAB that combines Coulomb 4.0, a major revision of the widely used Coulomb stress-interaction and deformation application, with the ISC Earthquake Toolbox, which provides direct access to the International Seismological Centre (ISC) Bulletin. This interoperability enables researchers to seamlessly transition from global earthquake data acquisition to stress interaction analysis within a single environment.

The workflow begins by querying and importing earthquake catalogs from the ISC Bulletin using the toolbox’s GUI, allowing selection by time, region, depth and magnitude. These events can then be visualized in 3D and cross-section views, and their parametric data, including moment tensors, are used to define fault geometries in Coulomb 4.0. New Coulomb features, such as automatic fault parameter scaling from magnitude and interactive fault editing, streamline the setup of rupture planes based on ISC-reported events. Stress transfer calculations and deformation modelling can then be performed, with results displayed alongside seismicity overlays for comprehensive interpretation.
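
The quantity at the heart of such stress transfer calculations is the standard Coulomb failure stress change, dCFF = d_tau + mu' * d_sigma_n, with mu' the effective friction coefficient and unclamping taken as positive. A self-contained numerical illustration (in Python for brevity; the stress tensor, fault geometry, and mu' = 0.4 below are arbitrary example values, not Coulomb 4.0 output):

    import numpy as np

    d_sigma = np.array([[0.12, 0.03, 0.00],
                        [0.03, -0.05, 0.02],
                        [0.00, 0.02, -0.01]])  # stress change (MPa), tension positive
    n = np.array([0.0, np.sin(np.radians(60)), np.cos(np.radians(60))])  # unit fault normal
    s = np.array([1.0, 0.0, 0.0])                                        # unit slip direction

    traction = d_sigma @ n
    d_sigma_n = traction @ n   # normal stress change on the receiver plane
    d_tau = traction @ s       # shear stress change resolved in the slip direction
    d_cff = d_tau + 0.4 * d_sigma_n  # positive values promote failure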

This combined approach enhances reproducibility and efficiency by eliminating manual data handling and enabling dynamic visualization of both seismicity and modelled stress/deformation changes. We demonstrate the workflow using recent seismic sequences, highlighting its potential for earthquake interaction studies, hazard assessment, and educational applications. By bridging global seismic data with advanced stress modelling, this interoperability represents a significant step toward integrated geoscience software ecosystems.

The ISC Earthquake Toolbox can be freely accessed from:

  • GitHub (https://github.com/tomgarth/ISC_Earthquake_Toolbox) and
  • File Exchange (https://www.mathworks.com/matlabcentral/fileexchange/167786-isc-earthquake-toolbox?s_tid=srchtitle)

Coulomb 4.0 can be freely accessed from:

  • GitHub (https://github.com/YoshKae/Coulomb_ver4) and
  • temblor.net/coulomb/.

How to cite: Leptokaropoulos, K., Toda, S., Garth, T., Yoshizawa, K., Stein, R., Gallacher, R., Sevilgen, V., and Lin, J.: Integrated Workflow for Earthquake Stress Modelling and Seismicity Analysis Using Coulomb 4.0 and the ISC Earthquake Toolbox, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-2544, https://doi.org/10.5194/egusphere-egu26-2544, 2026.

EGU26-2634 | PICO | ESSI2.8

dMODELS: An Open-Source, Modular MATLAB Environment for Geodetic Deformation Analysis 

Maurizio Battaglia and Marco Bagnardi

Ground deformation can arise from tectonic and volcanic processes as well as from human activities, such as subsurface fluid withdrawal. Mathematical models describing crustal deformation in response to these processes are essential for characterizing driving mechanisms, constraining source location, size, orientation, and volume change. Models provide critical information for hazard forecasting and mitigation, assessing anthropogenic environmental impacts, land-use planning, and related applications.

In this context, analytical kinematic models remain essential tools for the rapid interpretation of deformation, particularly in operational and time-sensitive settings.

dMODELS is an open-source MATLAB environment designed primarily to model and interpret crustal deformation associated with volcanic activity and active fault systems by non-linear inversion of GNSS, InSAR, and tilt observations. The software consolidates a suite of analytical kinematic source models into a single, end-to-end framework that is modular (pre-processing → inversion → post-processing), consistent (standardized formulations across models), transparent (fully documented scripts with examples), and cross-platform (Windows and Linux). Although most analytical formulations originate from established literature, several equations have been verified, reformulated, standardized to ensure internal consistency, and validated against corresponding finite-element solutions.

The platform runs on Windows and Linux systems and is structured to support end-to-end modeling workflows, including: (a) preprocessing tools for data selection and formatting, (b) non-linear inversion routines for estimating source parameters and associated uncertainties, and (c) post-processing utilities for generating publication-ready figures. Each module is accompanied by examples and documentation, and a full user manual is being released by the U.S. Geological Survey.

The deformation sources implemented in dMODELS are kinematic representations, including pressurized cavities (spherical, spheroidal, or penny-shaped) and planar dislocations embedded in a homogeneous, isotropic elastic half-space. These constructs do not represent physical reservoirs directly but approximate the stress and strain fields produced by real subsurface processes. As such, dMODELS allows users to constrain source geometry, location, volume change, and stress distribution, although total reservoir volume and fluid properties remain unresolved.
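
In the far-field point-source limit, a pressurized spherical cavity reduces to the textbook Mogi (1958) source, whose surface displacements illustrate the class of analytical kinematic models involved (a Python sketch for concreteness, not dMODELS code; depth, volume change, and Poisson's ratio below are example values):

    import numpy as np

    def mogi_surface_displacement(r, depth, dV, nu=0.25):
        # Radial and vertical surface displacement at radial distance r
        # from a point pressure source at the given depth with volume
        # change dV, in a homogeneous, isotropic elastic half-space.
        R3 = (r**2 + depth**2) ** 1.5
        coeff = (1.0 - nu) * dV / np.pi
        return coeff * r / R3, coeff * depth / R3  # (u_r, u_z)

    r = np.linspace(0.0, 10e3, 6)                      # profile out to 10 km
    u_r, u_z = mogi_surface_displacement(r, 4e3, 1e6)  # 10^6 m^3 at 4 km depth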

Despite the inherent simplifications of analytical models, their rigorous use, combined with high-quality geodetic datasets, provides powerful insights into active deformation sources and supports both research and monitoring applications. By making robust, reliable, and independently verified modeling tools readily accessible, dMODELS supports reproducible analyses and enables their use by a broader scientific and operational community.

How to cite: Battaglia, M. and Bagnardi, M.: dMODELS: An Open-Source, Modular MATLAB Environment for Geodetic Deformation Analysis, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-2634, https://doi.org/10.5194/egusphere-egu26-2634, 2026.

EGU26-2942 | ECS | PICO | ESSI2.8

Developing a User-Friendly Tsunami Risk Assessment Tool for Ports: Application to Cilegon Port, Indonesia 

Raquel Felix, Constance Chua, Anawat Suppasri, Ignatius Ryan Pranantyo, and Endra Gunawan

More than 80% of the world’s international trade is conducted via maritime transport, with ports serving as critical gateways for the transfer of goods between sea and land. In this research, we introduce a user-friendly tsunami risk analysis application for ports, developed with a graphical user interface. The application utilises existing published fragility curves (for structural damage, recovery potential for production capacity, and physical loss estimation) to perform the analyses and generate a risk report. The required inputs in this application are a tsunami inundation map and a shapefile containing the polygonal structures of the port, with an attribute table that includes the industry type identification. The main output of the application is a PDF report containing the probability distribution results across different tsunami inundation depth ranges, presented through inundation maps, summary tables, and bar plots. All raw image files used in the report, as well as the raw calculations in CSV format, are also included as part of the output. Preliminary testing of the application has been conducted to forecast tsunami impacts at Cilegon Port in West Java, Indonesia, under a worst-case scenario involving a Mw 8.9 earthquake from a rupture along the Sunda Megathrust, located southwest of the Sunda Strait. Cilegon Port lies on the north-western coast of Java Island, facing the strait. We will present the latest progress in developing our risk assessment application. This research is funded by The European Commission (the Horizon Europe scheme) and the UK Research and Innovation (EPSRC contract: EP/Z001080/1).
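
The underlying calculation follows the usual lognormal fragility-curve form, P(damage | depth) = Phi((ln(depth) - ln(median)) / beta). A minimal sketch (the median and dispersion below are illustrative placeholders, not the published curves used by the application):

    import numpy as np
    from scipy.stats import norm

    median_m, beta = 2.0, 0.6  # assumed fragility parameters

    def damage_probability(depth_m):
        # Probability of reaching a damage state at a given inundation depth.
        return norm.cdf(np.log(np.asarray(depth_m, dtype=float) / median_m) / beta)

    print(damage_probability([0.5, 1.0, 2.0, 4.0]))  # probability rises with depth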

How to cite: Felix, R., Chua, C., Suppasri, A., Pranantyo, I. R., and Gunawan, E.: Developing a User-Friendly Tsunami Risk Assessment Tool for Ports: Application to Cilegon Port, Indonesia, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-2942, https://doi.org/10.5194/egusphere-egu26-2942, 2026.

EGU26-2972 | PICO | ESSI2.8

SHAppE After One Year: Community Feedback Driving Seismic Hazard Analysis Forward  

Fiona Zarodova, Andrew Redfearn, and Kostas Leptokaropoulos

SHAppE (Seismic HAzard Parameters Evaluation app) is a MATLAB-based App for time-dependent probabilistic seismic hazard analysis, designed to simplify data access, analysis and visualization workflows. Since its release in April 2025, SHAppE has been well received by researchers and educators, providing an intuitive interface for complex hazard evaluations without requiring advanced programming skills. 

While the initial version was fully functional, community feedback has been invaluable in refining the app. Over the past year, more than 50 issues (including bug fixes, usability improvements, and feature enhancements) were addressed, many reported directly by users. This collaborative process has led to significant upgrades, such as improved data selection workflows, expanded parameter set options, and the ability to extract the complete set of custom filters and applied parameters, further strengthening reproducibility.

SHAppE also integrates with external sources such as the ISC Earthquake Toolbox for MATLAB, enabling direct access to global earthquake bulletins without significant preprocessing. Community contributions have been critical to the app's improvement, and we encourage continued feedback to drive future development.

SHAppE is freely available via: 

  • GitHub (https://github.com/mathworks/Seismic-HAzard-Parameters-Evaluation-Interface-SHAppE) and  
  • File Exchange (https://www.mathworks.com/matlabcentral/fileexchange/180879-shappe-seismic-hazard-parameters-evaluation-interface) 


How to cite: Zarodova, F., Redfearn, A., and Leptokaropoulos, K.: SHAppE After One Year: Community Feedback Driving Seismic Hazard Analysis Forward , EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-2972, https://doi.org/10.5194/egusphere-egu26-2972, 2026.

EGU26-3618 | ECS | PICO | ESSI2.8

Robust Software for the Modeling of Spatial Random Fields across Geoscience Disciplines 

Olivia L. Walbert, Frederik J. Simons, Arthur P. Guillaumin, and Sofia C. Olhede

We have developed theory, algorithmic tools, and two software suites (written in MATLAB and Python) that are openly available for use by the broad geosciences community for the statistical characterization of spatial datasets as finite, discrete random fields. Our software implements robust statistical methods that we have formulated for the simulation and estimation of stationary, isotropic random fields on a potentially only partially observed grid, within the Matérn class of parametric covariance functions. Parametric covariance models characterize the second-order structure of random fields by quantifying their shape through parameters for the amplitude, smoothness, and correlation length. Our tools allow for the analytical calculation of parameter uncertainty for modeled random fields; this uncertainty depends upon the parametric model and the sampling grid, agnostic of the data itself, allowing for the exploration of experimental design. Our software includes a plethora of visualization tools for studying spatial random fields and their sampling grids, including for interrogating the fit of a maximum-likelihood model (and its assumptions) to observed data. Our methodology is readily applicable for use by scientists from broad disciplines who work with (geo)spatial (ir)regularly gridded datasets.

We will present a workflow of our software to demonstrate through visualization the simulation, estimation, and analysis of spatial random fields. A typical modeling procedure for geoscientific applications involves spatial gridded data taken to be stationary, isotropic random fields under the null hypothesis. A single inversion routine estimates the Matérn covariance parameters by optimizing the spectral-domain debiased Whittle likelihood, which involves the comparison between the modified periodogram and the parametric spectral density blurred by the effects of the observation window. We interpret the quality of our estimate (1) by simulating additional realizations through a simulation routine that includes a circulant embedding approach, (2) by evaluating the goodness-of-fit of the model and its assumptions through multiple graphical- and test-statistic-based examinations of the model residuals, and (3) by quantifying parameter uncertainty through calculation of the parameter covariance from first principles, for which we have designed different implementations depending on the available hardware (prioritizing memory or speed). We provide documentation for multiple well-studied simulation, inversion, and analysis options with default functionality, version control, and extensive demos designed to familiarize users with not only the implementation of our tools, but also the underlying theory and its implications for their data. We share select case studies using real data that we hope will illuminate and inspire future applications, and provide a guide to our software.
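
For reference, the Matérn covariance in one common parameterization can be evaluated directly (conventions vary, and the suites' exact definition may differ; the parameter values below are arbitrary):

    import numpy as np
    from scipy.special import gamma, kv

    def matern_covariance(r, sigma2=1.0, rho=1.0, nu=1.5):
        # Covariance as a function of distance r, with amplitude sigma2,
        # correlation length rho, and smoothness nu.
        r = np.asarray(r, dtype=float)
        c = np.full_like(r, sigma2)          # C(0) = sigma^2
        nz = r > 0
        z = np.sqrt(2 * nu) * r[nz] / rho
        c[nz] = sigma2 * (2 ** (1 - nu) / gamma(nu)) * z**nu * kv(nu, z)
        return c

    print(matern_covariance(np.array([0.0, 0.5, 1.0, 2.0])))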

Our open-source software is available on GitHub, and includes the MATLAB repositories github.com/csdms-contrib/slepian_juliet and github.com/csdms-contrib/slepian_lima, and the DSWL Python package github.com/arthurBarthe/debiased-spatial-whittle, which is in revision with the Journal of Open Source Software.

How to cite: Walbert, O. L., Simons, F. J., Guillaumin, A. P., and Olhede, S. C.: Robust Software for the Modeling of Spatial Random Fields across Geoscience Disciplines, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-3618, https://doi.org/10.5194/egusphere-egu26-3618, 2026.

EGU26-5309 | PICO | ESSI2.8

A User-Centric Software Infrastructure for Geoscience Data Using IPFS and FAIR Digital Objects 

Marco Kulüke, Ivonne Anders, Karsten Peters-von Gehlen, Carsten Ehbrecht, Kameswar Rao Modali, and Hannes Thiemann

Climate and geoscience research increasingly relies on complex infrastructures and software to access, analyse, and reuse large and heterogeneous datasets. However, researchers often face fragmented data access, limited interoperability between platforms, and high entry barriers to cross-disciplinary data reuse. This conference contribution presents a user-centric infrastructure concept that combines the InterPlanetary File System (IPFS) software with the FAIR Digital Objects (FDO) standard to address these challenges and support intuitive research workflows.

At the core of the approach is the representation of geoscientific datasets as FAIR Digital Objects that bundle data and metadata into persistent and interoperable entities. From a user perspective, FDOs provide identifiers and provenance information that enable consistent discovery, access, and reuse of data across platforms and disciplines. Within this framework, IPFS acts as infrastructure software, providing a robust, decentralized, peer-to-peer, content-addressable file-sharing system that ensures data integrity, redundancy, and long-term accessibility, while being abstracted behind user-facing interfaces and workflows.
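
A minimal sketch of content-addressed ingestion against a local IPFS daemon's HTTP API (default port 5001; the file name and the follow-on FDO bookkeeping are placeholders):

    import requests

    with open("dataset.nc", "rb") as fh:
        resp = requests.post("http://127.0.0.1:5001/api/v0/add",
                             files={"file": fh})
    cid = resp.json()["Hash"]  # content identifier derived from the bytes

    # The CID can then be recorded in the FDO's metadata record so that
    # any replica of the dataset can be located and verified by content.
    print("CID:", cid)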

This infrastructure concept is illustrated through a user-driven test case derived from the ORCESTRA (Organized Convection and EarthCARE Studies over the Tropical Atlantic) campaign. ORCESTRA integrates satellite observations, airborne measurements, ground-based instrumentation, and climate model simulations, reflecting a wide variety of data sizes and types. User stories obtained from the campaign, such as comparing data from multiple sources, guided the design of the infrastructure concept. A demonstration shows how selected datasets were ingested into IPFS and exposed through an FDO-compliant catalogue, enabling unified access and seamless reuse across tools and platforms.

The presented test case illustrates how a user-driven IPFS-based software approach, together with the multidisciplinary FDO metadata standard, can be operationalized to enhance transparency, reproducibility, and hence, sustainability in geoscience research. By supporting interoperable and machine-actionable research assets, this infrastructure concept contributes to a more robust and future-ready geoscience software ecosystem. Beyond geoscience, this approach is transferable to other domains facing similar challenges in data-intensive, multi-instrument, and multi-model environments.

How to cite: Kulüke, M., Anders, I., Peters-von Gehlen, K., Ehbrecht, C., Modali, K. R., and Thiemann, H.: A User-Centric Software Infrastructure for Geoscience Data Using IPFS and FAIR Digital Objects, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-5309, https://doi.org/10.5194/egusphere-egu26-5309, 2026.

EGU26-6642 | ECS | PICO | ESSI2.8

Replicability testing and scientific skill quantification in Earth System Models with pyhanami 

Marta Alerany Solé, Kai Keller, Chihiro Kodama, Masuo Nakano, Tomoe Nasuno, Daisuke Takasuka, and Mario Acosta

As climate models advance toward higher resolutions, they become increasingly capable of resolving key Earth system processes, which in turn raises the need for robust and quantitative evaluation methods. In response to this challenge, we present pyhanami, an open-source Python package developed within the HANAMI project to assess the replicability and scientific skill of Earth System Models (ESMs) using statistical testing and objective, scalar-based metrics. In addition, to facilitate the practical application of these evaluations, pyhanami features a structured data interface that efficiently loads and inspects compatible model outputs.

An ESM is considered replicable if the same experiment run on different computing environments or with different compilers produces equivalent results, i.e., represents the same climate. This ensures that differences between simulations reflect only the intended scientific changes in the model setup. Because bit-for-bit replicability is often unattainable across environments due to the chaotic nature of climate models, our practical goal is to achieve statistical indistinguishability. Building on existing methodologies, pyhanami provides an ensemble-based replicability test that combines multiple statistical tests and metrics to determine whether two simulated ensembles are statistically indistinguishable, as described in Keller et al. (2025; doi.org/10.5194/gmd-18-10221-2025). To the best of our knowledge, automated and standardized replicability assessment is not currently supported in model evaluation tools, despite its importance for climate model development, validation, intercomparison, and porting.
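
A deliberately simplified illustration of the idea, not pyhanami's actual procedure (which combines several tests and metrics; see Keller et al., 2025): test whether two toy ensembles of a scalar diagnostic could come from the same distribution.

    import numpy as np
    from scipy.stats import ks_2samp

    rng = np.random.default_rng(0)
    # Two ensembles of, e.g., a global-mean temperature diagnostic,
    # standing in for runs on two different computing environments.
    ens_a = rng.normal(287.50, 0.12, size=30)
    ens_b = rng.normal(287.52, 0.12, size=30)

    stat, p = ks_2samp(ens_a, ens_b)  # null: same underlying distribution
    # Failure to reject (large p) supports statistical indistinguishability.
    print(f"KS statistic = {stat:.3f}, p-value = {p:.3f}")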

Complementing replicability, the scientific skill of an ESM describes its ability to accurately reproduce observed features of the climate system, from regional patterns to large-scale teleconnections. Many existing tools to evaluate this skill rely on visualization-based diagnostics, which often require expert knowledge and can be biased by subjective interpretation. In contrast, scalar metrics and scores provide quantitative and comparable measures of scientific skill, which are essential for interpreting climate projections, guiding model development, and model intercomparison. However, diagnostics for physical processes that require km-scale, high-resolution global climate models to be properly resolved remain underrepresented in state-of-the-art diagnostic suites. Although several metrics have been proposed for such small-scale processes, many lack standardized and widely available implementations. As high-resolution climate simulations become more common, the demand for objective diagnostics to support model tuning and improvement is increasing. pyhanami addresses this need by providing a growing set of scalar scientific skill metrics that enable quantitative and easily interpretable evaluation of phenomena such as Tropical Cyclones and the Tropical Intraseasonal Oscillation (ISO), including the Madden-Julian Oscillation and the Boreal Summer ISO modes. 

By integrating replicability testing, scientific skill metrics, and visualization tools into a single, self-contained package with a generic data interface, pyhanami streamlines evaluation workflows and supports the development of reliable climate projections, advancing the quality and reproducibility of geosciences research.

How to cite: Alerany Solé, M., Keller, K., Kodama, C., Nakano, M., Nasuno, T., Takasuka, D., and Acosta, M.: Replicability testing and scientific skill quantification in Earth System Models with pyhanami, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-6642, https://doi.org/10.5194/egusphere-egu26-6642, 2026.

EGU26-7704 | PICO | ESSI2.8

Firedrake - automated, differentiable building blocks for geoscientific simulation 

David Ham, Connor Ward, Pablo Brubeck, Joshua Hope-Collins, and Leo Collins

Computer simulations of continuous processes described by partial differential equations are a bedrock of geoscientific simulation. Each simulation is a complex composition of equations, discretisations, solvers and parameterisations. Realistic geoscientific simulation also depends on the integration of observed data, either as forcing functions or through data assimilation. The result of this complexity is that creating new models, or even extending existing ones, can often be exceptionally resource-intensive, even for large and highly capable institutions.
 
Firedrake (https://www.firedrakeproject.org/) offers a radically different approach to model creation. Rather than coding the implementation of a model in low-level code in a compiled language, Firedrake users write the mathematical formulation of their model in high-level Python. The high-performance, parallel implementation of that code is then automatically generated and executed; a minimal example follows the list below. Users have access to:
 
  • A huge range of finite element discretisations for any PDE they choose, including generalisations of the various variable staggerings that are typically used across the geosciences.
  • Programmable, composable solvers and preconditioners, including algebraic and geometric multigrid approaches, and physics-based preconditioners based on the characteristics of the system being solved.
  • Seamless coupling to external processes, including the ML frameworks JAX and PyTorch.
  • Fully automated adjoint computations: the adjoint to a Firedrake simulation is available with no additional coding required.
  • Integration with optimisation algorithms for data assimilation.
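
The flavour of this programming model is captured by a minimal Poisson example (freely adapted, not verbatim from the Firedrake documentation):

    from firedrake import *

    # Solve -div(grad(u)) = f on the unit square with homogeneous
    # Dirichlet boundary conditions, written in its weak form.
    mesh = UnitSquareMesh(32, 32)
    V = FunctionSpace(mesh, "CG", 1)

    u, v = TrialFunction(V), TestFunction(V)
    x, y = SpatialCoordinate(mesh)
    f = Function(V)
    f.interpolate(sin(pi * x) * sin(pi * y))

    a = inner(grad(u), grad(v)) * dx
    L = f * v * dx

    uh = Function(V)
    solve(a == L, uh, bcs=DirichletBC(V, 0.0, "on_boundary"))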
 
Firedrake already provides the basis for:
  • The GUSTO toolkit, used for dynamical core development research at the Met Office and University of Exeter (https://www.firedrakeproject.org/gusto/).
  • The Thetis coastal ocean model (Kärnä et al., 2018)
  • G-ADOPT: The Geoscientific ADjoint Optimisation PlaTform for mantle convection and glacial isostatic adjustment from the Australian National University (Ghelichkhan et al 2024). 
as well as hundreds of bespoke simulations by users around the world.
 
This PICO will present the key features of Firedrake and illustrate the applications to which it is put.
 
References
 
Ghelichkhan, Sia, et al. "Automatic adjoint-based inversion schemes for geodynamics: reconstructing the evolution of Earth's mantle in space and time." Geoscientific Model Development 17.13 (2024): 5057-5086.
Kärnä, Tuomas, et al. "Thetis coastal ocean model: discontinuous Galerkin discretization for the three-dimensional hydrostatic equations." Geoscientific Model Development 11.11 (2018): 4359-4382.

 

How to cite: Ham, D., Ward, C., Brubeck, P., Hope-Collins, J., and Collins, L.: Firedrake - automated, differentiable building blocks for geoscientific simulation, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-7704, https://doi.org/10.5194/egusphere-egu26-7704, 2026.

EGU26-7901 | PICO | ESSI2.8

An enhanced version of the NESTORE software for strong aftershock forecasting 

Stefania Gentili, Letizia Caravella, and Giuseppe Davide Chiappetta

In this work, we present a new and improved version of the NExt STrOng Related Earthquake (NESTORE) software, originally released as NESTOREv1.0 and publicly available as a MATLAB-based package. The original version of NESTORE was specifically designed to forecast the occurrence of strong aftershocks in the first few hours following a mainshock, providing a valuable tool for short-term seismic hazard assessment.

The newly developed version introduces several methodological and computational improvements aimed at increasing the robustness and reliability of the forecasting framework. Among the main upgrades is the integration of the REPENESE (RElevant features PErcentage class weighting NEighborhood detection SElection) algorithm, an advanced outlier detection method explicitly designed to handle class imbalance and skewed datasets, which are characteristic of the seismicity features we used. This integration enables a more effective identification and treatment of anomalous events, thereby improving classifier performance.

In addition, the new version implements a k-fold cross-validation strategy to estimate model performance. This approach allows a more stable and unbiased evaluation of predictive capabilities compared to single-split validation methods, especially with limited or heterogeneous data. Overall, the combination of these enhancements results in a more flexible, accurate, and reliable tool for the analysis of earthquake clusters and the early forecasting of strong aftershocks.
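
The validation strategy can be illustrated with a generic sketch (NESTORE itself is MATLAB-based; the synthetic data, class ratio, and classifier below are arbitrary stand-ins, not NESTORE's actual features or learner):

    import numpy as np
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.metrics import balanced_accuracy_score
    from sklearn.model_selection import StratifiedKFold

    rng = np.random.default_rng(1)
    X = rng.normal(size=(200, 5))            # toy cluster features
    y = (rng.random(200) < 0.1).astype(int)  # rare "strong aftershock" class
    X[y == 1] += 1.0                         # give the rare class some signal

    scores = []
    # Stratified folds preserve the class ratio in every split, which
    # matters for the skewed datasets typical of this problem.
    for tr, te in StratifiedKFold(5, shuffle=True, random_state=0).split(X, y):
        model = RandomForestClassifier(random_state=0).fit(X[tr], y[tr])
        scores.append(balanced_accuracy_score(y[te], model.predict(X[te])))
    print(f"balanced accuracy: {np.mean(scores):.2f} +/- {np.std(scores):.2f}")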

This work was funded within the RETURN Extended Partnership, which received funding from the European Union Next-Generation EU (National Recovery and Resilience Plan—NRRP, Mission 4, Component 2, Investment 1.3—D.D. 1243 2/8/2022, PE0000005), and by the grant “Progetto INGV Pianeta Dinamico: NEar real-tiME results of Physical and StatIstical Seismology for earthquakes observations, modelling and forecasting (NEMESIS)” - code CUP D53J19000170001 - funded by the Italian Ministry MIUR (“Fondo Finalizzato al rilancio degli investimenti delle amministrazioni centrali dello Stato e allo sviluppo del Paese”, legge 145/2018).

How to cite: Gentili, S., Caravella, L., and Chiappetta, G. D.: An enhanced version of the NESTORE software for strong aftershock forecasting, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-7901, https://doi.org/10.5194/egusphere-egu26-7901, 2026.

EGU26-11926 | PICO | ESSI2.8

NaturaSat: Software tools for exploration and monitoring of Natura 2000 habitats using multi-source Earth observation data 

Michal Kollár, Martin Ambroz, Aneta A. Ožvat, Karol Mikula, Mária Šibíková, and Jozef Šibík

This contribution presents the software tools provided by NaturaSat [1], a robust and user-friendly application for the exploration and monitoring of Natura 2000 habitats using multi-source Earth observation data. The software enables users to visualize and jointly analyze Sentinel-2 and Sentinel-1 imagery, orthophotos, and UAV data within a single working environment.

We showcase the main functionalities of the software, including data import and management, interactive visualization, semi-automatic and automatic segmentation of input data, and spatio-temporal comparison of habitat boundaries. These tools support habitat mapping and allow users to track changes in habitat extent and ecological condition. In addition to segmentation and monitoring, the software also includes tools for classification, transformation of historical maps, and basic hydrological modeling.

The contribution focuses on the practical use of NaturaSat as a research and operational tool for botanists and environmental scientists. A case study illustrates typical user workflows, showing how the software combines different tools in an accessible way to support the analysis of habitat structure and change.

[1] NaturaSat, http://www.algoritmysk.eu/en/naturasat_en/

How to cite: Kollár, M., Ambroz, M., Ožvat, A. A., Mikula, K., Šibíková, M., and Šibík, J.: NaturaSat: Software tools for exploration and monitoring of Natura 2000 habitats using multi-source Earth observation data, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-11926, https://doi.org/10.5194/egusphere-egu26-11926, 2026.

EGU26-12280 | PICO | ESSI2.8

SUNSET: Addressing key challenges for the successful provision of climate services 

Victoria Agudetse, Núria Pérez-Zanón, Ariadna Batalla, Carlos Delgado-Torres, Alberto Bojaly, Pierre-Antoine Bretonnière, Javier Corvillo, Eren Duzenli, Theertha Kariyathan, Aleksander Lacima-Nadolnik, Alba Llabrés-Brustenga, Bruno de Paula Kinoshita, Paloma Trascasa-Castro, Verónica Torralba, Albert Soret, and Francisco Javier Doblas-Reyes

Climate services leverage state-of-the-art knowledge, data and tools from the climate sciences to tailor services to user needs. To do so, climate service scientists ensure scientifically robust and traceable analyses to address specific, co-produced applications in sectors such as energy, agriculture and health. However, the diversity of post-processing methodologies applied to climate datasets, together with the variety of data sources (e.g. reanalyses, in situ observations, and climate predictions across different forecast horizons) and heterogeneous user requirements, makes the development and long-term maintenance of the required software a major challenge for the timely delivery of climate products that fulfill those user needs.

The SUbseasoNal to decadal climate forecast post-processing and asSEssmenT (SUNSET) is a software suite developed by the Earth Sciences Department at the Barcelona Supercomputing Center, building on extensive expertise in state-of-the-art climate science, climate service co-production and software development for HPC environments. SUNSET integrates in-house R-based software packages, including CSTools, CSDownscale and s2dv, which implement established methodologies for climate forecast post-processing, such as bias adjustment, statistical downscaling, verification and visualisation. The suite addresses key challenges commonly faced by climate service scientists, including the management of multiple forecast systems and reference datasets, the alignment of temporal dimensions (e.g. initialisation dates and forecast lead times with respect to reference datasets), and the consistent handling of hindcasts and observations to enable robust and comparable verification through cross-validation approaches. When requested, SUNSET uses the Autosubmit workflow manager to parallelise and orchestrate multiple workflows, ensuring efficient use of computational resources and the timely generation of climate products.

SUNSET currently delivers near real-time operational climate products, including the probability of the most likely tercile, the probability above or below specific percentiles, and absolute thresholds for essential climate variables. These products can also be tailored to sector-specific indicators, such as the growing degree days required by the agriculture sector. SUNSET verification workflows support the evaluation of the next generation of Copernicus Climate Change Service seasonal forecast systems within the CERISE project by providing comprehensive skill metrics and scorecard summaries. Together, these capabilities ensure successful research and service delivery in several projects, including ASPECT, BigPrediData, and BOREAS.
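
The core of the tercile products reduces to counting ensemble members against climatological thresholds, as in this simplified stand-in (SUNSET itself is R-based, and its operational implementation adds cross-validation; all numbers here are synthetic):

    import numpy as np

    rng = np.random.default_rng(2)
    hindcast = rng.normal(15.0, 2.0, size=(30, 25))  # years x members
    forecast = rng.normal(16.0, 2.0, size=25)        # current ensemble members

    # Tercile boundaries from the hindcast climatology...
    lower, upper = np.percentile(hindcast, [100 / 3, 200 / 3])
    # ...and category probabilities as fractions of forecast members.
    p_below = np.mean(forecast < lower)
    p_above = np.mean(forecast > upper)
    p_normal = 1.0 - p_below - p_above
    print(f"below={p_below:.2f} normal={p_normal:.2f} above={p_above:.2f}")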

SUNSET is open-source and hosted in a public GitLab repository, following a structured development strategy with regular releases, continuous integration, and a dedicated conda environment to ensure reproducibility and long-term sustainability. Ongoing and future developments focus on extending methodological capabilities, improving usability, and optimising memory usage and workflow multi-node parallelisation for efficient execution on HPC systems.

How to cite: Agudetse, V., Pérez-Zanón, N., Batalla, A., Delgado-Torres, C., Bojaly, A., Bretonnière, P.-A., Corvillo, J., Duzenli, E., Kariyathan, T., Lacima-Nadolnik, A., Llabrés-Brustenga, A., de Paula Kinoshita, B., Trascasa-Castro, P., Torralba, V., Soret, A., and Doblas-Reyes, F. J.: SUNSET: Addressing key challenges for the successful provision of climate services, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-12280, https://doi.org/10.5194/egusphere-egu26-12280, 2026.

EGU26-13219 | PICO | ESSI2.8

An Interoperable Web Platform for Water Quality Monitoring and Remediation in the Mediterranean: the iWIRE Platform 

Marco Micotti, Elena Matta, Elisa Bozzolan, Simone Corti, Davide Cantaluppi, and Enrico Weber

Chemical pollution poses a growing threat to aquatic ecosystems and water resources across the Mediterranean region, driven by contaminants of emerging concern and complex land–sea interactions. Addressing this challenge requires tools capable of integrating datasets at multiple spatial and temporal scales and user-friendly interfaces to support knowledge sharing across diverse territorial and social contexts. Within this framework, the Water Information and Remediation Platform (iWIRE) has been developed as part of the EU Horizon Europe project iMERMAID (Innovative solutions for Mediterranean Ecosystem Remediation via Monitoring and decontamination from Chemical Pollution).

iWIRE is a web-based platform designed to collect, harmonise, and visualise environmental and water quality data, providing a unified entry point to explore site-specific characteristics, monitoring data, climate conditions, and the performance of remediation activities.

From a research software perspective, iWIRE addresses key challenges related to reproducibility, interoperability, and usability in geoscientific platforms. The system is built on a fully open-source technology stack and follows a modular design, allowing individual components to be updated, extended, or reused across different projects and environmental contexts.

The platform supports interoperability by integrating heterogeneous datasets from laboratory analyses, in situ sensors, climate services, satellite-derived datasets, and regulatory sources. Data ingestion is enabled through a wide range of input formats, from simple text files to standardised data structures and API-based connections. This approach enables seamless data exchange between tools and facilitates cross-disciplinary analyses spanning hydrology, environmental monitoring, and water treatment processes.

iWIRE has been tested across five Mediterranean pilot areas, providing concrete case studies that demonstrate its operational use in real-world environmental monitoring and remediation assessment. These examples highlight how research software can effectively bridge scientific analysis and decision-making in applied geoscience contexts.

The platform software architecture combines a modular Content Management System with interactive data-visualisation dashboards. Public-facing content and access management are handled through a Drupal-based frontend, while use-case dashboards are developed in Redash and dynamically connected to structured datasets hosted in relational databases, online spreadsheets, or accessed via APIs. This architecture enables near-real-time updates, flexible data integration, and consistent visualisation across heterogeneous data sources.

To address data-sensitivity constraints, the platform supports differentiated access levels, combining publicly accessible dashboards with restricted views for confidential wastewater treatment plant data.

How to cite: Micotti, M., Matta, E., Bozzolan, E., Corti, S., Cantaluppi, D., and Weber, E.: An Interoperable Web Platform for Water Quality Monitoring and Remediation in the Mediterranean: the iWIRE Platform, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-13219, https://doi.org/10.5194/egusphere-egu26-13219, 2026.

EGU26-13691 | PICO | ESSI2.8

GITpy: An open-source, data-driven framework for robust ground-motion parameter estimation across tectonic settings 

Maria D'Amico, Paola Morasca, Spallarossa Daniele, Bindi Dino, Picozzi Matteo, Oth Adrien, and Pacor Francesca

Quantifying earthquake ground motion requires the robust separation of source, propagation, and site effects, a long-standing challenge in seismology and seismic hazard studies. We present GITpy, an open-source Python framework that implements the non-parametric Generalized Inversion Technique (GIT), designed to flexibly and efficiently decompose S-wave Fourier amplitude spectra into their fundamental physical components. The software provides a unified, reproducible, and highly configurable environment for performing large-scale GIT-based inversions and for exploring the associated modeling choices.

GITpy adopts a one-step, data-driven inversion strategy that minimizes a priori assumptions and exploits the increasing availability of dense, high-quality seismic datasets. Its modular, object-oriented design allows users to customize all key aspects of the inversion workflow and to independently parameterize source and path terms, controlling source and attenuation complexity through flexible choices and solution constraints.
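
At a single frequency, the one-step inversion amounts to a sparse linear system in which each record's log spectral amplitude is the sum of a source term, a site term, and a non-parametric attenuation term at its distance node. A schematic stand-in (random indices and observations; the sizes and the single constraint row are illustrative, and practical setups add further constraints, e.g., fixing attenuation at a reference distance):

    import numpy as np
    from scipy.sparse import lil_matrix
    from scipy.sparse.linalg import lsqr

    rng = np.random.default_rng(3)
    n_rec, n_src, n_sta, n_dist = 500, 40, 25, 20
    src = rng.integers(0, n_src, n_rec)
    sta = rng.integers(0, n_sta, n_rec)
    dst = rng.integers(0, n_dist, n_rec)
    log_amp = rng.normal(size=n_rec)  # placeholder observations

    G = lil_matrix((n_rec + 1, n_src + n_sta + n_dist))
    for i in range(n_rec):
        G[i, src[i]] = 1.0                  # source term
        G[i, n_src + sta[i]] = 1.0          # site term
        G[i, n_src + n_sta + dst[i]] = 1.0  # attenuation node
    G[n_rec, n_src:n_src + n_sta] = 1.0     # constraint: site terms sum to zero

    d = np.concatenate([log_amp, [0.0]])
    m = lsqr(G.tocsr(), d)[0]
    source_terms, site_terms = m[:n_src], m[n_src:n_src + n_sta]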

Post-inversion modules allow independent parameterization of attenuation functions and source spectra, as well as extraction of station-specific apparent source spectra. These capabilities make GITpy suited not only for deriving ground-motion parameters, but also for systematically diagnosing model performance and quantifying trade-offs between source, path, and site terms. The software architecture is designed to facilitate reproducible research, with standardized input/output formats, configuration files, and plotting routines that can be readily shared and extended.

GITpy is conceived as a general-purpose tool, and we demonstrate its performance through a validation test in Central Italy, where we compare inversion results with those obtained using the widely adopted GIT implementation of Oth et al. (2011). This benchmark confirms the robustness of the inversion core and illustrates how GITpy can reproduce established results while offering greater flexibility in model setup and analysis. Ongoing applications in northeastern Italy (Cataldi et al. 2026) further showcase its ability to manage complex tectonic settings and to explore spatial variations in source-related parameters, without requiring changes to the core inversion engine.

By integrating source, attenuation, and site analyses within a single open-source framework, GITpy promotes methodological consistency in ground-motion studies and facilitates comparisons across regions and tectonic environments. GITpy is under active development, with new modules and functionalities continuously added, including tools for systematic residual analysis and a dedicated module for the local calibration of the ML scaling law. It is intended as a community resource for seismological research, seismic hazard assessment, and ground-motion modeling.

How to cite: D'Amico, M., Morasca, P., Daniele, S., Dino, B., Matteo, P., Adrien, O., and Francesca, P.: GITpy: An open-source, data-driven framework for robust ground-motion parameter estimation across tectonic settings, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-13691, https://doi.org/10.5194/egusphere-egu26-13691, 2026.

EGU26-14696 | PICO | ESSI2.8

Visualisation of 3D geoscientific campaign data for the GeoLaB 

Karsten Rink, Özgür Ozan Sen, Felix Raith, Fiorenza Deon, Nadine Haaf, Marcel Horovitz, Stefan Lüth, Edinsson Munoz, Bastian Rudolph, Christoph Schüth, Ingo Sass, Thomas Kohl, and Olaf Kolditz

GeoLaB is an underground research laboratory (URL) currently in the planning stage, focussing on deep geothermal energy production in crystalline rock. Accompanying the planning, a virtual geographic environment is being built and updated with datasets as they become available. The focus of this data integration process is currently on the multiple exploration campaigns that ensure that the planned site in southern Hesse, Germany, is suitable for building the URL and conducting experiments. Over the past two years, several campaigns have been funded to acquire detailed seismic, geophysical, magnetic, and hydrological information in the area around the Tromm mountain ridge in the German Odenwald region. In addition, two exploration wells with a depth of 500 m have been drilled to gain knowledge about the structure of the crystalline rock and to ensure the selected site is suitable for the construction of an underground lab.
3D representations of acquired datasets have been created and are visualised in a unified geographic context in combination with datasets provided by state offices, such as fracture networks, topographic maps, buildings or protection areas, as well as geological information, to gain new and detailed insights into geotechnical and hydrogeological conditions in this region. In addition, a hydrogeological simulation already provides information on groundwater and saturation, and a structural model is currently being set up for running coupled THM simulations. Our framework is based on VTK, with workflows for data processing, conversion, modelling and visualisation developed within the OpenGeoSys community. With over 500 datasets already gathered in the scope of the project, data management is handled by KADI (Karlsruhe Data Infrastructure), an open-source solution developed at KIT.
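
As a generic illustration of assembling such a combined 3D scene in the VTK ecosystem (using synthetic stand-in geometry via the pyvista bindings; GeoLaB's actual workflows use the OpenGeoSys toolchain rather than this script):

    import pyvista as pv

    terrain = pv.ParametricRandomHills()  # stand-in for topography
    well = pv.Line((0.0, 0.0, 2.0), (0.0, 0.0, -8.0), resolution=50)  # stand-in well path

    plotter = pv.Plotter()
    plotter.add_mesh(terrain, color="tan", opacity=0.8)
    plotter.add_mesh(well, color="red", line_width=4)
    plotter.show()
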
This contribution focusses on the combined 3D visualisation of campaign data acquired during the site selection process and aims primarily at planning and stakeholder information. As the project progresses, this will be expanded into a functional digital twin of the URL and all experiments as well as the surrounding area.

How to cite: Rink, K., Sen, Ö. O., Raith, F., Deon, F., Haaf, N., Horovitz, M., Lüth, S., Munoz, E., Rudolph, B., Schüth, C., Sass, I., Kohl, T., and Kolditz, O.: Visualisation of 3D geoscientific campaign data for the GeoLaB, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-14696, https://doi.org/10.5194/egusphere-egu26-14696, 2026.

EGU26-15074 | ECS | PICO | ESSI2.8

EZ-InSAR-3: an open-source InSAR environment in Python 

Alexis Hrysiewicz and Eoghan Holohan

In geosciences, geodesy and geotechnical engineering, Interferometric Synthetic Aperture Radar (InSAR) has demonstrated its ability to estimate, at millimetre scale, displacements of the Earth's ground surface. Although open-source SAR/InSAR software packages are robust, they are often not user-friendly, as users must have in-depth knowledge of SAR/InSAR methods, as well as computer skills. In addition, multiple software packages and scripts are often required for a complete workflow, which can make it difficult to meet FAIR principles and to perform efficient data manipulation/analysis. EZ-InSAR is a versatile, user-friendly and open-source environment for SAR/InSAR computations that is now available in Python. Bridging several renowned open-source SAR/InSAR processors, EZ-InSAR now includes all the tools needed to perform complete SAR/InSAR time-series processing in a single environment. For example, automatic SAR imagery downloading, options for different time-series approaches, and tools for data visualisation and verification are provided. The new structure of EZ-InSAR, which is built with mandatory and optional EZ-InSAR Python modules, has been designed to facilitate community-led bug fixes, updates, testing, and rapid development. Users can now perform complex SAR/InSAR workflows in EZ-InSAR by implementing the toolkit in their Python scripts, by using the EZ-InSAR command-line interface or by using EZ-InSAR’s evolved Graphical User Interface. All processing parameters are managed directly in the EZ-InSAR environment to ensure compliance with FAIR principles. The toolkit is also supported by comprehensive documentation. During the PICO session, we will show the use of EZ-InSAR for a complete computation of ground surface displacements at Campi Flegrei Caldera, Italy. This will highlight not only the efficiency of EZ-InSAR for the monitoring of geohazards, but also why it is suitable for both new users of satellite Earth Observation data and expert users in SAR/InSAR remote sensing.

How to cite: Hrysiewicz, A. and Holohan, E.: EZ-InSAR-3: an open-source InSAR environment in Python, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-15074, https://doi.org/10.5194/egusphere-egu26-15074, 2026.

EGU26-16635 | PICO | ESSI2.8

Earthquake Explorer goes 3D: A Browser-Based Tool for Interactive Earthquake Visualization 

Christian Meeßen, Matthias Volk, Nils Brinckmann, Joachim Saul, and Frederik Tilmann

The moment tensor of an earthquake describes the force couples acting at the source location as a symmetric 3x3 matrix, and provides information on orientation and slip direction of the fault failure. Shear failure is represented by a double-couple, for which the trace of the moment tensor is zero and the intermediate eigenvalue is zero, with the corresponding eigenvector termed the neutral axis. In the seismology literature, this tensor is typically visualized in a two-dimensional beachball diagram, a lower-hemisphere projection of a three-dimensional sphere split into four quadrants by two perpendicular great circles oriented according to the eigenvectors of the matrix, such that the neutral axis points to the intersection of the two circles, and the other two eigenvectors point to the centre of each quadrant.
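
As a concrete illustration of these properties (our sketch with an arbitrary pure strike-slip tensor, not an example from the tool), the axes follow directly from an eigen-decomposition in NumPy:

    import numpy as np

    # Hypothetical pure double-couple (strike-slip) moment tensor; symmetric 3x3.
    M = np.array([[0.0, 1.0, 0.0],
                  [1.0, 0.0, 0.0],
                  [0.0, 0.0, 0.0]])

    w, v = np.linalg.eigh(M)       # eigenvalues (ascending) with eigenvectors
    order = np.argsort(w)
    p_axis = v[:, order[0]]        # pressure axis (most negative eigenvalue)
    n_axis = v[:, order[1]]        # neutral axis (intermediate eigenvalue)
    t_axis = v[:, order[2]]        # tension axis (most positive eigenvalue)

    # For pure shear failure, the trace and the intermediate eigenvalue vanish.
    assert np.isclose(np.trace(M), 0.0) and np.isclose(w[order[1]], 0.0)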

In this contribution we present a new browser-based tool to visualize the moment tensors of earthquakes not just as a two-dimensional projection but as three-dimensional objects. Concretely, we show an earthquake as a magnitude-scaled sphere, textured according to its moment tensor and located at its hypocenter. We provide options to visualize only the double-couple part of the moment tensor, to color the spheres by depth only, and to also show earthquakes for which no moment tensor solution has been derived. Additional context is provided by the SLAB2 model of the Earth's subduction zones as well as raster and vector map layers loaded via OGC-compliant APIs (WMS/WFS).
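
The quadrant shading that such a texture encodes can be sketched as follows: for a unit direction r on the focal sphere, the sign of the P-wave radiation term r·M·r decides whether that point lies in a compressional (filled) or dilatational (white) quadrant. This is a schematic of the principle, not the actual shader code:

    import numpy as np

    def beachball_shade(M: np.ndarray, r: np.ndarray) -> str:
        """Shade of the focal sphere at unit direction r for moment tensor M."""
        r = r / np.linalg.norm(r)
        return "filled" if r @ M @ r > 0 else "white"

    M = np.array([[0.0, 1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 0.0]])
    print(beachball_shade(M, np.array([1.0, 1.0, 0.0])))  # compressional: "filled"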

The tool runs entirely in a user’s web browser and fetches earthquake data from an FDSN Web Services (FDSNWS) event endpoint in QuakeML format, thus using a standardized API widely used in the seismology community. While our deployment is currently integrated into the GFZ Earthquake Explorer, using the GEOFON network, it is also compatible with other networks that provide data via FDSNWS. The visualization is built using the open-source library CesiumJS with custom WebGL shaders that implement the coloring of the spheres as beachballs.
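
Because the tool consumes a standard FDSNWS event service, the same data can be retrieved in a few lines with ObsPy, for example (query parameters are illustrative):

    from obspy import UTCDateTime
    from obspy.clients.fdsn import Client

    client = Client("GFZ")  # GEOFON's FDSN web services
    catalog = client.get_events(starttime=UTCDateTime("2025-01-01"),
                                endtime=UTCDateTime("2025-02-01"),
                                minmagnitude=6.0)
    print(catalog)  # QuakeML-backed catalog of origins and magnitudes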

The fact that no specialized software packages are needed also makes the tool suitable for a more general audience beyond scientists from the seismology community.

How to cite: Meeßen, C., Volk, M., Brinckmann, N., Saul, J., and Tilmann, F.: Earthquake Explorer goes 3D: A Browser-Based Tool for Interactive Earthquake Visualization, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-16635, https://doi.org/10.5194/egusphere-egu26-16635, 2026.

EGU26-17152 | ECS | PICO | ESSI2.8

Interactive Voxel Visualization of Large Earth System Data Cubes 

Maximilian Söchting and Miguel D. Mahecha

As Earth system data streams and models grow larger, more complex, and higher-dimensional, the demand for capable data visualization and exploration tools increases. While specialized data cube visualization tools have been developed in recent years, they typically rely on technical compromises to address the data access problems posed by large data sets. Some of the existing tools support visualizing arbitrary 3D data chunks by making parts of the cube transparent, commonly known as volume or voxel rendering, i.e., “looking inside the data set”. Voxel rendering can communicate spatiotemporal patterns effectively and has a much higher information density than previous data cube visualization approaches, but it is computationally demanding and scales strongly with the size of the visualized data set.

Here we present an interactive voxel visualization for large Earth system data cubes, integrated into the existing Lexcube.org data cube visualization and its open-source Python package. The voxel visualization allows users to highlight and visualize value ranges based on thresholds, creating "voxel clouds" in three-dimensional space-time. Additionally, users can highlight extreme values by selecting a quantile range, based on deviations from the mean seasonal cycle or other definitions of “extreme”. To enable this visualization for large data sets, we developed a novel lossy compression algorithm based on variable quantization of 3D blocks that significantly reduces both the VRAM required for the visualization and the computational effort of the ray tracing. The algorithm preserves high information content by encoding 3D chunks of high variance at high resolution, while chunks of nearly uniform values are compressed heavily, respecting a user-configurable error metric. In this way, the scientific accuracy of the visualization is guaranteed and quantified, while enabling the voxel visualization and exploration of data sets that were previously too large to handle.
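
A toy version of such block-adaptive quantization conveys the idea (our sketch, using each block's value range as a simple stand-in for its variance; the actual Lexcube algorithm is more elaborate):

    import numpy as np

    def quantize_blocks(cube, block=8, max_err=0.05):
        """Quantize 3D blocks so the absolute error never exceeds max_err;
        near-uniform blocks need few levels, variable blocks keep many."""
        out = np.empty_like(cube, dtype=np.float32)
        for i in range(0, cube.shape[0], block):
            for j in range(0, cube.shape[1], block):
                for k in range(0, cube.shape[2], block):
                    c = cube[i:i+block, j:j+block, k:k+block]
                    lo, span = float(c.min()), float(c.max() - c.min())
                    if span == 0.0:
                        out[i:i+block, j:j+block, k:k+block] = lo
                        continue
                    # smallest number of levels that respects the error bound
                    levels = int(np.ceil(span / (2.0 * max_err))) + 1
                    q = np.round((c - lo) / span * (levels - 1))
                    out[i:i+block, j:j+block, k:k+block] = lo + q / (levels - 1) * span
        return out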

Building on the previous Lexcube software, the new version stays compatible with a wide range of desktop and mobile devices by relying on WebGL 2 instead of adopting its modern successor, WebGPU. Because the data backbone relies on Xarray, any gridded three-dimensional Zarr, NetCDF, or other supported data set can be ingested and visualized with our software, either on Lexcube.org or using our open-source package for Jupyter notebooks, available on GitHub and PyPI.
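
In a Jupyter notebook, usage then reduces to a few lines; the Zarr path and variable below are placeholders, and we assume the Cube3DWidget entry point documented for the Lexcube package:

    import xarray as xr
    import lexcube

    da = xr.open_zarr("s3://example-bucket/cube.zarr")["t2m"]  # hypothetical cube
    lexcube.Cube3DWidget(da)  # interactive 3D data cube rendered in the notebook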

How to cite: Söchting, M. and Mahecha, M. D.: Interactive Voxel Visualization of Large Earth System Data Cubes, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-17152, https://doi.org/10.5194/egusphere-egu26-17152, 2026.

EGU26-17193 | PICO | ESSI2.8

Integrating numerical models and terrain analysis in TopoToolbox 

Wolfgang Schwanghart, Boris Gailleton, Anna-Lena Lamprecht, Dirk Scherler, and Kearney William

Many numerical models depend critically on digital elevation models (DEMs). Hydrodynamic models, landslide susceptibility models, or glacial models, for example, require DEMs as input. The quality of the DEMs and of DEM preprocessing is thus vital for many models. Hydrodynamic simulations are highly sensitive to DEM errors and artefacts, while other models require smoothing to ensure numerical stability. Model outputs are likewise strongly controlled by topography: flood extents depend on topographic gradients and surface elevations relative to river channels, glacier extent is constrained by topographic height and confinement, and geomorphic models generate time-varying DEMs that document topographic changes over time. These close links suggest that model preparation, execution, and analysis should be conducted within a unified terrain-analysis environment.

TopoToolbox is a terrain analysis framework originally developed in MATLAB and now available in Python and R. We demonstrate how simulation software can be integrated into TopoToolbox by various means, including the C API of libtopotoolbox or the higher-level interfaces provided by the language-specific implementations of TopoToolbox. Interfacing with TopoToolbox enables seamless DEM preprocessing alongside visualization and analysis of model outputs. Working within a single, specialized environment simplifies workflows, improves reproducibility, and enhances the usability and dissemination of modeling software.
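
In the Python implementation, such an integrated workflow might look as follows; the function names follow the pytopotoolbox documentation at the time of writing and should be treated as an assumption, with the DEM file hypothetical:

    import topotoolbox as tt  # Python implementation of TopoToolbox (assumed API)

    dem = tt.read_tif("dem.tif")      # load the DEM into a grid object
    filled = dem.fillsinks()          # hydrological conditioning before modelling
    flow = tt.FlowObject(filled)      # flow routing shared with downstream models
    acc = flow.flow_accumulation()    # e.g. drainage area passed to a simulator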

How to cite: Schwanghart, W., Gailleton, B., Lamprecht, A.-L., Scherler, D., and William, K.: Integrating numerical models and terrain analysis in TopoToolbox, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-17193, https://doi.org/10.5194/egusphere-egu26-17193, 2026.

EGU26-18183 | ECS | PICO | ESSI2.8

SafeBridge: open-source software to translate InSAR time series into actionable damage indicators for infrastructure monitoring 

Murat Şahin, Valentina Macchiarulo, Hao Kuai, Pantelis Karamitopoulos, and Giorgia Giardina

In the geosciences, the growing availability of multi-temporal satellite products has created new opportunities for monitoring the condition of the built environment. However, transforming large volumes of time series data into actionable information for decision-making remains a major challenge. This difficulty is particularly acute for infrastructure managers who must combine remotely sensed observations with geospatial network inventories to evaluate the performance and deterioration of existing assets. To address this gap, we developed the SafeBridge software package. 
SafeBridge supports the derivation of bridge damage indicators by processing Multi-Temporal Interferometric Synthetic Aperture Radar (MT-InSAR) time series through geospatial operations tailored to individual bridge assets within a network. The package offers a fast and efficient framework for computing structural health indicators, featuring workflows that can run either on high-performance computing (HPC) systems or on standard, readily available hardware when HPC resources are unavailable. 
To lower the barriers for new users and facilitate communication of reproducible methods, SafeBridge includes documentation, an example-driven tutorial, synthetic demonstration datasets, and automated report generation. We describe in detail an end-to-end workflow that incorporates infrastructure geometries and MT-InSAR time series, performs topology-aware geospatial processing, and produces comprehensible damage indicators and summary outputs appropriate for screening, prioritisation, and downstream integration on transportation assets and networks. 
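
The kind of topology-aware, per-asset aggregation described above can be illustrated with GeoPandas (a generic sketch with hypothetical files and column names, not the SafeBridge API):

    import geopandas as gpd

    bridges = gpd.read_file("bridges.gpkg")      # asset inventory (polygons)
    points = gpd.read_file("insar_points.gpkg")  # MT-InSAR measurement points

    # Attach each measurement point to the bridge it intersects, then summarise.
    joined = gpd.sjoin(points, bridges, how="inner", predicate="intersects")
    per_bridge = joined.groupby("bridge_id")["mean_velocity"].agg(["mean", "std"])
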
The SafeBridge package provides a practical route from research code to reusable software, while maintaining scientific transparency and reproducibility. We contribute reusable, interoperable software building blocks for infrastructure-focused Earth Observation applications and highlight best practices for user-centric research software dissemination by making the tool available under open licenses with clear APIs and useful examples. 

How to cite: Şahin, M., Macchiarulo, V., Kuai, H., Karamitopoulos, P., and Giardina, G.: SafeBridge: open-source software to translate InSAR time series into actionable damage indicators for infrastructure monitoring, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-18183, https://doi.org/10.5194/egusphere-egu26-18183, 2026.

EGU26-18279 | PICO | ESSI2.8

EqSimage: A Python Package for Fault Imaging from Earthquake Similarity 

Monika Staszek, Jan Wiszniowski, Paulina Kucia, Jakub Kokowski, Grzegorz Lizurek, and Łukasz Rudziński

Imaging of underground structures is a primary objective of all geophysical methods. In areas exhibiting natural or anthropogenic seismic activity, recorded and accurately located earthquakes can be used to map the faults that host them. Initial images can be further improved by the use of relative relocation techniques and seismic events with highly similar waveforms (multiplets).

In this work, we present EqSimage, a Python package designed to identify multiplets, perform their relative relocation using the double-difference technique, and delineate potential fault planes. The package supports both continuous and triggered seismic data, which can be read directly from disk or downloaded from data centers using FDSN web services or ArcLink. Signal similarity is evaluated through cross-correlation of three-component seismic data. To distinguish groups of similar events, several clustering algorithms are available, including SciPy hierarchical clustering with the cross-correlation coefficient as a distance metric and clustering based on pick times only. Subsequently, the double-difference relocation of all identified multiplets is carried out using the original hypoDD software (Waldhauser, 2001), version 2.1b. Finally, the relocated events are divided into groups, and a best-fitting plane is determined for each group using the FaultNVC software (Sawaki et al., 2025).
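
The multiplet identification step can be sketched with ObsPy and SciPy as follows (a simplified illustration, assuming traces is a list of equally sampled waveforms for the candidate events):

    import numpy as np
    from obspy.signal.cross_correlation import correlate, xcorr_max
    from scipy.cluster.hierarchy import fcluster, linkage
    from scipy.spatial.distance import squareform

    n = len(traces)
    cc = np.eye(n)
    for i in range(n):
        for j in range(i + 1, n):
            # search +/- 200 samples of lag for the maximum correlation
            _, value = xcorr_max(correlate(traces[i], traces[j], shift=200))
            cc[i, j] = cc[j, i] = abs(value)

    # hierarchical clustering with 1 - CC as the distance metric: events with
    # mutual CC >= 0.8 end up in the same multiplet
    dist = squareform(1.0 - cc, checks=False)
    multiplets = fcluster(linkage(dist, method="average"), t=0.2, criterion="distance")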

EqSimage performs all processing steps automatically based on a single configuration file. The output includes a relocated earthquake catalog in QuakeML format and estimated fault-plane orientations. Additionally, several visualization tools are provided at individual stages of the workflow in order to assess the performance of configuration parameters. These tools include visualization of identified multiplets (waveforms and relocated hypocenters), cross-correlation matrices, relocated events, and inferred fault planes.

We demonstrate the capabilities of EqSimage using several datasets representing different types of anthropogenic seismicity, including injection-induced, reservoir-triggered, and mining-induced seismicity. Case studies are presented from The Geysers geothermal field (California, USA), the Song Tranh 2 water reservoir (Vietnam), and a seismically active underground mine in Poland.

 

References:

Waldhauser, F.: hypoDD – A program to compute double-difference hypocenter locations, U.S. Geological Survey Open-File Report 01-113, 2001.

Sawaki, Y., Shiina, T., Sagae, K., Sato, Y., Horikawa, H., Miyakawa, A., Imanishi, K., & Uchide, T. (2025). Fault Geometries of the 2024 Mw 7.5 Noto Peninsula Earthquake From Hypocenter-Based Hierarchical Clustering of Point-Cloud Normal Vectors. J. Geophys. Res.: Solid Earth, 130(4), e2024JB030233.

This research was supported by research project no. 2022/45/N/ST10/02172, funded by the National Science Centre, Poland, under agreement no. UMO-2022/45/N/ST10/02172. This work was also partially supported by a subsidy from the Polish Ministry of Education and Science for the Institute of Geophysics, Polish Academy of Sciences.

How to cite: Staszek, M., Wiszniowski, J., Kucia, P., Kokowski, J., Lizurek, G., and Rudziński, Ł.: EqSimage: A Python Package for Fault Imaging from Earthquake Similarity, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-18279, https://doi.org/10.5194/egusphere-egu26-18279, 2026.

EGU26-18627 | ECS | PICO | ESSI2.8

FaulTED: a new user-friendly MATLAB-based code to assess probabilistic fault displacement hazard 

Selina Bonini, Oona Scotti, Alessandro Valentini, Francesco Visini, Bruno Pace, Giulia Tartaglia, Giulio Viola, and Gianluca Vignaroli

Probabilistic Fault Displacement Hazard Analysis (PFDHA) quantifies the probability and the expected amount of coseismic displacement associated with the activity of Active and Capable Faults (ACFs) at a given site. Common PFDHA approaches distinguish between primary on-fault displacement and distributed off-fault ruptures occurring on secondary faults or fractures, and typically rely on empirical scaling relationships calibrated for specific earthquake magnitudes. However, these methods are often restricted to specific tectonic or kinematic settings and lack readily available computational tools. Moreover, available PFDHA approaches rarely include the possibility of investigating the floating-rupture mechanism, i.e., the possibility that surface ruptures involve only portions of the full fault trace.

To overcome these limitations, we developed FaulTED, a new user-friendly MATLAB-based code for PFDHA that integrates a comprehensive set of published models, including magnitude–frequency distributions, fault scaling relationships, and surface rupture probability functions. The toolkit comprises two main modules: (i) a site-specific hazard curve calculator and (ii) a fault-specific hazard map generator for user-defined return periods. Both modules explicitly account for on-fault and distributed off-fault ruptures and incorporate the floating rupture approach commonly adopted in probabilistic seismic hazard analysis.
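
Conceptually, the site-specific module evaluates a hazard integral of the form lambda(D > d) = sum over m of rate(m) * P(surface rupture | m) * P(D > d | rupture, m). The sketch below (in Python rather than MATLAB, with entirely hypothetical component models) shows the structure of such a calculation, not FaulTED's implementation:

    import numpy as np
    from scipy.stats import lognorm

    mags = np.arange(6.0, 7.6, 0.1)

    def annual_rate(m):        # hypothetical Gutenberg-Richter rate increment
        return 10 ** (2.0 - m) - 10 ** (2.0 - (m + 0.1))

    def p_surface_rupture(m):  # hypothetical logistic rupture-probability model
        return 1.0 / (1.0 + np.exp(-(m - 6.5) / 0.25))

    def p_exceed(d, m):        # hypothetical lognormal displacement model (metres)
        return lognorm.sf(d, s=0.8, scale=10 ** (-5.46 + 0.82 * m))

    d = 0.5  # metres
    rate = sum(annual_rate(m) * p_surface_rupture(m) * p_exceed(d, m) for m in mags)
    print(f"Annual rate of exceeding {d} m displacement: {rate:.2e}")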

The modular architecture of FaulTED allows users to flexibly select and compare alternative models through a structured input file, enabling sensitivity analyses and systematic exploration of epistemic uncertainties. FaulTED is designed as a user-oriented platform to support infrastructure planning in regions affected by ACFs.

How to cite: Bonini, S., Scotti, O., Valentini, A., Visini, F., Pace, B., Tartaglia, G., Viola, G., and Vignaroli, G.: FaulTED: a new user-friendly MATLAB-based code to assess probabilistic fault displacement hazard, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-18627, https://doi.org/10.5194/egusphere-egu26-18627, 2026.

EGU26-18853 | PICO | ESSI2.8

The EMODnet Portal: how to present a unified data discovery and download service of complex and diverse European marine data to the public 

Conor Delaney, Joana Beja, Tim Collart, Frederic Leclercq, and Bart Vanhoorne

The European Marine Observation and Data Network (EMODnet) Portal is the service of DG Mare (the Directorate-General for Maritime Affairs and Fisheries of the European Commission) that provides public access to marine in-situ data and data products on European marine waters across the diverse subject areas of Bathymetry, Biology, Chemistry, Geology, Human Activities, Physics and Seabed Habitats.

The EMODnet network is composed of subject matter experts from both the public and private sectors who, when producing EMODnet products, collate quality-controlled in-situ data and information from a number of sources, ranging from citizen science to EU member state marine observation programmes.

The EMODnet Portal provides various functionalities to discover, visualise, subset and download these outputs. Furthermore, the portal and EMODnet thematics provide standard geospatial web services that can be integrated into applications and external data services.

EMODnet’s backbone is a web-based network of data servers using open-source geospatial server technologies that publish services based on Open Geospatial Consortium web service standards and OPeNDAP (via the ERDDAP API) data service standards. The frontend of the EMODnet Portal is a web-based Map Viewer connected to a GeoNetwork metadata catalogue, which provides more detailed data discovery, with downloading via ERDDAP and GeoServer servers that allow subsetting and download of vector and gridded data.

The EMODnet Portal is an operational service of DG Mare hosted on the European Commission’s Internet domain. As a public-facing resource, it must be user-friendly in operation, and the data being served must be delivered in a consistent manner and be well described. In addition, the Portal must comply with the various EU laws governing the Internet, e.g. web privacy and web accessibility.

We present the various challenges encountered in building a consistent and accessible service. In particular, we explore the impacts of EU legislation and EC web publication rules on design decisions, and how useful the FAIR principles were as a guide.

How to cite: Delaney, C., Beja, J., Collart, T., Leclercq, F., and Vanhoorne, B.: The EMODnet Portal: how to present a unified data discovery and download service of complex and diverse European marine data to the public, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-18853, https://doi.org/10.5194/egusphere-egu26-18853, 2026.

EGU26-19078 | PICO | ESSI2.8

Enhancing Data Science Pipelines through Interactive Environments for Visual Analytics of Spatiotemporal Data 

The exponential growth in data generation across scientific domains has amplified the critical role of data science in extracting actionable insights from complex datasets (Chen, 2012; Müller, 2018; Wamba, 2017; Yin, 2015). Traditional data science methodologies, such as the Cross-Industry Standard Process for Data Mining (CRISP-DM) and the Knowledge Discovery in Databases (KDD) process, provide structured frameworks for data processing and model development (Fayyad, 1996; Shearer, 2000). However, these approaches often treat visualization as a terminal step for communicating results rather than as an integral component of the analytical process. Visual analytics addresses this limitation by emphasizing human-computer interaction throughout the analytical workflow, enabling iterative exploration, hypothesis testing, and knowledge generation through interactive visual interfaces (Keim, 2008; Sacha, 2014; Thomas, 2006). Data scientists increasingly rely on computational notebooks for their flexibility in combining code, data, and visualization within unified environments (Chattopadhyay, 2020; Kosara, 2023). However, traditional notebook platforms face significant challenges, including a lack of reproducibility due to execution order dependencies, limited interactivity, difficult version control, and constrained deployment options (Chattopadhyay, 2020). These limitations create friction when transitioning from exploratory analysis to production systems, particularly for visual analytics applications requiring sophisticated interactive visualizations and real-time analytical capabilities (Barik, 2016; Haertel, 2023).

This research investigates the applicability of modern interactive visualization notebooks as comprehensive platforms for end-to-end data science and visual analytics pipelines. The solution artifact employs Marimo, an open-source Python notebook solution that addresses traditional notebook limitations through reactive cell execution and deterministic ordering, as well as a Python-code structure (Kluyver, 2016). The approach integrates multiple technologies, including object storage (e.g., MinIO) for centralized data repositories, analytical databases for efficient data management, and declarative visualization libraries based on Vega and Vega-Lite grammars for flexible interactive graphics (Heer, 2024; VanderPlas, 2018). The methodology is demonstrated through a space weather exploration use case examining the impact of solar activity on Global Navigation Satellite Systems (Su, 2019). The implementation follows the KDD process phases (Fayyad, 1996), beginning with the selection of the NEDM space weather model, which provides three-dimensional electron density estimates based on the F10.7 solar flux index combined with satellite orbital data (Hoque, 2022). The process commences with preprocessing to calculate rolling averages of solar activity indices and to derive satellite identifiers. Following this, transformations are performed to determine satellite positions using Simplified General Perturbations algorithms and to aggregate electron density across spatial grids. Data mining is utilized to create interactive visualizations of visible satellites, including their calculated electron content values. Ultimately, interpretation facilitates interactive selection and recalibration through user-driven dashboard interfaces.
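
As a small illustration of the declarative, Vega-Lite-based charting that such pipelines rely on (our example with synthetic data, not the actual dashboard code):

    import altair as alt
    import numpy as np
    import pandas as pd

    rng = np.random.default_rng(0)
    df = pd.DataFrame({
        "time": pd.date_range("2024-01-01", periods=90, freq="D"),
        "f107": 120 + 30 * rng.random(90),  # synthetic stand-in for the F10.7 index
    })

    chart = (
        alt.Chart(df)
        .mark_line()
        .encode(x="time:T", y="f107:Q", tooltip=["time:T", "f107:Q"])
        .interactive()  # pan/zoom for iterative exploration
    )
    chart.save("f107.html")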

The demonstrator effectively combines data management, processing, and interactive visual analytics into a cohesive notebook environment. This integration fosters streamlined workflows that reduce friction between disparate tools, enhances transparency through documented, reproducible analytical processes (Kosara, 2023), and facilitates real-time interactivity, enabling dynamic parameter adjustments and iterative exploration. Additionally, it provides extensive support for visual analytics that spans the entire knowledge-generation model, from data transformation to insight discovery (Sacha, 2014).

How to cite: Pohl, M. and Reibert, J.: Enhancing Data Science Pipelines through Interactive Environments for Visual Analytics of Spatiotemporal Data, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-19078, https://doi.org/10.5194/egusphere-egu26-19078, 2026.

EGU26-19144 | PICO | ESSI2.8

SegFlow: an end-to-end workflow for texture-centric image segmentation, from texture-patch curation to hybrid assistance 

Semantic segmentation of texture-rich Earth-science imagery (e.g. UAV and outcrop photographs) is common, but supervised segmentation workflows are often assembled from disconnected tools and still rely on labour-intensive, dense pixel-wise annotation. We present SegFlow, an end-to-end pipeline that integrates texture-patch curation, dataset synthesis, model training, experiment tracking, and inference for texture-centric segmentation.

SegFlow defines classes through curated texture patches and generates synthetic training composites and label masks using parameterised, procedural mask generation. This supports rapid model bootstrapping for initial training, reduces the amount of dense pixel-wise annotation required on real imagery, and helps keep label definitions consistent via versioned datasets and repeatable train/validation/test splits in a portable project structure.
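
A stripped-down version of such procedural composite generation might look like this (our sketch; patches maps each class id to a same-sized texture array and is hypothetical):

    import numpy as np

    rng = np.random.default_rng(42)

    def synth_composite(patches, size=512, n_blobs=6):
        """Paste random elliptical regions of class textures onto a background
        texture and record the corresponding label mask (class id 0 = background)."""
        yy, xx = np.mgrid[0:size, 0:size]
        classes = sorted(patches)
        image = patches[classes[0]].copy()
        mask = np.zeros((size, size), dtype=np.uint8)
        for _ in range(n_blobs):
            c = int(rng.choice(classes[1:]))
            cy, cx = rng.integers(0, size, size=2)
            ry, rx = rng.integers(size // 16, size // 4, size=2)
            blob = ((yy - cy) / ry) ** 2 + ((xx - cx) / rx) ** 2 <= 1.0
            image[blob] = patches[c][blob]
            mask[blob] = c
        return image, mask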

For model development, SegFlow includes a PyTorch training interface centred on a configurable U-Net, with various segmentation losses and metrics. Training and inference are organised as scripted, job-based runs that capture data and model provenance in standardised run reports. For assisted segmentation, SegFlow combines the texture-focused U-Net with a prompt-based segmenter (Segment Anything Model, SAM) driven by sparse prompts (points or boxes). In our use case, SAM is helpful for object-like structures, whereas the U-Net is better suited to extended texture regions where prompting can be less stable. Outputs can be refined interactively, and corrected masks can be added back for iterative fine-tuning on real imagery.

We demonstrate the workflow on UAV imagery for geological outcrop mapping (e.g. chalk, glacial till, vegetation) and discuss how provenance tracking, label consistency, and hybrid assistance support reproducible iteration in Earth-science segmentation projects. SegFlow will be made available under GPLv3.

How to cite: Torizin, J. and Schüßler, N.: SegFlow: an end-to-end workflow for texture-centric image segmentation, from texture-patch curation to hybrid assistance, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-19144, https://doi.org/10.5194/egusphere-egu26-19144, 2026.

EGU26-19566 | ECS | PICO | ESSI2.8

User-Centric Climate Dashboards for Metrics Evaluation and Temperature Scenario Exploration 

Vikas K Patel, Michelle Cain, and Neil Harris

Climate decision-making increasingly requires tools that can translate complex climate science into easy-to-use information. We develop two open-source, complementary interactive dashboards designed to support climate understanding across metrics-based assessment and analysis of temperature trajectories under a range of scenarios. The Climate Metrics Decision Dashboard (CMDD) provides a comprehensive yet simple framework for exploring a wide range of climate metrics spanning agriculture, aviation, precipitation, economy, and sea level rise. CMDD is designed to support informed interpretation of diverse metrics without requiring deep domain expertise: it helps researchers, policymakers, and practitioners identify which metric best fits their goals, whether that is tracking emissions, comparing warming impacts, or assessing progress toward sustainability targets, and lets them learn, compare, and choose in one place instead of getting lost in technical jargon. The dashboard includes thorough descriptions of metrics, guided workflows, recommendations, and accounting of both short-lived and long-lived climate pollutants, enabling users to assess their implications for climate-relevant outcomes.

Taking a similar approach, the FaIR Climate Explorer offers an accessible interface to the FaIR2.2 simple climate model, allowing users to simulate global temperature responses under different Shared Socioeconomic Pathway (SSP) scenarios. By abstracting model complexity behind an intuitive dashboard, the tool enables users with no prior familiarity with FaIR to explore scenario-driven temperature outcomes. Together, these dashboards demonstrate how interactive, user-centric tools can lower barriers to climate analysis while supporting both metrics-based evaluation and scenario-driven temperature exploration. They highlight the potential of dashboard-based approaches to enhance transparency, usability, and decision relevance in climate science and policy contexts.

How to cite: Patel, V. K., Cain, M., and Harris, N.: User-Centric Climate Dashboards for Metrics Evaluation and Temperature Scenario Exploration, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-19566, https://doi.org/10.5194/egusphere-egu26-19566, 2026.

EGU26-19638 | ECS | PICO | ESSI2.8

G-LEFMTOMO: a Graphical User Interface for performing Local Earthquake Tomography using the FMTOMO code 

Donato Talone, Luca De Siena, and Nicholas Rawlinson

Recent improvements in seismic data acquisition (such as enhanced network coverage, near-real-time analysis, and machine learning data processing) have significantly increased the availability of data. However, due to a lack of time and/or analysts, these data are often only partially processed and not utilized to their full potential. The rapid development of new tools for analyzing earthquake records can help, but may also decrease the stability achieved through the widespread use of tried-and-tested software. Additionally, robust codes developed in the literature often rely on efficient but rigid programming languages, such as Fortran, which may not accommodate new and variable data formats. In this context, it becomes crucial to revitalize and enhance existing software by making it more accessible and user-friendly for a broader community across various applications.

One possible solution to this issue is the development of Graphical User Interfaces (GUIs) for terminal-only software. Here, we developed a Python GUI designed to simplify tomography applications using local earthquakes, based on FMTOMO (Rawlinson and Urvoy, 2006), an iterative non-linear Fast-Marching seismic tomography code. Despite its well-documented usage, FMTOMO suffers from the requirement for strictly formatted input files, which are not compatible with the various storage formats commonly used for seismological data. We leverage the ObsPy toolbox (Beyreuther et al., 2010) to enable reading and/or downloading seismic data in multiple formats and converting it to the FMTOMO format.
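
The ObsPy-based ingestion step looks, in essence, like the following (identifiers, file paths, and the time window are placeholders):

    from obspy import UTCDateTime, read
    from obspy.clients.fdsn import Client

    # Either read local files in any format ObsPy supports ...
    st = read("data/event_001/*.sac")

    # ... or download waveforms from an FDSN data centre.
    client = Client("INGV")
    st += client.get_waveforms(network="IV", station="CAMP", location="*",
                               channel="HH?",
                               starttime=UTCDateTime("2024-05-01T00:00:00"),
                               endtime=UTCDateTime("2024-05-01T00:10:00"))
    st.write("event_001.mseed", format="MSEED")  # then converted to FMTOMO input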

The graphical interface, called G-LEFMTOMO, also facilitates the setup of both the forward and inverse problems by automating repetitive steps that were previously manual, including the creation of trade-off curves for tuning damping and smoothing parameters. Additionally, we implemented a feature for pre-analysis of the source–receiver distribution by generating seismic ray hit-maps before the full tomography run. We also aim to simplify the output format and visualization to facilitate easy sharing of results.

G-LEFMTOMO enables users to manage the entire workflow, from data input to the visualization of tomography models, all within a single interface. For more complex configurations or specific requirements, users can still run the original FMTOMO code through the terminal, allowing the GUI to be utilized for only part of the project if desired.

Graphical user interfaces give scientists access to a wider range of software for data analysis, overcoming the limitations of complex and inflexible tools. This development not only expands the resources available to researchers but also enhances the value of raw data, helping to prevent its under-utilization.

References

Rawlinson, N. and Urvoy, M.: Simultaneous inversion of active and passive source datasets for 3-D seismic structure with application to Tasmania, Geophys. Res. Lett., 33, L24313, https://doi.org/10.1029/2006GL028105, 2006.
Beyreuther, M., Barsch, R., Krischer, L., Megies, T., Behr, Y., and Wassermann, J.: ObsPy: A Python Toolbox for Seismology, Seismological Research Letters, 81, 530–533, https://doi.org/10.1785/gssrl.81.3.530, 2010.

How to cite: Talone, D., De Siena, L., and Rawlinson, N.: G-LEFMTOMO: a Graphical User Interface for performing Local Earthquake Tomography using the FMTOMO code, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-19638, https://doi.org/10.5194/egusphere-egu26-19638, 2026.

EGU26-21357 | PICO | ESSI2.8

RainMerge: Open-Source Software for Unified Merging of Satellite and Gauge Precipitation Data 

Ashish Sharma, Suraj Shah, Yi Liu, and Seokhyeon Kim

Rain gauges provide accurate point-scale precipitation measurements but are often sparsely distributed, particularly in data-scarce and complex-terrain regions. In contrast, satellite and reanalysis precipitation products offer continuous spatial coverage, yet they are affected by retrieval uncertainty and systematic bias. Reliable precipitation estimation therefore requires the integration of gauge observations with gridded satellite products. Existing merging approaches, however, are frequently limited in their ability to directly reconcile point-based gauge measurements with gridded satellite fields and to flexibly incorporate multiple datasets within a single, coherent workflow. We present RainMerge, an open-source, web-based framework that integrates gauge observations with multiple satellite precipitation products using pixel-level, uncertainty-aware merging. The platform automates data acquisition, preprocessing, uncertainty characterization, and merging within a unified computational environment. Through an intuitive graphical interface, RainMerge abstracts technical and geospatial complexity, enabling users without programming expertise to generate research-grade precipitation estimates. By bridging gauge-dependent and gauge-independent merging strategies while improving accessibility through user-oriented software design, RainMerge supports reproducible precipitation data fusion and broadens the practical use of multi-source precipitation merging in hydrological applications.
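
The pixel-level, uncertainty-aware merging can be illustrated by inverse-error-variance weighting (a generic sketch of the principle; RainMerge's actual scheme may differ in detail):

    import numpy as np

    def merge(sat, sat_var, gauge, gauge_var):
        """Combine satellite and gauge-derived precipitation fields pixel by
        pixel, weighting each source by the inverse of its error variance.
        gauge/gauge_var are gauge values mapped to pixels (NaN where absent)."""
        w_s = 1.0 / sat_var
        w_g = np.where(np.isnan(gauge), 0.0, 1.0 / gauge_var)
        return (w_s * sat + w_g * np.nan_to_num(gauge)) / (w_s + w_g)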

How to cite: Sharma, A., Shah, S., Liu, Y., and Kim, S.: RainMerge: Open-Source Software for Unified Merging of Satellite and Gauge Precipitation Data, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-21357, https://doi.org/10.5194/egusphere-egu26-21357, 2026.

EGU26-21905 | PICO | ESSI2.8

Earth2Studio: An Open Inference Toolkit for AI Weather Forecasting 

Niall Robinson and the NVIDIA Earth 2

Earth2Studio is an open-source Python toolkit that turns state-of-the-art AI weather and climate models into composable, reproducible workflows that researchers and operators can run and adapt on their own infrastructure.  It targets a key bottleneck in AI-for-weather: the difficulty of moving from standalone model checkpoints to fully integrated forecasting systems that span data, models, uncertainty, and verification.

Earth2Studio provides a unified API for prognostic and diagnostic AI models, heterogeneous data sources, perturbation methods, metrics, and I/O backends, enabling users to assemble end-to-end inference pipelines with only a few lines of code. The model zoo includes leading global and regional AI forecast models such as Altas, StormScope, GraphCast, Pangu, Aurora, FourCastNet 3, CorrDiff and more. Standardized data interfaces expose operational initial conditions and reanalyses (e.g. GFS, HRRR, ERA5, IFS) through a shared Xarray-based vocabulary and coordinate system.
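
Following the toolkit's quick-start pattern, an end-to-end deterministic forecast is assembled roughly as follows (names reflect the public documentation and may vary between versions):

    import earth2studio.run as run
    from earth2studio.data import GFS
    from earth2studio.io import ZarrBackend
    from earth2studio.models.px import DLWP

    model = DLWP.load_model(DLWP.load_default_package())  # prognostic AI model
    data = GFS()                                          # initial conditions
    io = ZarrBackend()                                    # output store

    # eight autoregressive steps from the given initialisation time
    io = run.deterministic(["2024-01-01"], 8, model, data, io)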

Building on the broader Earth-2 initiative, Earth2Studio is designed to cover the entire weather forecasting value chain, including AI data assimilation for initial conditions, global medium-range prediction, generative downscaling, and kilometer-scale severe weather nowcasting. Ensemble-ready perturbation schemes and built-in statistics (RMSE, ACC, CRPS, rank histograms, spread–skill diagnostics) allow consistent quantification of forecast skill and uncertainty across models, lead times, and regions, supporting methodologically robust intercomparison studies.

Released as open-source software, Earth2Studio emphasizes openness and sovereignty: all core components are optimised to run on local or cloud NVIDIA platforms, enabling national meteorological services, research institutions, and industry users to integrate proprietary data and maintain ownership of their operational chains.

Presented here are the design principles of Earth2Studio, illustrative exemplar workflows, and a discussion of how this shared software infrastructure can help the EGU community accelerate AI weather research and bridge the gap between experimental models and operationally relevant forecasting systems.

How to cite: Robinson, N. and the NVIDIA Earth 2: Earth2Studio: An Open Inference Toolkit for AI Weather Forecasting, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-21905, https://doi.org/10.5194/egusphere-egu26-21905, 2026.

EGU26-22316 | ECS | PICO | ESSI2.8

Environmental Decision-Support Tool Evaluation: What Impacts Can Be Measured and How? 

Christina Carrozzo Hellevik, Dina Margrethe Aspen, Christian Klöckner, Erica Margareta Löfström, Ramzi Hassan, and Ricardo da Silva Torres

The complex environmental challenges we face require sound decision-making. As the update on the Planetary Boundaries Framework shows, we are now beyond a safe operating space for humanity. Environmental decision-support tools have the potential to guide decision-makers in addressing such complex challenges and ensuring safety for humans and ecosystems. However, many studies have highlighted a ‘use gap’ and recommend better tool evaluation practices, as these differ greatly across and within disciplines. In this systematic literature review, we investigate how environmental decision-support tools are currently evaluated by considering three types of parameters: tool-user interaction, user impacts, and tool effectiveness. We also systematize the data collection methods used to measure these parameters. Based on the results, we map the tool-aided decision space and recommend adapted evaluation approaches based on the goals and focus of each study. We further propose a comprehensive framework to guide the choice of decision-support tool evaluation scope and methods.

How to cite: Hellevik, C. C., Aspen, D. M., Klöckner, C., Löfström, E. M., Hassan, R., and da Silva Torres, R.: Environmental Decision-Support Tool Evaluation: What Impacts Can Be Measured and How?, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-22316, https://doi.org/10.5194/egusphere-egu26-22316, 2026.

EGU26-22713 | ECS | PICO | ESSI2.8

Revisiting the Execution Model of Ensemble Uncertainty Analysis in the Browser 

Christopher Ahn, Juan Ruiz, Jorge Gacitua, Alexandra Diehl, and Renato Pajarola

Ensemble prediction systems are central to modern numerical weather forecasting, providing distributions of plausible atmospheric outcomes rather than single deterministic trajectories. While these ensembles are essential for assessing uncertainty, interactive exploration of ensemble structure, extremes, and spatio-temporal variability remains challenging in practice. Existing workflows rely predominantly on server-centric pipelines—typically Python/Xarray/Dask stacks or VTK-based backends—where computation and rendering occur remotely and the browser functions primarily as a thin client. These architectures introduce latency, require substantial data staging, and often collapse ensembles into low-order summaries that obscure multimodality and extremes.

We present NextSembles, a browser-native system for interactive ensemble uncertainty analysis that relocates data access, statistical computation, and visualization entirely to the client. NextSembles compiles the NetCDF-C library to WebAssembly, enabling standards-compliant NetCDF ingestion directly in the browser. Ensemble variables are decoded into contiguous slabs within WebAssembly linear memory and exposed as typed array views. Statistical reductions—including mean, variance, standard deviation, and probability-of-exceedance—are computed using C/WASM kernels operating directly on this memory, avoiding server round-trips and intermediate data representations.
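
Numerically, the reductions performed by these kernels are the standard ensemble statistics, equivalent to the following NumPy operations on a (member, y, x) array (synthetic data for illustration):

    import numpy as np

    ens = np.random.default_rng(1).normal(size=(21, 256, 256)).astype(np.float32)

    mean = ens.mean(axis=0)              # ensemble mean
    spread = ens.std(axis=0, ddof=1)     # ensemble standard deviation
    p_exceed = (ens > 1.5).mean(axis=0)  # probability of exceeding a threshold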

To maintain responsiveness on large ensemble fields, NextSembles employs a tile-based execution model that subdivides spatial slices into latency-bounded units of work. Tile updates are propagated incrementally to the renderer, enabling progressive visual feedback while preserving full-resolution views. Visualization is performed using VTK-WASM (with WebGPU when available and WebGL fallback), supporting interactive exploration of spatial slices alongside coordinated temporal, distributional, and member-comparison views. A multitrack uncertainty timeline facilitates rapid identification of forecast periods exhibiting elevated ensemble spread.

We evaluate NextSembles on COSMO-1e/2e ensemble datasets, measuring kernel-level performance, end-to-end interaction latency, and data staging costs. Results show that browser-resident C/WASM reducers sustain sub-200 ms interaction latency for common analysis tasks on commodity hardware, enabling responsive, distribution-aware ensemble exploration without reliance on HPC backends or Python services.

NextSembles demonstrates that revisiting the execution model of ensemble uncertainty analysis enables transparent, low-latency workflows directly in the browser, complementing existing server-centric approaches.

How to cite: Ahn, C., Ruiz, J., Gacitua, J., Diehl, A., and Pajarola, R.: Revisiting the Execution Model of Ensemble Uncertainty Analysis in the Browser, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-22713, https://doi.org/10.5194/egusphere-egu26-22713, 2026.

EGU26-1692 | ECS | PICO | ESSI2.10

A Geospatial Database for Monitoring Arctic Coastline Dynamics:  Mapping Shoreline Change in Iceland and Svalbard 

Sosanna Kapsokefalou, Panagiota Gartagani, Eleftherios Plessas, Christos Nakos, Iakovos Kafouris, Nektarios G. Tselos, Spyridon E. Detsikas, Antonis Litke, and George P. Petropoulos

Arctic coastlines are highly sensitive to climate change, resulting in significant environmental and socioeconomic impacts. Monitoring changes in these regions is essential for understanding long-term dynamics, identifying the most vulnerable coastlines, and providing critical information to local and international stakeholders for effective climate adaptation strategies. Geoinformation technologies, particularly Earth Observation (EO) and Geographic Information Systems (GIS), have emerged as transformative solutions for tracking coastal changes across remote and data-scarce Arctic areas.

To this end, this study exploits EO and GIS to develop a geospatial database that examines how coastlines in the Svalbard Archipelago and Iceland have evolved over the last four decades (1985–2025). Anniversary dates of Landsat satellite imagery were used, and all necessary preprocessing and imagery acquisition were performed in the Google Earth Engine cloud platform. Coastline digitization was carried out in ArcGIS Pro using direct photointerpretation. In Svalbard, shoreline changes are mainly linked to ice-related processes, such as variations in ice cover. In Iceland, coastline variability reflects a combination of geomorphological processes and localized anthropogenic activity, particularly in areas affected by port development.
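
A representative Google Earth Engine query for this kind of preprocessing is shown below (the area of interest and filter values are illustrative):

    import ee

    ee.Initialize()
    aoi = ee.Geometry.Point([15.6, 78.2]).buffer(10000)  # hypothetical Svalbard site

    composite = (ee.ImageCollection("LANDSAT/LC08/C02/T1_L2")
                 .filterBounds(aoi)
                 .filterDate("2020-06-01", "2020-09-30")
                 .filter(ee.Filter.lt("CLOUD_COVER", 20))
                 .median()
                 .clip(aoi))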

The database developed herein supports the continued monitoring of Arctic coastal environments and offers a foundation for further investigation of coastal change driven by both natural and human-influenced factors. The produced geospatial datasets provide a baseline for future analyses and contribute to ongoing efforts to develop open-access geospatial datasets, such as the EO-PERSIST platform. Such endeavors provide fertile ground for researchers and policymakers to better understand coastal dynamics and support evidence-based decision-making for Arctic coastal management.

Keywords: Arctic coastlines; Shoreline change; Earth Observation; Landsat; Geoinformatics; EO-PERSIST

Acknowledgement

The present research study is supported by the project “EO-PERSIST”, funded by the European Union’s Horizon Europe research and innovation program (HORIZON-MSCA-2021-SE-01-01, grant agreement no. 101086386).

How to cite: Kapsokefalou, S., Gartagani, P., Plessas, E., Nakos, C., Kafouris, I., Tselos, N. G., Detsikas, S. E., Litke, A., and Petropoulos, G. P.: A Geospatial Database for Monitoring Arctic Coastline Dynamics:  Mapping Shoreline Change in Iceland and Svalbard, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-1692, https://doi.org/10.5194/egusphere-egu26-1692, 2026.

EGU26-4239 | PICO | ESSI2.10

Landsat monitoring reveals the history of river organic pollution across China during 1984-2023 

Dong Liu, Nuoxiao Yan, Zhiqiang Qiu, Chenxue Zhang, and Yao Yan

River organic pollution exhibits pronounced spatiotemporal dynamics in response to environmental changes. However, the traditional method of tracking chemical oxygen demand (COD) and/or other organic pollution indicators at fixed locations over expansive regions is labor-intensive, time-consuming, and inadequate for achieving full spatial coverage. To address this limitation, here we developed a Random Forest algorithm using Landsat satellite data in conjunction with sub-daily (every 4 hours) COD data at 1,997 sites across China. The proposed model achieved high accuracy, with a root mean square error of 0.52 mg/L and a mean absolute percent difference of 13.01%. Additionally, the model was robust across clear, algae-laden, turbid, and black-smelling waters. Then, the algorithm was applied to investigate the spatiotemporal variations of COD concentration in Chinese rivers during 1984-2023. Across China, high river COD concentrations were observed in the eastern Songliao (3.56 ± 1.11 mg/L), Haihe (3.00 ± 0.89 mg/L), and Huaihe (3.57 ± 0.67 mg/L) basins. Anthropogenic activities could explain 79.39% of the spatial variability in COD concentrations, and the cropland distribution had a significant impact. During 1984-2023, 73.58% of China's rivers exhibited significant changes in COD concentrations (p < 0.05). With respect to the 800 mm isoprecipitation line, 56.62% of the southeastern rivers showed decreasing trends; in contrast, 84.25% of the northwestern rivers displayed increasing trends in COD concentrations. The temporal variations in COD concentrations were driven by the combined effects of factors including rainfall, vegetation coverage, and human activities; their relative contributions were 0.02 – 42.45%, 0.07 – 68.76%, and 0.06 – 90.31% for COD changes in different provinces. This study underscores the advantages of using satellite data to efficiently and dynamically monitor organic pollution in river systems, providing crucial technical and data support for such monitoring efforts on a large scale.
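
The retrieval step amounts to a supervised regression of matched reflectance–COD pairs; a minimal scikit-learn sketch (with hypothetical input arrays) is:

    import numpy as np
    from sklearn.ensemble import RandomForestRegressor
    from sklearn.model_selection import train_test_split

    X = np.load("reflectance.npy")  # hypothetical: per-match Landsat band values
    y = np.load("cod.npy")          # hypothetical: co-located in-situ COD (mg/L)

    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)
    rf = RandomForestRegressor(n_estimators=500, random_state=0).fit(X_tr, y_tr)
    rmse = float(np.sqrt(np.mean((rf.predict(X_te) - y_te) ** 2)))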

How to cite: Liu, D., Yan, N., Qiu, Z., Zhang, C., and Yan, Y.: Landsat monitoring reveals the history of river organic pollution across China during 1984-2023, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-4239, https://doi.org/10.5194/egusphere-egu26-4239, 2026.

EGU26-6483 | ECS | PICO | ESSI2.10

Deep Learning Based Glacial Lake Mapping in the Austrian Alps 

Jakko-Jan van Ek and Daniel Hölbling

Climate change causes significant glacial retreat in the Austrian Alps, which is linked to an increase in the number and size of glacial lakes. The emergence and growth of glacial lakes threaten alpine infrastructure and can cause glacial lake outburst floods (GLOFs). It is therefore important to monitor the spatio-temporal evolution of glacial lakes. Remote sensing provides possibilities for cost-effective glacial lake monitoring, and in recent years various deep learning-based models have been introduced as effective tools for computer vision tasks, including semantic segmentation.
In this study, a U-Net-based semantic segmentation model architecture has been implemented, trained, and assessed on a custom training dataset.
The dataset is based on an inventory of Austrian glacial lakes in 2015 and on Sentinel-2 imagery; it contains 386 image chips (270 for training and 116 for testing), each measuring 512 by 512 pixels and including at least one glacial lake from the inventory.
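
One plausible instantiation of such a model uses the segmentation_models_pytorch library (our choice for illustration; the study does not name a specific implementation):

    import segmentation_models_pytorch as smp
    import torch

    # Binary glacial-lake segmentation on 4-band Sentinel-2 chips (e.g. B2,B3,B4,B8).
    model = smp.Unet(encoder_name="resnet34", encoder_weights=None,
                     in_channels=4, classes=1)
    logits = model(torch.rand(1, 4, 512, 512))  # chip size matches the dataset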

The semantic segmentation model has been applied to a time series of Sentinel-2 imagery from 2015 to 2025 to create a time series of glacial lake maps. The final results will be used to assess whether the number of glacial lakes has increased in recent years and to identify spatio-temporal trends in glacial lake evolution. Potential impacts of the observed developments on (hiking) infrastructure will also be assessed.

The geospatial workflow is implemented using open-source tools and freely available datasets. 

How to cite: van Ek, J.-J. and Hölbling, D.: Deep Learning Based Glacial Lake Mapping in the Austrian Alps, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-6483, https://doi.org/10.5194/egusphere-egu26-6483, 2026.

EGU26-7381 | ECS | PICO | ESSI2.10

Reconstructing Annual Dominant Tree-Species Distributions (1986–2024) in the Qilian Mountains via Multi-sensor Sample Transfer and Random Forest 

Jianye Yu, Yunfei Li, Fen Zhang, Chongshan Wang, Zibo Wang, and Xiaohua Gou

Long-term tree-species information is fundamental for quantifying ecosystem services and forest climate resilience, yet multi-decadal mapping is often constrained by sparse field samples and heterogeneous satellite archives. Here we present a sensor-agnostic sample-transfer pipeline that reconstructs annual dominant tree-species distributions in the Qilian Mountains (north-eastern Tibetan Plateau) from a single-year field training dataset.

We harmonize optical missions (Landsat-5/7/8 and Sentinel-2) with multi-frequency SAR archives (ERS and Envisat C-band, Sentinel-1 C-band, and ALOS PALSAR L-band) and build a unified annual feature space combining spectral variables, SAR backscatter metrics, terrain predictors, and phenological descriptors. Phenology is derived from NDVI time series using Harmonic Analysis of Time Series (HANTS), yielding noise-robust seasonal metrics that remain comparable across sensors and years. To overcome the absence of historical labels, we transfer class labels from 6,268 field samples to each target year through a dual-constraint similarity screening: (i) feature-vector magnitude (Euclidean distance) and (ii) feature-vector direction (cosine similarity / spectral-angle-based measure). A thresholding rule discards ambiguous points and retains only reliable migrated samples. Annual maps are then generated using a Random Forest classifier (bagging and majority vote), while class imbalance is mitigated via downsampling and SMOTE.
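
The dual-constraint screening itself is compact; a schematic NumPy version (array shapes and thresholds hypothetical) is:

    import numpy as np

    def transfer_labels(ref_feats, ref_labels, tgt_feats, d_max, cos_min):
        """A target-year sample inherits the label of its nearest reference
        sample only if both the Euclidean distance and the cosine similarity
        pass the user-set thresholds; otherwise it is discarded (-1)."""
        labels = np.full(len(tgt_feats), -1)
        for i, f in enumerate(tgt_feats):
            d = np.linalg.norm(ref_feats - f, axis=1)
            j = int(np.argmin(d))
            cos = ref_feats[j] @ f / (np.linalg.norm(ref_feats[j]) * np.linalg.norm(f))
            if d[j] <= d_max and cos >= cos_min:
                labels[i] = ref_labels[j]
        return labels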

Across nine sensor-integration periods spanning 1986–2024, the sample-transfer component remains stable despite changing sensors (transfer accuracy: 86.0–94.4%), and the resulting tree-species classification maintains consistently high accuracy (95.7–98.8%). Year-by-year assessments indicate overall accuracy and Kappa typically above 0.95 for most years and classes; performance reductions are mainly confined to rare taxa with limited observations. The final products provide consistent annual maps for six dominant tree genera (Betula, Juniperus, Picea, Populus, Pinus, and Larix) together with shrub–grass vegetation, cropland, water, and bare land, enabling robust quantification of multi-decadal changes in area fractions, spatial patterns, and centroid migration.

By coupling multi-sensor feature harmonization, HANTS-based phenology, and a dual-constraint sample-transfer strategy, this workflow offers a practical and generalizable route to recover multi-decadal tree-species dynamics from limited field data in mountain ecological barrier regions.

How to cite: Yu, J., Li, Y., Zhang, F., Wang, C., Wang, Z., and Gou, X.: Reconstructing Annual Dominant Tree-Species Distributions (1986–2024) in the Qilian Mountains via Multi-sensor Sample Transfer and Random Forest, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-7381, https://doi.org/10.5194/egusphere-egu26-7381, 2026.

EGU26-7579 | ECS | PICO | ESSI2.10

GEDI-constrained forest mapping and Potential–Realized climate limits of canopy height, biomass and structural complexity in the upper Yellow River 

Chongshan Wang, Jianye Yu, Yunfei Li, Fen Zhang, Zibo Wang, and Xiaohua Gou

The upper Yellow River is a nationally important water-conservation region where forest three-dimensional structure supports carbon storage and hydrologic buffering. In this topographically complex forest–shrub–grass mosaic, watershed-scale constraints of forest structure by climate remain poorly quantified, and uncertainty in forest extent can propagate into subsequent structural mapping and attribution. We produced a GEDI (Global Ecosystem Dynamics Investigation) height-constrained hierarchical forest mask and generated 10–30 m forest cover with probability and quality-assurance layers for the upper Yellow River.

Using the Taohe River Basin as a representative catchment, UAV LiDAR benchmarks were upscaled with Sentinel-1/2 time-series metrics, topographic predictors and GEDI information to derive 30 m wall-to-wall canopy height, aboveground biomass (AGB) and canopy entropy, the latter representing vertical structural complexity. Independent evaluation indicates reliable performance (R² ≈ 0.71–0.84; canopy entropy R² = 0.836).

Forest structure–climate relationships were examined with a Potential–Realized framework. Climate-constrained structural potential was estimated using conditional upper-quantile models, and climatic limitation was quantified as the departure of realised structure from its potential. Hydrothermal thresholds are well defined: canopy height potential peaks at 2–4 °C mean annual temperature (≈ 25 m) and approaches saturation beyond 540–560 mm annual precipitation; the joint optimum for canopy height and AGB occurs under ~450–550 mm precipitation and 5–8 °C mean temperature; canopy entropy maximises near 2–4 °C, 450–550 mm, aridity index ≈ 1.2, and potential evapotranspiration ≈ 600 mm. Using downscaled CMIP6 projections and derived vapour pressure deficit (VPD), we quantify potential contraction and overshoot risk (current structure/future potential) and provide analysis-ready layers for climate-risk screening in water-source regions.
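
The Potential–Realized logic can be emulated with any conditional quantile learner; for instance (a schematic with gradient boosting, hypothetical arrays, and the 90th percentile taken as the climate-constrained potential):

    from sklearn.ensemble import GradientBoostingRegressor

    # X_climate: per-pixel climate predictors; height: realised canopy height.
    q90 = GradientBoostingRegressor(loss="quantile", alpha=0.9)
    q90.fit(X_climate, height)

    potential = q90.predict(X_climate)     # climate-constrained structural potential
    limitation = 1.0 - height / potential  # departure of realised from potential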

How to cite: Wang, C., Yu, J., Li, Y., Zhang, F., Wang, Z., and Gou, X.: GEDI-constrained forest mapping and Potential–Realized climate limits of canopy height, biomass and structural complexity in the upper Yellow River, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-7579, https://doi.org/10.5194/egusphere-egu26-7579, 2026.

EGU26-9658 | ECS | PICO | ESSI2.10 | Highlight

AI Foundation Models for Near Real-Time Environmental Monitoring from Satellite Data  

Bartłomiej Ostrowski, Jędrzej S. Bojanowski, Marcin Kluczek, and Mikolaj Czerkawski

Embeddings provide a compact representation of data in a lower-dimensional vector space, enabling faster and more efficient analysis compared to direct processing of high-dimensional data. Satellite imagery is an example of such data, as it is characterized by large volume and high dimensionality. With the rapid development of AI foundation models, embedding-based approaches can increasingly replace classical remote sensing techniques in tasks such as classification and regression, while maintaining or even improving the quality of results.

This work leverages the Global Embeddings Dataset from the Copernicus Data Space Ecosystem, which contains embeddings generated by multiple models, including SSL4EO DINOv2, SigLIP, DeCUR, and MMEarth. These models differ in sensing modality, input resolution, and embedding dimensionality, enabling diverse analyses based on heterogeneous data sources. Data standardization using the MajorTOM format facilitates automated processing and seamless integration of embeddings derived from different models. 

Following the MajorTOM standard, more than 8 million images, comprising 9.368 trillion pixels of raw data, were processed to generate over 170 million embeddings from approximately 62 terabytes of satellite data. This scale demonstrates the feasibility of embedding-based approaches for efficient management and analysis of large-scale Earth observation datasets. 

Embedding-based representations enable effective detection of environmental changes, which can be categorized as either abrupt events, such as wildfires, deforestation, or floods, or long-term processes, including river desiccation and gradual ecosystem degradation. Such change detection capabilities are applicable across multiple domains, including urban development, defense, and environmental monitoring. By operating on compressed representations, embeddings allow for efficient similarity and change analysis over temporal sequences, significantly accelerating the processing of large satellite data archives. 

 

How to cite: Ostrowski, B., Bojanowski, J. S., Kluczek, M., and Czerkawski, M.: AI Foundation Models for Near Real-Time Environmental Monitoring from Satellite Data , EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-9658, https://doi.org/10.5194/egusphere-egu26-9658, 2026.

Semi-enclosed seas are particularly sensitive to regional climate forcing because their exchange with the open ocean is limited. Despite this sensitivity, the magnitude of their warming relative to global climate model projections remains insufficiently constrained. This study examines both historical and future sea surface temperature (SST) changes in the Mediterranean, Aegean, Marmara, and Black Seas surrounding Türkiye using a hybrid framework that integrates in-situ observations, satellite reanalysis, and machine learning techniques.

More than three decades (1993–2023) of monthly coastal SST records from 21 stations are analysed together with Copernicus Marine Environment Monitoring Service (CMEMS) reanalysis data. Linear trend analysis reveals statistically significant warming across all basins, with observed SST increases reaching approximately 2.0 °C since 1993. When placed in a global context, these regional warming rates are approximately 1.5–2 times higher than the CMIP6 ensemble mean global ocean warming over a comparable period.

Future SST evolution is explored using a Long Short-Term Memory (LSTM) model trained on bias-corrected SST time series and supplemented with large-scale climate indices, namely ENSO and NAO. Comparison with observations shows that the model reproduces SST variability with a high level of agreement (R² > 0.9; RMSE ≈ 0.4–0.6 °C), while the projected trajectories remain physically plausible under both SSP2-4.5 and SSP5-8.5 scenarios. The projections point to a persistent warming signal throughout the 21st century, with the strongest increases concentrated in the Mediterranean and Aegean Seas.
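
A minimal PyTorch sketch of this kind of LSTM setup is given below, with monthly SST plus ENSO and NAO indices as input features; the architecture, window length, and layer sizes are assumptions, not the authors' exact model.

```python
# Minimal sketch of an LSTM for monthly SST prediction with climate indices
# as extra input channels (layer sizes and window length are assumptions).
import torch
import torch.nn as nn

class SSTLSTM(nn.Module):
    def __init__(self, n_features=3, hidden=64):
        super().__init__()
        self.lstm = nn.LSTM(n_features, hidden, batch_first=True)
        self.head = nn.Linear(hidden, 1)

    def forward(self, x):                 # x: (batch, months, features)
        out, _ = self.lstm(x)
        return self.head(out[:, -1])      # predict SST one step ahead

model = SSTLSTM()
x = torch.randn(8, 24, 3)                 # 8 samples, 24-month windows, [SST, ENSO, NAO]
y_hat = model(x)
print(y_hat.shape)                         # torch.Size([8, 1])
```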

Taken together, these results suggest that SST warming in Türkiye’s semi-enclosed seas is higher than coarse-resolution CMIP6 ensemble mean global ocean warming estimates. This finding emphasises the importance of regionally resolved observations and data-driven analyses for coastal climate assessment. The hybrid framework applied here offers a scalable approach for monitoring semi-enclosed marine systems and for informing climate-related decision-making at regional scales.

How to cite: Erkoç, M. H.: Are Türkiye’s Semi-Enclosed Seas Warming 1.5–2 Times Faster Than CMIP6 Global Projections?, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-14132, https://doi.org/10.5194/egusphere-egu26-14132, 2026.

EGU26-18379 | PICO | ESSI2.10

Change detection of tree cover and burned areas at apiary sites in the Kwahu Afram Plains, Ghana 

Byongjun Hwang, Chris Keywood, and Janet Lowore

Bees for Development (BfD) and its local partners have supported beekeeping initiatives in the Kwahu Afram Plains, Ghana, since 2019, with the objective of promoting forest conservation and empowering beekeepers to reduce drivers of forest loss while enhancing forest recovery. Evaluating the conservation results of such community-based interventions requires independent, spatially explicit evidence of environmental outcomes. Key indicators of success include the absence of tree loss and burned areas within a 250 m radius of established apiary sites.

While satellite-based change detection of tree cover and burned areas is well established at regional and global scales, fine-scale monitoring at localized, spatially distributed sites remains relatively understudied and methodologically challenging. Such analyses require careful calibration and validation to detect subtle changes at small spatial extents. In this study, we assess the performance of multiple change detection algorithms for monitoring tree cover loss and fire disturbance within a 500 m diameter area surrounding apiary locations, with a focus on fine-scale detection.

We integrate multi-sensor satellite data from Landsat, Sentinel-1, and Sentinel-2, and apply statistical time-series approaches including AVOCADO and BEAST. Algorithm performance is evaluated using high-resolution reference data from PNEO, WorldView, and Google Earth imagery, complemented by field-based ground observations. The study compares different change detection methods for localized, fine-scale conservation impact assessment and provides insight into practical, independent monitoring frameworks for community-led conservation initiatives.
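
The study's algorithms are AVOCADO and BEAST; as a generic stand-in, the sketch below detects a break in a synthetic per-site NDVI series with the PELT changepoint detector from the ruptures package (the series and penalty value are assumptions).

```python
# Generic changepoint sketch for a per-site NDVI time series using PELT from
# the `ruptures` package; NOT the AVOCADO/BEAST algorithms used in the study.
import numpy as np
import ruptures as rpt

rng = np.random.default_rng(1)
ndvi = np.concatenate([
    rng.normal(0.7, 0.05, 60),   # intact tree cover
    rng.normal(0.35, 0.05, 30),  # post-disturbance (e.g., fire or clearing)
])

algo = rpt.Pelt(model="rbf").fit(ndvi)
breaks = algo.predict(pen=10)    # indices where the detected segments end
print("detected break points:", breaks[:-1])
```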

How to cite: Hwang, B., Keywood, C., and Lowore, J.: Change detection of tree cover and burned areas at apiary sites in the Kwahu Afram Plains, Ghana, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-18379, https://doi.org/10.5194/egusphere-egu26-18379, 2026.

EGU26-19609 | PICO | ESSI2.10

Identifying Earth System Features from Satellite Data  

Anna-Lena Erdmann, Roope Tervo, Gerrit Holl, Armagan Karatosun, Roger Huckle, Joerg Schulz, Alexander Halbig, Frank Hogervorst, and Luca Brugaletta

Reliable feature identification, long time series of identified features, and tools to explore them provide substantial benefits for weather forecasting, process understanding, climate information provision, and the evaluation of climate model outputs. Moreover, expert use of these features within an established feedback loop enables the creation of high-quality training datasets for further application development and for training machine learning (ML) models.  

EUMETSAT and its Member States are building a collaborative environment within the European Weather Cloud (EWC) for joint manual or ML-assisted annotation, model development, and a database of identified features to support these developments. The EWC is a cloud-based collaboration platform for meteorological application development and operations in Europe, intended to enable the digital transformation of the European Meteorological Infrastructure.

EUMETSAT is compiling a database of long time series of meteorological features identified from various satellite datasets. This database will support the analysis of the development and interrelationships of these features, enabling new insights for both nowcasting and climate science. For example, the Deutscher Wetterdienst plans to use this environment to characterise convective storms using FCI-derived cloud-top features—such as overshooting tops—for nowcasting, with the primary aim of training an ML algorithm to automatically identify storm tops. 

Initial work has begun on identifying tropical storms from long Himawari time series, alongside a feasibility study on additional feature types. Early results from the feasibility study, demonstrating the potential for performing feature identification on long time series of Earth-observation data, will be presented. 

The joint working environment is available in the EWC, which is open to authorised users from ECMWF and EUMETSAT Member and Co-operating States for official duties and R&D projects. It consists of data-proximate cloud infrastructure, alongside the EWC Community Hub, which enables collaborative development, code and ML model sharing, and the exploitation of meteorological applications. 

How to cite: Erdmann, A.-L., Tervo, R., Holl, G., Karatosun, A., Huckle, R., Schulz, J., Halbig, A., Hogervorst, F., and Brugaletta, L.: Identifying Earth System Features from Satellite Data , EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-19609, https://doi.org/10.5194/egusphere-egu26-19609, 2026.

EGU26-884 | ECS | PICO | HS6.10 | Highlight

Advancing near-real-time water-quality monitoring in Brazil through remote sensing: the MAPAQUALI and MAPAQUALI-IA platforms 

Daniel Maciel, Claudio Barbosa, Evlyn Novo, Rogério Flores Júnior, Aurea Ciotti, Felipe Lobo, Fernando Lopes, Gilberto Ribeiro, Maurício Noernberg, Rogério Marinho, and Vitor Martins

Monitoring water quality in inland and coastal waters is essential for understanding biogeochemical cycles and the impacts of anthropogenic pressures such as land use and land cover change, mining, deforestation, dam construction, and climate change. Traditional field surveys, conducted bimonthly or quarterly at limited sampling stations, are valuable but do not offer the spatial and temporal coverage necessary to fully support public policies for sustainable aquatic system management. In this context, remote sensing plays a key role in enabling large-scale water quality monitoring across extensive and remote regions, such as the Brazilian Amazon. Despite recent advances, there remains a lack of accessible platforms that deliver validated remote sensing products and algorithms to researchers, stakeholders, and decision-makers. To address this gap, the Instrumentation Laboratory for Aquatic Ecosystems (LabISA) at the Brazilian National Institute for Space Research (INPE) is developing MAPAQUALI, a semi-automatic cloud-based platform designed to generate and distribute water quality products at high spatial and temporal resolution for aquatic ecosystems in Brazil. MAPAQUALI integrates a set of semi-analytical and machine-learning algorithms developed and validated by INPE’s research team. These algorithms retrieve key water quality parameters, including chlorophyll-a, phycocyanin, Secchi disk depth, and total suspended solids, using observations from ESA and NASA multispectral sensors (Sentinel-2 MSI, Sentinel-3 OLCI, and Landsat-8/9 OLI) with a focus on specific reservoirs and lakes in Brazil. In addition to MAPAQUALI, a new project named MAPAQUALI-IA is pursuing large-scale mapping of water quality in Brazil using artificial intelligence (i.e., machine learning and deep learning methods) to provide these water quality parameters through a single large-scale algorithm. The project will develop algorithms with the help of newly released open datasets, such as BRAZA and GLORIA. The MAPAQUALI/MAPAQUALI-IA processing pipeline incorporates advanced aquatic atmospheric correction techniques, specifically ACOLITE and 6SV, as well as corrections for glint and adjacency effects. A STAC-compliant data cube environment (Brazil Data Cube platform) allows data to be generated and stored for rapid access, visualization, and analysis. This contribution introduces the current MAPAQUALI/MAPAQUALI-IA prototype, a modular and continuous monitoring system implemented for representative Brazilian aquatic environments, including Amazonian lakes, a eutrophic cascade reservoir system, and coastal waters. Future developments will expand sensor compatibility, include new water-quality algorithms, and extend coverage to additional inland and coastal environments. Ultimately, MAPAQUALI aims to bridge the gap between scientific data and operational application, supporting more informed decision-making to improve aquatic ecosystem conservation and management in Brazil.
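
As an illustration of how a STAC-compliant cube can be queried programmatically, the hypothetical sketch below uses pystac_client; the endpoint URL, collection name, and bounding box are assumptions for illustration, not documented MAPAQUALI interfaces.

```python
# Hypothetical query against a STAC-compliant data cube with pystac_client;
# the endpoint, collection, and bbox are illustrative assumptions.
from pystac_client import Client

catalog = Client.open("https://data.inpe.br/bdc/stac/v1/")   # assumed endpoint
search = catalog.search(
    collections=["S2-16D-2"],                                # assumed Sentinel-2 cube collection
    bbox=[-60.5, -3.5, -59.5, -2.5],                         # an Amazonian lake region (illustrative)
    datetime="2023-01-01/2023-12-31",
)
for item in search.items():
    print(item.id, item.properties.get("eo:cloud_cover"))
```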

How to cite: Maciel, D., Barbosa, C., Novo, E., Flores Júnior, R., Ciotti, A., Lobo, F., Lopes, F., Ribeiro, G., Noernberg, M., Marinho, R., and Martins, V.: Advancing near-real-time water-quality monitoring in Brazil through remote sensing: the MAPAQUALI and MAPAQUALI-IA platforms, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-884, https://doi.org/10.5194/egusphere-egu26-884, 2026.

EGU26-1327 | ECS | PICO | HS6.10

A Multi-Sensor Remote Sensing Framework to Track Agricultural Drought in the Souss-Massa Basin, Morocco 

Soumia Gouahi, Mohammed Hssaisoune, El Houssaine Bouras, Yassine Ait Brahim, and Lhoussaine Bouchaou

Agricultural drought is a growing concern in Morocco, especially in the Souss-Massa basin, where the economy is heavily reliant on increasingly limited water resources. As climate variability intensifies and groundwater levels decline, traditional drought monitoring tools (mainly based on rainfall alone) no longer provide a comprehensive representation of the evolution of water stress in crops and soils. We therefore developed a remote-sensing-based framework to help disentangle these intricate interactions.

We combine four satellite indicators to reflect different aspects of drought stress: vegetation greenness (VCI), land surface temperature (TCI), soil moisture availability (SMCI), and photosynthetic activity (GPP anomaly). These datasets, derived from MODIS and ESA-CCI products, were processed into a consistent time series from 2000 to 2023. Utilising the seasonal Standardized Precipitation Evapotranspiration Index (SPEI-6) as a reference, we trained a Random Forest model to generate a Remote Sensing Drought Index (RSDI) specifically tailored to the wheat-growing season in the Souss-Massa basin.
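
A minimal sketch of this Random Forest step is shown below: a regressor trained to predict SPEI-6 from the four indicators, whose predictions then serve as the RSDI. The synthetic data and hyperparameters are assumptions, not the calibrated model.

```python
# Minimal sketch: Random Forest predicting SPEI-6 from VCI, TCI, SMCI, and
# GPP anomaly; predictions serve as the drought index (synthetic data).
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(42)
n = 2000
X = rng.uniform(0, 1, (n, 4))         # columns: VCI, TCI, SMCI, GPP anomaly
spei6 = X @ np.array([1.2, 0.8, 1.0, 0.6]) - 1.8 + rng.normal(0, 0.2, n)

X_tr, X_te, y_tr, y_te = train_test_split(X, spei6, test_size=0.3, random_state=0)
rf = RandomForestRegressor(n_estimators=300, random_state=0).fit(X_tr, y_tr)
rsdi = rf.predict(X_te)               # Remote Sensing Drought Index estimate
print(f"R^2 = {rf.score(X_te, y_te):.2f}")
```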

The developed index demonstrates robust performance across the region, effectively capturing both rapid shifts in meteorological conditions and the slower cumulative effects of water stress on vegetation.

The model exhibits strong predictive accuracy (R² ≈ 0.75) and remains stable even when applied to stations not utilized during the training process.

Importantly, the RSDI aligns closely with observed wheat yield anomalies (r ≈ 0.9), indicating its relevance for agricultural decision-making. The framework also reproduces major drought years, such as 2015–2016 and 2023–2024, revealing clear spatial contrasts linked to topography and irrigation patterns.

The combined use of multiple remote-sensing indicators provides a reliable measure of drought evolution and supports regional actors in planning and managing water and agricultural activities under growing climatic pressure.

How to cite: Gouahi, S., Hssaisoune, M., Bouras, E. H., Ait Brahim, Y., and Bouchaou, L.: A Multi-Sensor Remote Sensing Framework to Track Agricultural Drought in the Souss-Massa Basin, Morocco, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-1327, https://doi.org/10.5194/egusphere-egu26-1327, 2026.

A widespread decline in dissolved oxygen (DO) has been observed in rivers, temperate lakes and oceans, yet the impacts of climatic warming on global lake deoxygenation remain unclear. Here, we train data-driven models using climatic data, satellite images and geographic factors to reconstruct surface DO and quantify the climatic contribution to DO variations in 15,535 lakes from 2003 to 2023. Our analysis indicates a continuous deoxygenation in 83% of the studied lakes. The mean deoxygenation rate in global lakes (-0.049 mg/L/decade) is faster than that observed in the oceans (-0.022 mg/L/decade) and in rivers (-0.038 mg/L/decade). By decreasing solubility, climatic warming contributes 55% of global lake deoxygenation. Meanwhile, heatwaves exert rapid influences on DO decline, resulting in a 7.7% deoxygenation compared to that observed under climatological mean temperatures. By the end of the century, global lake DO is projected to decrease by 0.41 mg/L (4.3%) under SSP2–4.5 and 0.86 mg/L (8.8%) under SSP5–8.5 scenarios.

How to cite: Zhang, Y.: Climate Warming and Heatwaves Accelerate Global Lake Deoxygenation, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-2496, https://doi.org/10.5194/egusphere-egu26-2496, 2026.

The Landsat Collection 2 (C2) archive is vital for inland water monitoring, yet the Land Surface Reflectance Code (LaSRC) atmospheric correction for Landsat-8/9 introduces Dark Plume Over Water (DPOW) artifacts. These aerosol extrapolation errors cause severe negative biases in shortwave bands, disrupting long-term consistency. To address this, we developed a cloud-native Alternative Correction (AC) method on Google Earth Engine. This data-driven approach employs random forest regression, trained on spatiotemporally aggregated high-quality water pixels, to reconstruct reliable surface reflectance (SR) from Top-of-Atmosphere observations. Validation against a global in-situ hyperspectral dataset and benchmarking against the physics-based ACOLITE processor demonstrate the robustness of the proposed method. While ACOLITE effectively resolves the negative bias issue, the AC method achieves superior radiometric accuracy, reducing the ultra-blue Root Mean Square Error to 0.019 (compared to 0.029 for ACOLITE and 0.031 for C2 SR). Notably, under high-aerosol conditions, the AC method minimizes the residual spectral distortions often observed in physical inversions, effectively restoring the natural spectral shape. Spatially, the method eliminates DPOW artifacts; furthermore, it removes systematic biases between Landsat-8/9 and legacy sensors (Landsat-4/5/7). By restoring radiometric integrity, this automated solution secures the foundation for reliable long-term global limnology.
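
Conceptually, the regression step can be sketched as below: a random forest mapping top-of-atmosphere reflectance (plus simple geometry features) to surface reflectance, trained on trusted water pixels. All data and features here are synthetic placeholders, not the operational implementation.

```python
# Conceptual sketch of a TOA-to-SR random forest correction for water pixels;
# all inputs are synthetic placeholders.
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(7)
n = 10000
toa = rng.uniform(0.0, 0.3, (n, 6))          # 6 optical bands, TOA reflectance
geometry = rng.uniform(0, 1, (n, 2))         # e.g., scaled solar/view angles
sr_ref = toa * 0.8 - 0.01                    # stand-in "reliable" SR targets

X = np.hstack([toa, geometry])
model = RandomForestRegressor(n_estimators=200, n_jobs=-1, random_state=0)
model.fit(X, sr_ref)                          # multi-output regression, one value per band
sr_pred = model.predict(X[:5])
print(sr_pred.shape)                          # (5, 6): per-band corrected reflectance
```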

How to cite: Bi, S., Shi, K., and Xu, J.: A cloud-native alternative correction for Landsat-8/9 Collection 2 surface reflectance over inland waters, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-3651, https://doi.org/10.5194/egusphere-egu26-3651, 2026.

EGU26-5592 | ECS | PICO | HS6.10

Monte Carlo–Based Uncertainty Propagation for Probabilistic Water Masking from Satellite Remote Sensing Reflectance Product 

Gomal Amin, Olivier Pourret, Victor Dupin, Sabrina Guérin-Rechdaoui, and Arnaud Dujany

Accurate and reliable delineation of surface water from optical satellite imagery is a prerequisite for many hydrological applications. In inland and riverine environments, water masking is a major source of uncertainty due to optically complex waters, mixed land–water pixels and strong adjacency effects. Conventional water masks provide no explicit measure of classification confidence and do not account for uncertainty in pixel classification, particularly along riverbanks, under bridges, in building shadows, and in highly dynamic systems, where small changes in reflectance or threshold parameters can lead to unstable water boundaries.

In this study, we present a generic Monte Carlo–based water detection framework that explicitly propagates Sentinel-2 ACOLITE remote sensing reflectance uncertainty through multiple spectral threshold-based water indices (NDWI, MNDWI, AWEI, and MBWI), resulting in per-pixel water occurrence probabilities. These indices are evaluated independently and combined using a deterministic voting-based fusion scheme. This decision logic is further constrained by physically motivated reflectance thresholds in the near-infrared and shortwave infrared bands, together with a low-signal filter, to suppress shadows and dark non-water surfaces that commonly generate false positives in index-based approaches.
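
A minimal sketch of the Monte Carlo step for a single index (NDWI) is given below: reflectance is perturbed with an assumed per-band uncertainty, each realization is re-thresholded, and the votes are averaged into a per-pixel water probability. The noise level and threshold are illustrative assumptions.

```python
# Minimal Monte Carlo sketch for one index (NDWI): perturb reflectance,
# re-threshold each realization, average into a water probability.
import numpy as np

rng = np.random.default_rng(0)
green = rng.uniform(0.02, 0.10, (200, 200))   # Rrs(green), placeholder scene
nir = rng.uniform(0.00, 0.08, (200, 200))     # Rrs(NIR)
sigma = 0.005                                  # assumed per-band 1-sigma uncertainty

n_iter = 500
water_votes = np.zeros_like(green)
for _ in range(n_iter):
    g = green + rng.normal(0, sigma, green.shape)
    n = nir + rng.normal(0, sigma, nir.shape)
    ndwi = (g - n) / (g + n + 1e-9)
    water_votes += (ndwi > 0.0)                # simple NDWI water test

p_water = water_votes / n_iter                 # per-pixel water probability
print(f"high-confidence water fraction: {(p_water >= 0.9).mean():.2%}")
```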

The method is demonstrated as a proof of concept using a Sentinel-2 acquisition over the Seine River in Paris characterized by complex optical conditions. High-confidence water pixels dominate the main river channel, while intermediate probabilities are concentrated along riverbanks, bridges, and narrow tributaries. Within the final detected water mask, the mean water probability reaches 0.98, with more than 97% of water pixels classified with high confidence (P ≥ 0.9). Classification uncertainty is very low overall, indicating strong consistency across Monte Carlo realizations. Intermediate probabilities (0.3 < P < 0.7) represent less than 1% of detected water pixels and are spatially confined to water–land transition zones. Sensitivity experiments indicate that total water extent is weakly affected by increasing reflectance perturbation, whereas uncertainty increases systematically at water–land boundaries. By explicitly quantifying water-detection uncertainty, this Monte Carlo framework provides a statistically robust foundation for subsequent water-quality retrieval and uncertainty propagation.

How to cite: Amin, G., Pourret, O., Dupin, V., Guérin-Rechdaoui, S., and Dujany, A.: Monte Carlo–Based Uncertainty Propagation for Probabilistic Water Masking from Satellite Remote Sensing Reflectance Product, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-5592, https://doi.org/10.5194/egusphere-egu26-5592, 2026.

River discharge estimation is critical for flood forecasting and water resources management, yet traditional gauging methods are often limited in spatial coverage. Accurate estimation of river discharge from satellite observations remains challenging in large rivers where hydraulic controls and anthropogenic disturbances induce non-stationary width–discharge relationships. In this study, multi-decadal river width time series derived from multi-sensor satellite imagery (Landsat-5/7/8 and Sentinel-1/2) were employed to estimate discharge in the Ganjiang River Basin, China, using the Google Earth Engine (GEE) platform. Particular emphasis was placed on quantifying the impacts of backwater effects and channel morphological changes on inversion accuracy. Results indicate that: (1) satellite-based width–discharge scaling performs robustly in morphologically stable reaches, yielding high accuracy at the Ji’an, Xiajiang, and Zhangshu stations (R² > 0.92; NSE > 0.90); (2) in contrast, performance at the Waizhou station is strongly degraded by complex hydromorphological dynamics, where intensified backwater effects from Poyang Lake during the wet season weaken the functional coupling between river width and discharge (R² decreases to 0.59), and pronounced channel incision associated with historical sand mining (mean bed lowering of 2.97 m) introduces additional non-stationarity into the rating relationship; and (3) to account for these time-varying controls, a segmented modeling framework was implemented to explicitly reflect periods of morphological adjustment, substantially improving discharge estimates at Waizhou and raising both R² and NSE to 0.90 over the 2012–2019 period. These findings highlight the importance of explicitly considering morphodynamic evolution and hydraulic boundary conditions for reliable satellite-based discharge estimation in dynamically evolving river–lake systems.
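
For illustration, a width–discharge rating of the assumed form Q = a·W^b can be fitted in log-log space as sketched below; a segmented framework like the study's would fit such relations separately for morphologically distinct periods. The data here are synthetic.

```python
# Illustrative log-log fit of an assumed power-law rating Q = a * W**b
# between satellite-derived width and discharge (synthetic data).
import numpy as np

rng = np.random.default_rng(3)
width = rng.uniform(300, 1200, 200)                          # river width (m)
discharge = 0.02 * width**1.6 * rng.lognormal(0, 0.1, 200)   # synthetic truth

b, log_a = np.polyfit(np.log(width), np.log(discharge), 1)
a = np.exp(log_a)
q_hat = a * width**b
r2 = 1 - np.sum((discharge - q_hat)**2) / np.sum((discharge - discharge.mean())**2)
print(f"Q = {a:.3f} * W^{b:.2f},  R^2 = {r2:.2f}")
```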

How to cite: Wang, G. and Gu, P.: Satellite-Based Discharge Estimation in Morphologically Dynamic Rivers: A Segmented Modeling Approach for the Ganjiang River Basin, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-6227, https://doi.org/10.5194/egusphere-egu26-6227, 2026.

Accurate quantification of irrigation water use efficiency (IWUE) is essential for sustainable oasis agriculture in arid regions with multi-source water supply. Focusing on the Yongji Irrigation Area of the Hetao Irrigation District, this study developed a Surface Energy Balance System (SEBS)-based evapotranspiration (ET) model using Landsat-8 imagery and 2023 meteorological data. Model performance was evaluated against ground-based observations. SEBS-derived ET was coupled with a regional water balance approach to estimate IWUE under multi-source irrigation and compared with the conventional canal water balance method. The main findings are as follows: (1) in 2023, remotely sensed ET totaled 5.4×10⁸ m³, corresponding to an IWUE of 0.426; (2) SEBS-retrieved daily ET agreed well with in situ observations at the Linhe Meteorological Station on seven dates between April and October 2023, with a coefficient of determination R² = 0.816, root-mean-square error (RMSE) of 0.714 mm/day, mean absolute error (MAE) of 0.703 mm/day, and bias of −0.337 mm/day, confirming that SEBS reliably captures daily ET dynamics in the Yongji Irrigation Area; (3) the SEBS-based IWUE differed by 6.33% from the traditional canal water balance estimate (0.455), suggesting good consistency between the two approaches. These findings indicate that SEBS-based remote sensing can provide spatially explicit, operational assessments of irrigation efficiency and support precision water resources management in arid and semi-arid agricultural regions.

How to cite: Xu, Y. and Feng, S.: Estimation and Analysis of Irrigation Water Use Efficiency in Multi-Source Irrigation Areas of Arid Regions Based on the SEBS Model, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-6814, https://doi.org/10.5194/egusphere-egu26-6814, 2026.

EGU26-10071 | ECS | PICO | HS6.10

Reconstructing the Hydrological Cycle of the Ebro River Basin through Satellite Observations 

Sindhu Kalimisetty, Serena Ceola, Irene Palazzoli, Alberto Montanari, Paolo Stocchi, and Stefania Camici

The Ebro River Basin is one of the most intensively managed and climatically sensitive basins in the Mediterranean region, where increasing water demands, pronounced climate variability, and environmental constraints pose major challenges for sustainable water resources management. Addressing these challenges requires hydrological models capable of consistently representing both natural processes and anthropogenic water use. In this context, the INTERROGATION project, funded by the Italian Ministry of Universities and Research, examines the interactions between climatic and anthropogenic factors in the development and recovery of major hydrological droughts that have affected the Ebro River Basin in recent decades (1990-2023).

In this study, we present a comprehensive reconstruction of the water cycle in the Ebro River Basin, explicitly accounting for both natural processes and human water use. For this purpose, three different precipitation datasets are used as input data to the flexible conceptual hydrological model MISDc (Modello Idrologico Semidistribuito in Continuo): long-term (2000-2023) daily in situ observations and two versions of a daily integrated dataset obtained by merging GPM and SM2RAIN products at low (10 km) and high (1 km) spatial resolutions.

The hydrological model is calibrated against observed river discharge and validated through a multi-variable comparison with satellite-based estimates of soil moisture, evapotranspiration, snow water equivalent, and irrigation, which were developed within the framework of the European Space Agency Digital Twin Earth (DTE) Hydrology Next project. The results of this work demonstrate the significance of employing a suitable hydrological model in conjunction with accurate satellite information for capturing the spatiotemporal evolution of the hydrological cycle within highly managed basins. These results will be the basis for developing a decision support system that will guide stakeholders toward an integrated management of water resources in the Ebro River Basin.

How to cite: Kalimisetty, S., Ceola, S., Palazzoli, I., Montanari, A., Stocchi, P., and Camici, S.: Reconstructing the Hydrological Cycle of the Ebro River Basin through Satellite Observations, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-10071, https://doi.org/10.5194/egusphere-egu26-10071, 2026.

EGU26-11687 | ECS | PICO | HS6.10

High-resolution crop map generation in Mediterranean environments using IOTA2 chain 

Andrea Borgo, Vincent Thierion, Antonio Trabucco, Flavio Lupia, and Marta Debolini

Reliable crop mapping is essential in land and water management studies to understand the spatial distribution and dynamics of agricultural practices, to model resource use and production, and to propose sustainable scenarios for agricultural and water management. This work presents high-resolution crop mapping for the Mediterranean area, which is particularly challenging due to limited data availability and a high level of land-use heterogeneity. The main European land use dataset, Corine Land Cover (CLC), lacks the specificity required for accurate agricultural classification, especially for crop differentiation, and does not provide frequent or timely updates, which are crucial for many applications. Other more recent EU-wide crop mapping efforts (d’Andrimont et al., 2021) still lack regional accuracy due to widely scattered training data. To overcome these limitations, a large-scale crop mapping initiative was implemented in Sardinia to test and validate an artificial intelligence–based approach for Mediterranean environments. In this context, irrigated agriculture is a key sector for the sustainable management of limited water resources. The method uses Sentinel‑2 time series and survey data from the LPIS (Land Parcel Identification System). The study relies on IOTA2, a land‑use map production chain first developed and tested at the French level, producing maps with 24 land‑use classes. The originality of the approach lies in the use of open‑source satellite images and an automated processing workflow based on supervised classifiers, making crop mapping faster and easily reproducible across years.

Learning samples are derived from 2018 LPIS data, supplemented by CLC and CLCplus Backbone datasets for natural areas and the Urban Atlas for urban areas. Two nomenclatures are tested: a detailed versus a simplified one, with 32 and 26 thematic classes, respectively, both focusing on Mediterranean-relevant crop typologies. The two nomenclatures are evaluated with sampling rates of 10%, 50%, and 100% of training pixels. Results show that the simplified nomenclature achieves higher accuracy, with an Overall Accuracy (OA) of 0.77 compared to 0.61 for the detailed nomenclature, using 100% training pixels. Increasing the training sample rate improves classification quality in both nomenclatures: in the simplified nomenclature, OA values are 0.596, 0.613, and 0.774 for 10%, 50%, and 100% sampling rates. In the detailed nomenclature, the improvement is weaker, with OA values of 0.596, 0.601, and 0.610, indicating that increasing sample size does not resolve class confusion. Among agricultural classes, rice, citrus, vegetables, and grapevine achieve the highest classification scores, which are among the crops with the largest irrigation requirements. Nuts, cereals, and fruit trees perform poorly, mainly due to insufficient training samples. Overall, the proposed nomenclature significantly improves the crop classes available in the CLC by increasing crop specificity and differentiation. This study presents a framework for fully automatic crop‑map production in Mediterranean environments, ensuring fast reproducibility over the years thanks to the use of openly accessible satellite imagery and an automated processing chain. This can improve the accuracy and reliability of water accounting for the agricultural sector and help promote sustainable use of limited water resources in Mediterranean areas.

How to cite: Borgo, A., Thierion, V., Trabucco, A., Lupia, F., and Debolini, M.: High-resolution crop map generation in Mediterranean environments using IOTA2 chain, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-11687, https://doi.org/10.5194/egusphere-egu26-11687, 2026.

EGU26-16751 | ECS | PICO | HS6.10

A Method for Assessing Trophic Status of Inland Lakes Based on the Forel–Ule Index and Red-edge Band Hue Angle 

Huizi Zhao, Yin Cao, Hongli Zhao, Huaiwen Zhang, Wenjing Hua, Yu Gan, and Haojiang Li

Eutrophication in inland lakes has become increasingly prominent, often accompanied by frequent algal blooms and risks of degraded aquatic ecosystem functions. Therefore, broad and dynamic monitoring of lake trophic status is crucial for aquatic ecosystem protection and refined water-resources management. Satellite remote sensing enables rapid, large-area monitoring of lakes. Previous studies have developed large-scale trophic status assessment methods based on the visible-band water color index, the Forel–Ule Index (FUI), to retrieve a chlorophyll-a-referenced trophic state index (TSI(Chl-a)). However, inland waters are optically complex; high concentrations of chromophoric dissolved organic matter (CDOM) or suspended matter can inflate FUI values. Consequently, the single-index approach using FUI alone to assess TSI(Chl-a) (Model1) tends to misclassify mesotrophic waters as eutrophic. To mitigate this interference, most studies have adopted an improved strategy in which FUI serves as the primary indicator and specific spectral bands provide auxiliary discrimination. In this study, we incorporate two medium-to-high resolution satellites with red-edge bands, GF-6 and Sentinel-2, and design a Red-edge Band Hue Angle α′ (RHA α′) based on the Red (630–690 nm / 650–680 nm), Red-edge1 (690–730 nm / 698–713 nm), and Red-edge2 (730–770 nm / 733–748 nm) bands of the two sensors, respectively. We then develop a coupled lake trophic status assessment method integrating FUI and RHA α′ (Model2).
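
For context, the sketch below shows the conventional hue-angle calculation that underlies FUI-type indices: chromaticity coordinates from tristimulus values, then the angle around the white point. The RHA α′ formulation itself is not reproduced here, and the input values are illustrative assumptions.

```python
# Conventional hue-angle calculation underlying FUI-type water colour indices;
# inputs are illustrative, not the study's RHA alpha' formulation.
import numpy as np

def hue_angle(X, Y, Z):
    """Hue angle (degrees) from tristimulus values X, Y, Z."""
    x = X / (X + Y + Z)
    y = Y / (X + Y + Z)
    # angle of (x, y) around the equal-energy white point (1/3, 1/3)
    return np.degrees(np.arctan2(y - 1/3, x - 1/3)) % 360

# tristimulus values would come from weighted sums of visible bands
print(f"hue angle = {hue_angle(0.31, 0.33, 0.36):.1f} deg")
```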

The results indicate that: (1) RHA α′ can characterize the reflectance-peak feature of chlorophyll-a near 700 nm. For waters with FUI ≥ 11, if elevated FUI is primarily driven by high chlorophyll-a concentrations, RHA α′ tends to be high; conversely, if elevated FUI is mainly caused by high suspended matter concentrations, RHA α′ tends to be low. Thus, Model2 can effectively distinguish high-chlorophyll waters from highly turbid waters by leveraging RHA α′. (2) Model accuracy was evaluated on the IOCCG Hydrolight-simulated dataset (500 synthetic water spectra spanning 400–800 nm under varying concentrations of phytoplankton pigments, CDOM, and non-pigmented suspended matter). For simulated GF-6 data, the eutrophic-state assessment accuracies of Model1 and Model2 were 84.1% and 95.8%, respectively, and the overall accuracies were 88.6% and 90.4%; for simulated Sentinel-2 data, the corresponding eutrophic-state accuracies were 84.9% and 99.1%, and the overall accuracies were 88.0% and 89.8%. Overall, Model2 markedly improves the accuracy of eutrophic-state assessment. (3) Taking 252 spatially representative lakes across China as monitoring targets, we produced lake trophic status products for 2021–2022 using Model2 and validated them against the National Surface Water Quality Report released by the Ministry of Ecology and Environment of the People’s Republic of China, achieving an overall accuracy of 79.84%.

In the next step, we will extend this method to long-term spatiotemporal analysis of TSI(Chl-a) for Chinese lakes with an area of 1 km² and above. The data preparation has been largely completed, and the related analyses are currently underway.

How to cite: Zhao, H., Cao, Y., Zhao, H., Zhang, H., Hua, W., Gan, Y., and Li, H.: A Method for Assessing Trophic Status of Inland Lakes Based on the Forel–Ule Index and Red-edge Band Hue Angle, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-16751, https://doi.org/10.5194/egusphere-egu26-16751, 2026.

EGU26-21227 | ECS | PICO | HS6.10

High-resolution net-shortwave and net-radiation products for Europe 

Karan Mahajan, Ye Tuo, and Jian Peng

Net radiation (Rn) is a key control on land–atmosphere exchanges and a primary forcing for transpiration modelling. However, commonly used radiation products often lack the spatial resolution required to resolve soil–plant–atmosphere interactions in heterogeneous landscapes, limiting their applicability for water management studies. Here, we present a new daily net shortwave and net radiation dataset for Europe at 1-arcminute (~1.4 km) spatial resolution covering the period 2001–2020, developed to support high-resolution transpiration modelling using the Priestley–Taylor approach.
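
As a reminder of the intended downstream use, the sketch below evaluates the Priestley–Taylor estimate that such a net radiation product would force; the coefficient values are standard textbook choices, not those of this dataset.

```python
# Standard Priestley-Taylor potential ET from net radiation; coefficients are
# textbook defaults (alpha = 1.26, gamma = 0.066 kPa/degC at sea level).
import numpy as np

def priestley_taylor_et(rn_mj, g_mj, t_air_c, alpha=1.26, gamma=0.066):
    """Potential ET (mm/day) from net radiation and soil heat flux (MJ m-2 day-1)."""
    # slope of the saturation vapour pressure curve (kPa/degC), FAO-56 form
    es = 0.6108 * np.exp(17.27 * t_air_c / (t_air_c + 237.3))
    delta = 4098 * es / (t_air_c + 237.3) ** 2
    lambda_v = 2.45                      # latent heat of vaporization (MJ/kg)
    return alpha * delta / (delta + gamma) * (rn_mj - g_mj) / lambda_v

print(f"{priestley_taylor_et(rn_mj=12.0, g_mj=1.0, t_air_c=18.0):.2f} mm/day")
```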

The dataset is generated through the integration of multiple complementary data sources, combining station-based downward shortwave radiation from the EMO-1 dataset, satellite-derived longwave radiation from the ELITE product, and a physically based estimation of blue-sky albedo derived from GLASS black and white-sky albedo products, together with information on photosynthetically active radiation from BESS.

Evaluation against FLUXNET observations reveals that the high-resolution net shortwave radiation product outperforms the coarser ERA5-Land reanalysis across 7 of 9 analyzed European countries, with particularly strong improvements in topographically complex regions, such as the Alps, and in heterogeneous land-use areas. However, the net radiation product shows larger uncertainties in semi-arid regions and during high-latitude winter conditions, reflecting known limitations in satellite-based radiation retrievals. Nevertheless, the high spatial resolution represents a valuable contribution to remote-sensing-based water cycle studies, drought assessment, and land-surface modeling.

How to cite: Mahajan, K., Tuo, Y., and Peng, J.: High-resolution net-shortwave and net-radiation products for Europe, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-21227, https://doi.org/10.5194/egusphere-egu26-21227, 2026.

Growing demand for dates—driven by Morocco’s rising population and expanding international markets for well-branded, high-value (“noble”) varieties—is accelerating the expansion of date-palm cultivation in the country’s arid and semi-arid frontiers. Much of this growth relies on intensified groundwater pumping, increasing pressure on shared aquifers that have long sustained oasis agroecosystems. Historically, these aquifers were managed through locally embedded water-sharing institutions that supported efficient allocation and long-term use; the rapid spread of pumped irrigation is now reshaping this balance by amplifying competition for the same resource.

We investigate these dynamics in the Figuig oasis (eastern Morocco) and its watershed by linking agricultural expansion to water demand and comparing this demand with watershed-scale water availability. We hypothesize that once recently established plantations reach full productive age—beyond the relatively low-demand establishment phase—total water demand will exceed the catchment’s available supply.

We develop a date-palm water demand model that combines evapotranspiration-based water requirements with high-resolution mapping of fields and palm abundance. Field boundaries are delineated using a U-Net + watershed segmentation workflow, and palm trees are detected and counted using a YOLO object-detection model applied to drone imagery (small, heterogeneous oasis parcels) and satellite imagery (newer, larger, more homogeneous plantations). These tools are applied within a remote-sensing time series to quantify agricultural expansion and the associated increase in demand over time. Field surveys provide key parameters to translate mapped plantations into water demand, including irrigation method and efficiency, tree age classes, irrigation frequency, and planting density.
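
A hypothetical sketch of the counting-to-demand step follows: palms detected with a YOLO model, then counts scaled by per-tree water requirements and irrigation efficiency. The weights file, input tile, and coefficient values are illustrative assumptions, not the study's calibrated parameters.

```python
# Hypothetical palm counting and demand scaling; weights file, input tile,
# and coefficients are illustrative assumptions.
from ultralytics import YOLO

model = YOLO("palm_detector.pt")            # assumed custom-trained weights
results = model.predict("plantation_tile.tif", conf=0.4)
n_palms = len(results[0].boxes)             # one bounding box per detected palm

et_per_palm_m3_yr = 90.0                    # assumed mature-tree requirement (m3/yr)
irrigation_efficiency = 0.7                 # assumed mixed-method efficiency
demand_m3_yr = n_palms * et_per_palm_m3_yr / irrigation_efficiency
print(f"{n_palms} palms -> {demand_m3_yr:,.0f} m3/yr gross demand")
```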

Our results indicate an approximately threefold expansion of agricultural land relative to the historically stable oasis area. About 65% of farms remain in early production stages (0–5 and 6–12 years), when water needs are relatively low, yet estimated demand already nearly matches watershed-scale availability. As plantations mature, projected demand is likely to surpass catchment-scale availability within the next decade, increasing the risk of irreversible impacts. Consistent with this trend, we observe drying of traditional springs and deteriorating water quality, underscoring the need to prioritize surface-water use and water-harvesting measures and to strictly regulate groundwater pumping.

How to cite: Boubou, Y.: From Oasis Water Commons to Expanding Date-Palm Plantations: Deep Learning Mapping and Evapotranspiration-Based Water Demand in the Figuig Oasis, Morocco, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-21477, https://doi.org/10.5194/egusphere-egu26-21477, 2026.

EGU26-2189 | Posters on site | EOS1.5

Evaluating the Impact of a Culturally Localized Geo-Visualization Platform on Geoscience Learning: The Arabic Earth Now Platform Study 

Zahrah A. Almusaylim, Rawan Alajmi, Nouf Alsinan, Wafa Alajmi, and Ahad Alnasser

Arabic Earth Now (AEN) is an interactive data visualization platform, initially developed as a localized version of NASA’s Eyes to visualize satellite data and support learning about Earth and space science. AEN has since been extended into a geo-dome globe simulator to support immersive, spatially rich learning experiences. Despite advances in geo-visualization, there remains a critical gap in research on how cultural localization and immersive presentation formats influence geoscience education, particularly among Arab learners. Moreover, the potential of localized platforms to enhance awareness of national scientific contributions remains underexplored. This study therefore examines how interaction with AEN and the simulator influences user engagement and learning outcomes, contributing new insights into the role of culturally relevant data visualization in geoscience education. We examined the impact of AEN at an educational festival event to assess students’ engagement with the platform. Students at primary, intermediate, secondary, and university levels were assigned to interact with AEN and the simulator so that its impact could be assessed before and after engagement. Students first experienced the platform and subsequently provided feedback on their experience via an anonymous questionnaire. Students showed a high level of engagement after interacting with AEN and indicated strong motivation and intention to reuse it. Our findings demonstrate that AEN offers a highly engaging educational experience, as evidenced by the collected data. However, the analysis reveals a gap in effectively integrating geo-education with geo-visualization tools to enhance student participation. These results underscore the critical role of data visualization in enriching educational content and suggest that its strategic implementation can significantly improve both student and educator engagement within geoscience learning environments. Additionally, the study highlights the value of localizing NASA’s Eyes as AEN to serve as an effective tool for geoscience learning.

How to cite: A. Almusaylim, Z., Alajmi, R., Alsinan, N., Alajmi, W., and Alnasser, A.: Evaluating the Impact of a Culturally Localized Geo-Visualization Platform on Geoscience Learning: The Arabic Earth Now Platform Study, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-2189, https://doi.org/10.5194/egusphere-egu26-2189, 2026.

EGU26-3084 | ECS | Posters on site | EOS1.5

Making hazard maps more intuitive: A 3D interactive visualisation framework for representing hazard flows 

Júlia Sánchez-Martínez, Josep Maria-García, Jaume Cusachs, Carlos Marín, Arnau Lagresa, Marta López-Saavedra, Xavier Arnau-Sarabia, Marc Martínez-Sepúlveda, Iris Schneider-Pérez, Mireia Jiménez-Llobet, and Joan Martí

Traditional 2D hazard maps often struggle to convey the complex spatial dynamics of natural hazards, particularly for users who are not accustomed to interpreting cartographic products. This limitation hinders effective risk communication and reduces the ability of local stakeholders to identify exposed areas. To address this challenge, we develop a 3D visualisation framework that transforms model outputs into intuitive, interactive representations aimed at supporting preventive planning and informed decision making.

As an initial implementation, we apply the approach to lava-flow modelling. Using one million Monte Carlo simulations, we estimate the probabilistic envelope of potential lava trajectories and extract a random subset of 10,000 paths to obtain a representative sample of the most recurrent routes. Each trajectory is interpreted as the path of a lava droplet, and its interactive 3D rendering highlights the most likely flow channels of a specific simulation. By integrating infrastructure, municipalities, roads and buildings directly within the 3D environment, the tool enables non-expert users to visualise potential scenarios with greater clarity and anticipate protective actions.
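
Conceptually, one such droplet trajectory can be sketched as a stochastic steepest-descent walk on a DEM, as below; the DEM, noise term, and stopping rule are assumptions rather than the project's actual simulator.

```python
# Conceptual "lava droplet" kernel: a stochastic steepest-descent walk on a
# DEM, the kind of step a Monte Carlo ensemble would repeat many times.
import numpy as np

rng = np.random.default_rng(0)
dem = np.add.outer(np.linspace(100, 0, 200), np.linspace(50, 0, 200))
dem += rng.normal(0, 0.5, dem.shape)        # small-scale roughness

def droplet_path(dem, start, n_steps=500, noise=0.2):
    r, c = start
    path = [(r, c)]
    for _ in range(n_steps):
        window = dem[max(r-1, 0):r+2, max(c-1, 0):c+2].copy()
        window += rng.normal(0, noise, window.shape)   # stochastic perturbation
        dr, dc = np.unravel_index(np.argmin(window), window.shape)
        r2, c2 = max(r-1, 0) + dr, max(c-1, 0) + dc
        if (r2, c2) == (r, c):              # local minimum: the droplet stops
            break
        r, c = r2, c2
        path.append((r, c))
    return path

print(len(droplet_path(dem, start=(5, 5))), "cells traversed")
```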

The system is hazard-agnostic and constitutes a core component of a developing multi-risk evaluation platform that will incorporate spatial and temporal analyses, simulation modelling and automated 3D representations for multiple natural hazards. The 3D representation can be extended to other hazard types, offering a general framework to bridge the gap between simulation-based hazard analysis and accessible 3D communication tools.

This study was developed within the project Volcanic disaster risk management for the Canary Islands (Spain), funded by EC ECHO - Union Civil Protection Mechanism (UCPM), ref. 101193100 VOLCAN (2025-2026).

How to cite: Sánchez-Martínez, J., Maria-García, J., Cusachs, J., Marín, C., Lagresa, A., López-Saavedra, M., Arnau-Sarabia, X., Martínez-Sepúlveda, M., Schneider-Pérez, I., Jiménez-Llobet, M., and Martí, J.: Making hazard maps more intuitive: A 3D interactive visualisation framework for representing hazard flows, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-3084, https://doi.org/10.5194/egusphere-egu26-3084, 2026.

EGU26-3909 | Posters on site | EOS1.5

Transforming volcanic landscapes into knowledge: geoheritage, virtual reality technologies, and geotourism in the Canary Islands 

Thais Siqueira, Juana Vegas, Gonzalo Lozano, Carmen Romero, Ana Cabrera, Rayco Morrero, Nieves Sánchez, Ramón Casillas, Olaya Dorado, David Sanz-Mangas, Lucía Saez-Gabarrón, and Inés Galindo

The volcanic landscapes of the Canary Islands constitute one of the most distinctive geoheritage assemblages in Spain, underpinning the scientific, educational, and touristic values that have contributed to the Spanish Inventory of Geological Sites of Interest (IELIG), multiple natural protected areas and UNESCO designations. These volcanic features not only embody an extensive record of geological processes but also offer an exceptional basis for sustainable tourism initiatives. In this context, the project ‘Canary Islands: Destination of Volcanoes’ seeks to establish a science-based geotourism product capable of enhancing public engagement while strengthening the conservation and responsible use of natural resources. The project employs a comprehensive methodology structured into nine main activities that integrate fieldwork, analytical procedures, and digital data processing. Building on the 300 geosites identified in the IELIG for the Canarian Archipelago, a specific assessment framework has been designed to select the 50 volcanic environments with the highest scientific, educational, and tourist potential. This process combines standards and requirements of sustainability, conservation status, degradation risk, accessibility, safety, and scenic-scientific values. The selected sites are being documented through the development of digital mapping products, adhering to international standards for spatial data quality and metadata. Complementary tasks include the acquisition of high-resolution drone imagery, photogrammetry, and 3D geological reconstructions that support the creation of virtual and augmented reality models. These digital products will serve to design interpretive scripts, animations, and immersive environments that aim to communicate complex geological processes in an accessible way to the general public. Additional activities address the creation of a unified geotourism brand, development of training programmes for local employment, and support for emerging business initiatives in the blue and green economy. Although the results are still in progress, the project is expected to generate a robust portfolio of scientifically validated and technologically innovative tools that enhance the touristic use and outreach of volcanic heritage. The integration of digital maps, VR/AR applications and scientific communication through innovation has the potential to diversify the regional geotouristic model, reduce environmental impact, and strengthen long-term conservation strategies. Ultimately, this initiative aspires to position the Canary Islands as an international reference for volcano-based geotourism grounded in science, sustainability, and innovation.

Sub-Project 1 ‘Canary Islands, destiny of Volcanoes’ is funded by PROMOTUR Turismo Canarias, S.A. through Next Generation EU funds, PRTR. 2024krQ00nnn.

How to cite: Siqueira, T., Vegas, J., Lozano, G., Romero, C., Cabrera, A., Morrero, R., Sánchez, N., Casillas, R., Dorado, O., Sanz-Mangas, D., Saez-Gabarrón, L., and Galindo, I.: Transforming volcanic landscapes into knowledge: geoheritage, virtual reality technologies, and geotourism in the Canary Islands, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-3909, https://doi.org/10.5194/egusphere-egu26-3909, 2026.

To understand disaster damage, ground-based and aerial photographs taken during or after hazard events are commonly used. These images are also valuable for geography education. In particular, university students in a teacher training course are required to understand the characteristics of disasters, as they will be responsible for teaching these topics to their pupils. However, some students have difficulty achieving a sufficient understanding of actual disasters because of differences in scale among photographs taken from airplanes, drones, and ground-level viewpoints. To facilitate students’ understanding of disasters, it is necessary to develop a teaching program and educational materials that can connect geospatial products across multiple spatial scales. In this study, we designed a one-day workshop program integrating GIS and VR technologies. The workshop enabled students to learn about the impacts and damage of the 2024 Noto Peninsula Earthquake while operating two GIS applications and a VR device, allowing them to observe the area from different viewing perspectives. The workshop consisted of four parts: (1) a lecture on basic concepts of GIS and remote sensing, (2) a short lecture summarizing the 2024 Noto Peninsula Earthquake and a WebGIS-based comparison of aerial photographs taken before and after the disaster, (3) visualization of damaged buildings and terrain using QGIS, and (4) five minutes of VR-based fieldwork using a head-mounted display. Each section lasted 90 minutes. The second section was conducted in groups of four students. This workshop was conducted as part of a graduate school course in a teacher training program, with a total of eight students participating. Students’ learning outcomes in each section were assessed through a questionnaire survey. The results indicate that although individual materials have limitations in representing regional characteristics, integrating educational materials across multiple spatial scales deepened students’ understanding of the disaster. In particular, VR-based fieldwork enhanced students’ understanding of actual disaster damage, such as collapsed buildings.

How to cite: Yamauchi, H., Iizuka, K., and Ogura, T.: Implementation of a Workshop for Disaster Education on the 2024 Noto Peninsula Earthquake Using Multi-Scale Geospatial Products Integrating GIS and Immersive VR, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-4335, https://doi.org/10.5194/egusphere-egu26-4335, 2026.

EGU26-4938 | Posters on site | EOS1.5

Improving ACTRIS Scientific Outreach through Immersive Virtual Tours 

Ariane Dubost, Misha Faber, Sabine Philippin, Galane Peyre, Zhuoqun Wu, and Dimitrii Krasnov

The rapid advances in computing, multimedia, and virtual reality technologies provide new opportunities for communicating and visualising scientific information. Virtual Tours (VTs), based on 360-degree imagery enriched with multimedia content such as graphical explanations, audio, and videos, offer a user-friendly way to explore scientific facilities. Interactive navigation enables users to understand research infrastructures from multiple perspectives and at their own pace.

Within ACTRIS-FR, the French component of the European Aerosol, Clouds and Trace Gases research infrastructure (ACTRIS), VTs serve as a tool to make atmospheric research stations more accessible and transparent. Historically, ACTRIS facilities - observation stations, mobile platforms and atmospheric simulation chambers - have often been associated with limited accessibility due to security, safety, or logistical constraints. VTs break down these barriers by providing a realistic and informative representation of the facilities, enabling students, visiting researchers, and the general public to better understand the scale, layout, and purpose of the instruments and measurements before visiting, or even when travel is restricted.

The approach was developed in collaboration with the SMEAR Estonia platform’s developer. The methodology allows the creation of tailored content for different target groups: for example, technicians may access specific datasets, curves, and documentation, while educators and outreach professionals can integrate simplified explanations, posters, and links to videos to support teaching. This flexibility allows a single VT to be easily tailored to various uses, ranging from outreach and training to scientific communication and access preparation.

Several ACTRIS-FR sites already use VTs to strengthen their visibility and foster greater user interaction. The tours can be embedded in websites, communication materials via QR codes and showcased at conferences, exhibitions, or as part of transnational access projects such as the Horizon Europe project IRISCC. They also support ACTRIS’s broader mission of modernising outreach activities and improving interaction with the education sector and the general public.

Developing high-quality VTs poses several challenges, as producing accurate and meaningful content requires significant involvement from scientists and technical staff, along with time-consuming data collection and careful attention to visual resolution and metadata consistency. 

The poster will outline the development of ACTRIS France VTs, discussing both the benefits and limitations, while also exploring opportunities to integrate multimedia. It will also emphasize the value of VTs as training tools for technicians, scientists, and students, and their potential to enhance accessibility, transparency, and cross-country collaborations. The tours not only facilitate a deeper understanding of the work being conducted at the facility, but also contribute to raising general awareness and knowledge about distributed research infrastructures, promoting a broader appreciation of the complex research ecosystem.

 

How to cite: Dubost, A., Faber, M., Philippin, S., Peyre, G., Wu, Z., and Krasnov, D.: Improving ACTRIS Scientific Outreach through Immersive Virtual Tours, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-4938, https://doi.org/10.5194/egusphere-egu26-4938, 2026.

EGU26-6335 | ECS | Posters on site | EOS1.5

BrainFood: Semi-immersive, 360° learning experiences to communicate research on fish cognition, food quality, and climate change 

Laura E. Coulson, Eva Feldbacher, Barbara Köck, Gabriele Weigelhofer, Andreas Zitek, and Libor Zavorka

Climate change is reshaping aquatic food webs, altering the dietary quality available to fish and, with it, their cognitive performance, behavior, and fitness. Because wild fish are a critical source of omega-3 polyunsaturated fatty acids (PUFAs) for humans, the ecological and societal relevance of these changes transcends aquatic systems. BrainFood is a science communication initiative that translates the research of the 4FatQs project—on the role of omega-3 PUFAs for cognition in wild fish—into accessible, engaging, and evidence-informed digital learning experiences for broad audiences.

BrainFood deploys a suite of 5 interactive short stories (each ≤5 minutes), built with 360° images and videos hosted in the CenarioVR environment and accessible via web link or QR code on smartphones, tablets, laptops, and optional VR headsets. The stories interlink methods, findings, and implications of 4FatQs through multimodal elements—narrated video, animated gifs, audio overlays, quizzes, and mini-games—allowing non-linear exploration without cognitive overload. Example modules include “A Day in the Life of Trout,” which introduces tracking technologies to study movement and behavior, and “Hide and Seek!”, a game-based exploration of camouflage and rapid color change in salmonids. Additionally, the stories have a strong focus on how this information was generated – a key element of science literacy. All materials are designed for inclusion and accessibility (high-contrast layouts, dyslexia-friendly fonts, voice-over options, and alternatives for those with hearing impairments).

BrainFood’s originality lies not in technological novelty, but in the strategic integration of: (i) multi-device, low-barrier 360° learning experiences; (ii) targeted deployment through multiplier venues and events; (iii) rigorous, real-time co-creation and optimization; and (iv) explicit alignment with science literacy goals. By foregrounding methods as well as findings, the platform demystifies how aquatic ecologists generate evidence—field observation, mesocosm experiments, laboratory analyses—and reveals cascading links between climate, food quality, cognition, and ecosystem health.

A distinctive feature of BrainFood is its co-creation and evaluation pipeline. The initial pilot set of five stories will be deployed at the Haus der Wildnis visitor center (Lunz am See, Austria) and an additional 10 stories will be created based on the feedback from our pilot users.

How to cite: Coulson, L. E., Feldbacher, E., Köck, B., Weigelhofer, G., Zitek, A., and Zavorka, L.: BrainFood: Semi-immersive, 360° learning experiences to communicate research on fish cognition, food quality, and climate change, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-6335, https://doi.org/10.5194/egusphere-egu26-6335, 2026.

EGU26-6754 | ECS | Posters on site | EOS1.5

GeoProVE – How to use Virtual Reality for Geophysics Education 

Joeri Brackenhoff, Paula Rulff, Jinqiang Chen, Pierre-Olivier Bruna, Alexandros Daniilidis, Arend-Jan Krooneman, Yosua Pranata Andoko, and Arno Freeke

Geophysics education is often challenging, as it entails explaining complicated physical processes that take place inside the Earth. Because these processes happen below the surface, it can be difficult for students to connect to the material and understand what is happening. As a result, it is hard for students to make the link between the abstract explanation of the processes and the physical measurements that are performed during fieldwork.

A novel way to close the gap between theory and fieldwork is the use of Virtual Reality (VR). VR allows a student to fully immerse themselves in a digital twin of reality and to experience and visualize processes that are invisible in real life. This is the purpose of Geoscience Processes Virtual Education (GeoProVE). In this application, we have developed a fully immersive and interactive scenario where a student can learn about Ground Penetrating Radar (GPR). The user performs a GPR measurement along a line and is guided with questions to understand how the data are acquired and why specific patterns arise. One of the major features is the ability to pull the subsurface out of the ground, to see how the waves propagate through it and interact with buried objects such as pipes and the water table. Several setups of increasing complexity are shown to the students, with a strong emphasis on challenge-based learning through a scoring system.

 

Aside from the GPR scenario, a scenario focused on offshore 3D seismics is also in development for GeoProVE, with the aim of creating additional scenarios focused on ERT and geothermal applications. GeoProVE is intended to become fully open source so that other developers can contribute to the knowledge base. The application has generated positive engagement from students in geophysics education. We will demonstrate the development of GeoProVE along with its main features.

How to cite: Brackenhoff, J., Rulff, P., Chen, J., Bruna, P.-O., Daniilidis, A., Krooneman, A.-J., Pranata Andoko, Y., and Freeke, A.: GeoProVE – How to use Virtual Reality for Geophysics Education, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-6754, https://doi.org/10.5194/egusphere-egu26-6754, 2026.

EGU26-9331 | Posters on site | EOS1.5 | Highlight

The Eye of Disaster: Development and Evaluation of a VR-Based Landslide Education System 

Yachen Zhou, Boyun Yu, and Takashi Oguchi

This study develops an immersive Virtual Reality (VR) pedagogical framework to mitigate the spatial cognitive constraints inherent in conventional two-dimensional disaster-prevention education. It focuses on the 2024 Noto Peninsula landslide events in Japan. By integrating high-precision digital elevation models and satellite imagery, the system reconstructs post-disaster terrains to facilitate high-fidelity risk communication. The interaction logic, governed by natural user interface principles, incorporates a multi-perspective switching mechanism that enables users to conduct comprehensive analyses of disaster sites across varying spatial scales.

 

The system architecture comprises two core modules: the "VR Geological Museum" for knowledge acquisition and the "Evacuation Simulation" for practical application, enabling deep transfer from conceptual understanding to survival skills. The former employs a task-driven strategy and a "macro-micro" dual-perspective observation model. It transforms abstract geological knowledge into intuitive interactive experiences through high-precision 3D reconstructions of landslide topography, effectively lowering the cognitive threshold for non-expert learners. Complementing this, the evacuation simulation module integrates official landslide-disaster warning area maps from the Geospatial Information Authority of Japan. Grounded in embodied cognition theory, this module implements a "trial-and-error" feedback mechanism. By navigating highly restored disaster evolution scenarios, users translate static warning information into dynamic survival capabilities, thereby completing the cognitive loop from theoretical understanding to behavioral practice.

 

The pedagogical efficacy of the system was empirically validated through a randomized controlled trial, utilizing multidimensional standardized metrics, including the Presence Questionnaire, the System Usability Scale, and the NASA Task Load Index for workload assessment. Experimental results demonstrate that the system significantly outperforms traditional text-based media in knowledge internalization, risk perception accuracy, and survival decision-making efficiency. The core contribution of this research lies in the deep integration of high-fidelity geospatial data with immersive interaction, establishing a verifiable technical paradigm for disaster education. This approach effectively dismantles barriers to professional knowledge. It enhances disaster preparedness and evacuation efficacy across diverse demographic backgrounds, providing a robust theoretical and technical foundation for the universalization of geohazard education.

How to cite: Zhou, Y., Yu, B., and Oguchi, T.: The Eye of Disaster: Development and Evaluation of a VR-Based Landslide Education System, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-9331, https://doi.org/10.5194/egusphere-egu26-9331, 2026.

EGU26-10349 | Posters on site | EOS1.5

Virtual Frontiers in Earth Magnetism: The legacy of MagNetZ webinar series  

Anita Di Chiara, Greig Paterson, Daniele Thallner, Florencia Milanese, Annique van der Boon, Raquel Bonilla-Alba, Claudio Robustelli-Test, Brendan Cych, Richard Bono, and Lesleis Nagy

The MagNetZ (Magnetic Network on Zoom) webinar series stands as a cornerstone for geomagnetism research, hosting online seminars since early 2020. Launched amid COVID-19 constraints, MagNetZ is convened by a team of scientists to give visibility to the scientific work of both leading scientists and early career researchers and to foster virtual collaboration, overcoming geographical limits for students and professionals alike. It promotes open science sharing, with broad appeal evidenced by international viewership and institutional ties. Each presentation typically begins with a short talk, followed by an interactive Q&A; both are recorded, post-edited, and published on YouTube (https://www.youtube.com/@MagNetZ) in a continuously growing archive of recorded content. The webinars are also uploaded to the EarthRef.org Digital Archive (ERDA), developed and maintained by the EarthRef.org Database Team, where they can be cited. Thus far, more than 80 webinars are available for viewing. MagNetZ also supports national meetings, such as the UK-based annual Magnetic Interactions meeting, offering them a platform to share their recordings; three meetings have been hosted so far. The webinars provide in-depth discussions on paleo- and rock magnetism and geomagnetic modeling, with topics spanning geo- and planetary magnetic field dynamics, mineral-property studies for paleoclimatic reconstructions, paleomagnetic data for geodynamic applications, archaeomagnetism, and more. The core aims of MagNetZ are to ensure accessibility across all genders, career stages, and geographical distributions, to enhance community networks, and to serve as an educational hub for magnetic data in tectonics and climate studies. Its YouTube platform ensures enduring access, sparking collaborations and awareness.

How to cite: Di Chiara, A., Paterson, G., Thallner, D., Milanese, F., van der Boon, A., Bonilla-Alba, R., Robustelli-Test, C., Cych, B., Bono, R., and Nagy, L.: Virtual Frontiers in Earth Magnetism: The legacy of MagNetZ webinar series , EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-10349, https://doi.org/10.5194/egusphere-egu26-10349, 2026.

EGU26-13637 | Posters on site | EOS1.5

VR-GeoLab: a platform for multi-hazard understanding and risk communication 

Mihai Micu, Cristina Dumitrica, Bianca Mitrica, Gabriela Morosanu, and Irena Roznovietchi

During the last decades, the compound effects of natural hazards such as landslides, floods/flash floods, and earthquakes have highlighted a priority field of scientific research (multi-hazard risk), driven by their increasingly severe impacts on society and the environment. The amplification of consequences through complex interaction mechanisms leads to increased exposure and prolonged recovery times for affected communities, thereby reducing overall resilience. Recently developed theoretical-methodological approaches, such as Virtual Reality (VR), offer enhanced opportunities to explore the evolution of landforms and resulting landscapes, enabling the discovery, calibration, and validation of advanced solutions for risk perception, understanding, awareness, communication, and management.

In this context, and in response to the challenges of a period marked by rapid environmental change, a new VR platform has started to be developed within the SPEER-A (Interreg) project, focusing on the Vrancea seismic region, the most important intermediate-depth seismic source in Europe and an area intensely affected by earthquakes, landslides, and flash floods. The objectives of the VR-GeoLab are: i) to create a VR-based transdisciplinary solution (following a co-creation, co-design, and co-dissemination approach) for real-time interaction between scientific research products and the stakeholders involved in the management of multi-hazard scenarios, which ii) integrates the results of scientific research into a modern, enhanced-reality, collaborative knowledge and relational framework, and iii) increases societal resilience by improving the spatio-temporal perception of multi-hazard environments through immersive, virtual representations of hazard interactions, conditioning factors, exposure, and vulnerability.

In this way, VR-GeoLab provides an innovative platform for promoting scientific results to a wide range of stakeholders in a multi-dimensional, integrated, interactive, immersive, and collaborative way, adding consistent value not only to educational promotion and capacity building but also to the opening of new research horizons through the integration of advanced digital interaction tools in future international research and educational projects. Acknowledgements: this work is supported by the Interreg NEXT Black Sea Basin Programme under grant agreement no. BSB01197 - Strengthening and Promoting Earthquake Emergency Response and Rescue Capacity in the BSB Area (SPEER-A).

How to cite: Micu, M., Dumitrica, C., Mitrica, B., Morosanu, G., and Roznovietchi, I.: VR-GeoLab: a platform for multi-hazard understanding and risk communication, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-13637, https://doi.org/10.5194/egusphere-egu26-13637, 2026.

EGU26-17240 | Posters on site | EOS1.5

The Sentinel Dehesa: A virtual tour of an ICOS Research Ecosystem Station 

Javier Pacheco-Labrador, Eduardo de la Cal Martín, M. Pilar Martín, Tarek S. El-Madany, María Dolores Raya-Sereno, Vicente Burchard-Levine, Lucía Casillas, Juan Ramón Bustos-Caparrós, and Jorge Lagranja

Scientists from different disciplines (e.g., eddy covariance fluxes, remote sensing, or ecology) work together at long-term ecosystem stations to monitor ecosystem responses to climate change. These stations are heavily equipped with automated sensors that measure continuously, and they support regular campaigns in which scientists take numerous samples and measurements. Networks of these stations have provided a critical understanding of ecosystems' responses to extreme events and other consequences of climate change, and society must therefore be aware of the relevance of this kind of infrastructure. However, presenting these stations to the general public or students is complex, as they may be located in isolated areas, and hosting large numbers of visitors can perturb the ecosystem, affecting observations and their interpretation. Furthermore, the diversity of topics and knowledge gathered at these stations can easily overwhelm communication efforts.

In this context, virtual reality offers unmatched advantages for bringing the general public to these research stations from anywhere. We present the “Sentinel Dehesa” virtual tour, a virtual reality environment of the ecosystem station at Majadas de Tiétar, in Cáceres, Spain, which is included in the Integrated Carbon Observation System (ICOS). The station monitors a Mediterranean savanna, an agroecosystem characterized by its sustainability but jeopardized by climate change. The station continuously measures surface-atmosphere energy, carbon, and water fluxes using micrometeorological and eddy covariance techniques. Furthermore, remote sensing scientists conduct regular campaigns to measure vegetation spectral and biophysical properties and relate them to satellite imagery. In this virtual environment, visitors can learn about the sensors and measurements performed on the site as they move through different information points that provide multilingual content.

This virtual tour is available both for VR goggles and web browsers (https://speclab.csic.es/en/) and has been used for educational and outreach activities, attracting the interest of secondary students and being highly valued by their teachers.

How to cite: Pacheco-Labrador, J., de la Cal Martín, E., Martín, M. P., El-Madany, T. S., Raya-Sereno, M. D., Burchard-Levine, V., Casillas, L., Bustos-Caparrós, J. R., and Lagranja, J.: The Sentinel Dehesa: A virtual tour of an ICOS Research Ecosystem Station, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-17240, https://doi.org/10.5194/egusphere-egu26-17240, 2026.

EGU26-17709 | EOS1.5

Lexcube: A multi-platform "data cube" ecosystem for immersive exploration of Earth system datasets in education, outreach and research 

Maximilian Söchting and Miguel D. Mahecha

Many widely used tools for communicating and teaching Earth observation and modeled climate data still struggle to convey spatiotemporal phenomena: visualization is often limited to 2D map views, interfaces can prove difficult for non-experts, and workflows might not be easily transferred from curated examples to large or in-progress research datasets. This creates a gap between public-facing visualization, classroom use, and research workflows.

We present Lexcube, a multi-platform ecosystem for interactive exploration and visualization of Earth system "data cubes", i.e., large remote sensing and modelled data sets. Lexcube provides an immersive, interactive 3D "data cube" view where all dimensions (space and time) are treated equally, enabling users to easily reveal spatiotemporal dynamics that are not visible in 2D map-based interfaces. Over the last years of development, Lexcube has been used in education, outreach, and research. Our goal was to emphasize intuitive navigation and low barriers to entry while remaining capable of visualizing large data sets with minimal hassle. Lexcube has been deployed in multiple forms: 

  • (1) Lexcube.org, an interactive data cube exploration and visualization web app, with no coding or infrastructure required. It runs on desktop and mobile devices with minimal hardware requirements, and has been regularly used in teaching.
  • (2) Lexcube for Jupyter, an open-source Python package aimed at scientists that allows any 3D data set to be visualized as an interactive data cube in Jupyter notebooks (see the sketch after this list).
  • (3) Two museum exhibits, featuring simplified versions of the Lexcube.org interface with curated data sets and explainer texts relevant to the respective exhibition.
  • (4) A physical interactive data cube, a large museum-style installation that displays data cubes in physical space through five square touch screens assembled in the shape of a cube, offering the same capabilities and data sets as Lexcube.org while proving even more accessible, as no virtual 3D environment or software has to be navigated at all.
  • (5) The option to create physical paper data cubes from templates generated by Lexcube, assembled by cutting and gluing, offering a low-cost and engaging piece of science communication.
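
As an illustration of the Jupyter route mentioned in item (2), the minimal sketch below wraps an xarray variable in an interactive cube widget. It assumes the lexcube package's published Cube3DWidget entry point; the dataset URL and variable name are placeholders, not a specific Lexcube example.

```python
# Minimal sketch of Lexcube in a notebook (entry-point name follows the
# package's published examples; dataset URL and variable are placeholders).
import xarray as xr
import lexcube

# Lazily open a (time, lat, lon) data cube, e.g. from a Zarr store.
ds = xr.open_zarr("https://example.org/esdc.zarr")  # hypothetical store
da = ds["air_temperature_2m"]                       # any 3D variable works

# Wrap the array in an interactive 3D cube widget; the last expression in a
# notebook cell renders the cube, with all dimensions shown on its faces.
widget = lexcube.Cube3DWidget(da, cmap="viridis")
widget
```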

In the future, we are looking to strengthen the education and science communication use cases for the Lexcube platform and are very interested in feedback and ideas for possible future developments. These could include a virtual reality deployment, particularly for exploring extreme events as 3D voxel clouds over space and time, as well as simple data processing operators beyond pure data visualization.

How to cite: Söchting, M. and Mahecha, M. D.: Lexcube: A multi-platform "data cube" ecosystem for immersive exploration of Earth system datasets in education, outreach and research, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-17709, https://doi.org/10.5194/egusphere-egu26-17709, 2026.

EGU26-20973 | ECS | Posters on site | EOS1.5

Interactive 3D Earth Models in Unity to Visualise TROPOMI Satellite Climate Data 

Amarpal Sahota, Elena Fillola, Adrian K. T. Ng, Jeff Clark, Nawid Keshtmand, Matt Rigby, and Raul Santos-Rodriguez

Climate data are complex to understand and observe, with variables covering the 3D surface of the globe and changing over time. Consequently, it is difficult to convey rich climate information through static 2D images. To address this, we designed and built interactive 3D Earth models using the Unity game engine to visualise data from the Tropospheric Monitoring Instrument (TROPOMI) satellite. These virtual world models aim to help researchers share climate insights more effectively and make them accessible to the public. 

The immersive environment presents a number of 3D Earth objects. The first displays the ‘XCH4’ variable (column-averaged dry-air mole fraction of methane in ppb) for an entire month, allowing the user to cycle through months via a controller. The Earth spins on its axis, automatically displaying methane concentrations, while the user can manually adjust the view to inspect regions of interest. A second Earth object features an automatic animation displaying the density of data points collected by the TROPOMI satellite as days progress. We also render auxiliary reference globes without thematic overlays and include a 2D static plot of atmospheric methane (ppb) for 2023 for comparison. 
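
The monthly XCH4 layers imply a preprocessing step from swath-based TROPOMI level-2 soundings to a regular global grid that can be colour-mapped onto the Earth object. The sketch below shows one generic way to do such binning with xarray and numpy; the file, group, and variable names follow common Sentinel-5P level-2 conventions but are assumptions here, not the authors' actual pipeline.

```python
# Hypothetical preprocessing sketch: bin one month of TROPOMI L2 methane
# soundings onto a 1-degree grid for use as an equirectangular Earth texture.
# Group/variable names follow common S5P L2 conventions and are assumptions.
import numpy as np
import xarray as xr

ds = xr.open_dataset("S5P_L2_CH4_example.nc", group="PRODUCT")  # illustrative file
xch4 = ds["methane_mixing_ratio_bias_corrected"].values.ravel()  # ppb
lat = ds["latitude"].values.ravel()
lon = ds["longitude"].values.ravel()
ok = np.isfinite(xch4)

lat_edges = np.arange(-90.0, 90.5, 1.0)
lon_edges = np.arange(-180.0, 180.5, 1.0)
total, _, _ = np.histogram2d(lat[ok], lon[ok], bins=[lat_edges, lon_edges], weights=xch4[ok])
count, _, _ = np.histogram2d(lat[ok], lon[ok], bins=[lat_edges, lon_edges])

# Mean XCH4 per cell; empty cells stay NaN and can be rendered transparent.
monthly_mean = np.where(count > 0, total / np.maximum(count, 1), np.nan)
```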

The entire layout is optimised for immersive systems, specifically those where the user is positioned centrally within a 360-degree display ring, such as the Reality Emulator at the University of Bristol, a VR-enabled ‘CAVE’ system. Audience feedback thus far has been highly positive: the immersive 3D visualisation gave participants a clearer view of methane concentrations across the Earth and deepened their interest in the planet’s atmosphere. It also sparked curiosity about the factors affecting atmospheric composition, prompting many questions about methane sources and satellite monitoring. This setup demonstrates the potential of virtual reality for communicating high-dimensional Earth science data. 

How to cite: Sahota, A., Fillola, E., K. T. Ng, A., Clark, J., Keshtmand, N., Rigby, M., and Santos-Rodriguez, R.: Interactive 3D Earth Models in Unity to Visualise TROPOMI Satellite Climate Data, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-20973, https://doi.org/10.5194/egusphere-egu26-20973, 2026.

EGU26-23234 | Posters on site | EOS1.5

Using Virtual Reality to Support Hazards and Risk Education 

Bruce Malamud, Elizabeth Follows, and Finlay Trasler

Teaching hazards and risk often requires engagement with complex, dynamic and inaccessible environments. Virtual reality (VR) provides a practical means of supporting immersive, place-based learning. This contribution presents the use of VR as a facilitated teaching tool within hazards and risk education.

VR sessions were delivered to master's and undergraduate students (one session of 12 students), 2nd-year undergraduate students (two sessions of 13 students) and pre-university Sixth Form students (two sessions of 12 students) using Meta Quest 3 and Quest Pro headsets and the Wander platform of global Google Street View (and user-uploaded) imagery. The sessions included virtual visits to hazard-relevant locations, including informal settlements in Kenya, earthquake-affected urban environments in Japan (using before-and-after imagery to examine building tilt), rockfall-prone landscapes in Nepal, time-lapse environmental change in Durham, a broader VR-based field trip to Israel, and a session following a kayaker along the coastline of Oman. Each activity combined guided VR exploration with structured discussion of hazard processes, exposure, vulnerability and resilience.

The use of VR supported spatial understanding, comparison between contrasting hazard contexts, and student engagement. Key considerations included group size, facilitation, accessibility, and the importance of integrating VR with non-digital teaching methods rather than using VR in isolation. These examples demonstrate how immersive technologies can be effectively incorporated into hazards and risk education across educational levels, while highlighting the need for critical reflection on learning outcomes and evaluation.

How to cite: Malamud, B., Follows, E., and Trasler, F.: Using Virtual Reality to Support Hazards and Risk Education, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-23234, https://doi.org/10.5194/egusphere-egu26-23234, 2026.

ESSI3 – Open Science Informatics for Earth and Space Sciences

EGU26-1682 | Posters on site | ESSI3.1

Beyond Retention Periods: Appraising Climate Data Across Complementary Archives 

Eileen Hertwig, Andrea Lammert, and Andrej Fast

Efficient long-term data archiving is essential for advancing climate research, where increasingly complex simulations generate vast and heterogeneous datasets that must remain accessible, traceable, and reusable across disciplines and timescales. To meet the diverse, user-specific needs of Earth System Sciences (ESS) researchers, the German Climate Computing Center (DKRZ) provides two complementary archival systems: WDCC and DOKU. 

The World Data Center for Climate (WDCC) serves as a formal, FAIR-aligned repository for climate model outputs and related datasets. It assigns persistent identifiers via DataCite DOIs, preserves rich and standardized metadata, and ensures interoperability, thereby promoting data sharing and supporting long-term scientific reuse. Mature datasets intended for public dissemination and sustained reuse therefore fall within the scope of WDCC. DOKU, by contrast, is a lightweight, flexible solution tailored to project- and user-specific requirements within DKRZ. It offers structured long-term storage (typically guaranteed for ten years) for data that are not (yet) ready for formal publication but remain important for internal reference, validation, or project continuity.

A central question for researchers, however, is not only where but also which data should be kept, and for how long. While DOKU and WDCC each provide a baseline retention period of ten years, it is worth asking whether time alone is really the decisive criterion for preservation. Instead, appraisal could consider scientific value, future potential for reuse, the cost of regeneration, and broader considerations of good scientific practice. 

Starting with DOKU, appraisal criteria and workflows are currently being developed to determine the fate of archived data once the guaranteed retention period has expired. Data that continue to play a significant role within their project or show clear potential for reuse might remain in DOKU longer, while other data may be deleted. Once this workflow has been established and tested, it could also serve as a blueprint for WDCC. 

Together, WDCC and DOKU form a coherent strategy for sustainable data management, providing scalable infrastructure, FAIR principles, and software solutions that meet specific user needs while supporting responsible, long-term stewardship of climate data across the ESS community.

How to cite: Hertwig, E., Lammert, A., and Fast, A.: Beyond Retention Periods: Appraising Climate Data Across Complementary Archives, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-1682, https://doi.org/10.5194/egusphere-egu26-1682, 2026.

EGU26-2668 | ECS | Posters on site | ESSI3.1

Design of a new KARI Satellite Image Service Platform 

Myungjun Lee, Jaeung Han, Gapho Jeun, and Min-A Kim

To support academic and public applications—particularly those related to disaster monitoring, urban development, and environmental studies in specific regions and events—the Korea Aerospace Research Institute (KARI) has operated the Korea Satellite Information Database (KSATDB) web service since 2017. The platform is designed to enhance the usability of Korea Multi-Purpose Satellite (KOMPSAT) imagery across various domains by selectively releasing high-resolution, tile-formatted images.
To further improve public accessibility and foster scientific research in remote sensing and geophysical applications, KARI is developing an enhanced KSATDB platform over a four-year period (2025–2028). This initiative aims to broaden open access to high-resolution satellite data and to establish an integrated infrastructure for data acquisition, processing, and dissemination.
The upgraded system consists of two principal components: the Order Request and Distribution Subsystem (ORDS), which enables orbit-based image acquisition and rapid data delivery, and the Satellite Public Service Subsystem (SPSS), which provides free access to high-resolution imagery for academic and research purposes. The SPSS is composed of two functional modules: a base processing module and a web-based visualization module.
The base processing module generates multi-channel, high-resolution tile datasets for disaster-affected regions, representative natural scenery, and major cities, incorporating geometric and radiometric corrections to ensure data accuracy and consistency. The web-based visualization module supports real-time rendering of processed tile data in response to user interactions. It integrates advanced visualization tools, including Curtain-view and Geo-linkage comparison functions, which facilitate intuitive analysis of temporal variations.
Furthermore, the platform incorporates a spectral synthesis function to support environmental and geophysical analyses through the web interface. Through this enhancement of KSATDB, KARI aims to advance both academic research and practical applications of satellite imagery in diverse disaster-related studies and to contribute to the broader scientific understanding of remote sensing and geophysical phenomena.

How to cite: Lee, M., Han, J., Jeun, G., and Kim, M.-A.: Design of a new KARI Satellite Image Service Platform, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-2668, https://doi.org/10.5194/egusphere-egu26-2668, 2026.

EGU26-2808 | Orals | ESSI3.1

Extending Data Catalogues to FAIRly Describe and Cite Ocean Science Resources: The ODATIS Experience 

Clémence Cotten, Mickaël Treguer, Erwan Bodéré, Amandine Thomas, Julien Meillon, and Erwann Quimbert

Earth system science relies on the integration of heterogeneous observations, models and methods across atmosphere, ocean, land and biosphere. In ocean science in particular, the analysis of complex and multidisciplinary datasets strongly depends on scripts, software, virtual research environments (VREs) and scientific services developed within laboratories and research projects. Yet these digital resources often remain scattered, poorly documented and difficult to discover, limiting their reuse, citation and contribution to FAIR and interoperable ocean science.

Within the ODATIS Ocean hub of the French research infrastructure Data Terra, acting as an EOSC node, we developed a concrete and operational solution to address this gap by extending a long-standing national ocean data catalogue towards a FAIR catalogue of ocean-related scientific resources. ODATIS has historically relied on GeoNetwork, the Sextant platform and the ISO 19115 standard to catalogue ocean datasets. Building on this foundation, we adapted the ISO 19115 metadata model and the Sextant catalogue to describe a wider range of resources relevant to ocean science, including scripts, software, applications, VREs, scientific support services and training materials, while remaining fully interoperable with existing data catalogues.
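
To illustrate the interoperability this preserves, the sketch below queries a GeoNetwork-based catalogue through its standard OGC CSW interface using OWSLib; the endpoint URL and search term are illustrative assumptions rather than an official ODATIS access point.

```python
# Illustrative sketch: discovering ISO 19115 records via the OGC CSW
# interface of a GeoNetwork/Sextant catalogue. The endpoint URL is an
# assumption; consult the ODATIS/Sextant documentation for actual services.
from owslib.csw import CatalogueServiceWeb
from owslib.fes import PropertyIsLike

csw = CatalogueServiceWeb("https://sextant.ifremer.fr/geonetwork/srv/eng/csw")  # assumed

# Free-text search across record metadata.
query = PropertyIsLike("csw:AnyText", "%ocean%")
csw.getrecords2(constraints=[query], maxrecords=10)

for rec_id, rec in csw.records.items():
    print(rec_id, "-", rec.title)
```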

This development was carried out within the Gaia Data project of Data Terra, in close collaboration across thematic poles, leading to the co-design of controlled vocabularies dedicated to development languages, resource sub-types and data life-cycle stages. These shared semantic artefacts enrich metadata with machine-actionable information, enable faceted discovery, and strengthen semantic interoperability within and beyond the ocean community.

The resulting ODATIS resource catalogue is now operational and used in several national initiatives, including the PEPR BRIDGES programme and Gaia Data activities such as the “support to oceanographic campaigns” task, which develops a portfolio of services for cruise principal investigators. The catalogue provides a national entry point to assign DOIs to scripts and software required for scientific publications, while offering a visible showcase for ocean-related tools, services and VREs developed by laboratories, support teams and projects.

Beyond discovery and citation, the catalogue is designed as a foundation for a knowledge graph linking ocean datasets, processing tools, computational environments, services and training resources. This supports interdisciplinary and end-to-end use cases, connecting data to the methods and VREs required for their analysis. Community engagement is ensured through a coordinated outreach to ODATIS laboratory correspondents across France, fostering co-design, adoption and the creation of resource records. Altogether, this success story demonstrates how metadata standards, semantic interoperability and VRE-oriented cataloguing can effectively support FAIR and collaborative ocean science.

How to cite: Cotten, C., Treguer, M., Bodéré, E., Thomas, A., Meillon, J., and Quimbert, E.: Extending Data Catalogues to FAIRly Describe and Cite Ocean Science Resources: The ODATIS Experience, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-2808, https://doi.org/10.5194/egusphere-egu26-2808, 2026.

EGU26-3613 | Posters on site | ESSI3.1

Open-Source Data and Digital Platforms: Catalyzing Scalable & Collaborative Solutions for Clean Air through OpenAQ 

Chris Hagerbaumer, Colleen Rosales, and Russell Biggs

OpenAQ is the world's largest open-source, open-access air quality data platform. It provides over 2 billion measurements from more than 22,500 sources across 142 countries. OpenAQ offers a standardized and harmonized approach to accessing diverse air quality data from a wide variety of air sensors and reference-grade monitors. The platform enhances data findability, accessibility, interoperability, and reusability for various research and application needs.

By openly sharing air quality data, OpenAQ maximizes the value of the data collected, leveraging the skills of interested parties in and beyond the community to produce the scientific research, communications, and evidence-based solutions needed to reduce air pollution.

OpenAQ trains groups on how to use the platform, prioritizing those working to reduce air pollution in vulnerable communities, and OpenAQ provides leadership training for emerging air quality leaders in low- and middle-income countries through its Clean Air Community Ambassador Program.

This session provides an overview of OpenAQ’s work to democratize access to air quality data, including such tools, resources and programs as: 

  • OpenAQ Explorer (https://explore.openaq.org): A user-friendly tool for visualizing and downloading harmonized air quality data from a global map
  • Programmatic Access: Advanced options for data integration, available through the OpenAQ API and accessible via an OpenAQ R client and Python SDK (see the sketch after this list)
  • AQI Hub (https://aqihub.info/): A resource for understanding and comparing how different countries report Air Quality Indices (AQIs)
  • Clean Air Community Ambassador Program (https://ambassadors.openaq.org/): a program to empower the next generation of changemakers to use air quality data in support of community action
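
For the programmatic route listed above, the sketch below calls the OpenAQ HTTP API directly with requests; the v3 endpoint path, parameters, and API-key header reflect the public documentation as we recall it and should be verified against the current reference.

```python
# Sketch of programmatic OpenAQ access: find monitoring locations within
# 10 km of Vienna. Endpoint/parameter names follow the public v3 API docs
# and should be double-checked; an API key is assumed to be required.
import os
import requests

resp = requests.get(
    "https://api.openaq.org/v3/locations",
    params={"coordinates": "48.2082,16.3738", "radius": 10000, "limit": 5},
    headers={"X-API-Key": os.environ["OPENAQ_API_KEY"]},  # free key from OpenAQ
    timeout=30,
)
resp.raise_for_status()
for loc in resp.json()["results"]:
    print(loc["id"], loc["name"])
```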

 

How to cite: Hagerbaumer, C., Rosales, C., and Biggs, R.: Open-Source Data and Digital Platforms: Catalyzing Scalable & Collaborative Solutions for Clean Air through OpenAQ, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-3613, https://doi.org/10.5194/egusphere-egu26-3613, 2026.

EGU26-6042 | ECS | Posters on site | ESSI3.1

The Contribution of Open Science Research in Geosciences to Other Fields of Knowledge: A Bibliometric Analysis 

Andrey Santos, Camyla Santos, Eliana Bahia, Isaias Bianchi, and Rafael Moré

The field of Geosciences has made significant contributions to the advancement of the Open Science movement. As a data-intensive domain, its engagement with this movement has fostered the expansion and use of FAIR data, open data, open and reproducible research practices, as well as open access to scientific publications and to the databases produced within the field. This study aims to visualize and analyze the contribution of Open Science research in Geosciences to other fields of knowledge, based on the citations received by these publications. To this end, a bibliometric analysis was conducted on documents indexed in the Scopus database, selected for its consistent classification of Subject Areas; the search was limited to Environmental Science, Earth and Planetary Sciences, and Agricultural and Biological Sciences. The search strategy yielded 2,892 publications related to Open Science in the context of Geosciences. The analysis of citing documents revealed that 27 fields of knowledge refer to these publications, with Social Sciences, Computer Science, Engineering, and Energy being particularly prominent. Additionally, keyword co-occurrence and co-authorship analyses were performed in order to identify the main research themes, patterns of scientific collaboration, and core clusters of intellectual production associated with Open Science in Geosciences. These procedures made it possible to highlight the interdisciplinary nature of the field and the role of Geosciences as a vector for the diffusion of open practices across other areas of knowledge. It is concluded that Open Science research developed within Geosciences exerts a significant influence on multiple scientific domains, contributing to the consolidation of collaborative, transparent, and data-sharing-oriented research practices.

How to cite: Santos, A., Santos, C., Bahia, E., Bianchi, I., and Moré, R.: The Contribution of Open Science Research in Geosciences to Other Fields of Knowledge: A Bibliometric Analysis, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-6042, https://doi.org/10.5194/egusphere-egu26-6042, 2026.

EGU26-7484 | ECS | Posters on site | ESSI3.1

DIGIVOLCAN: a multi-parametric database for the volcano monitoring of the Canary Islands (Spain)  

Pablo López-Díaz, Luca D'Auria, Sergio de Armas-Rillo, Aarón Álvarez-Hernández, David M. van Dorth, Rubén García-Hernández, Manuel Calderón-Delgado, Víctor Ortega-Ramos, and Nemesio M. Pérez

Modern volcanic monitoring requires managing multidisciplinary, multiparametric, large-volume datasets. A robust digital framework is therefore essential for integrating, managing, storing, processing, and visualizing these data streams consistently. Here, we present such a framework, comprising a SQL-based database, a Flask web application, and an automated scheduler that ensures continuous data ingestion and updating. 

Within the DIGIVOLCAN project, led by the Instituto Volcanológico de Canarias (INVOLCAN), we developed a multiparametric database to support volcano monitoring in the Canary Islands. The database integrates data from permanent monitoring networks, discrete field surveys, and remote sensing, and is implemented using PostgreSQL. Serving as the core of the framework, the database is optimized with indexed tables that enable rapid querying, even for datasets with millions of records. Spatial data are handled with PostGIS, a PostgreSQL extension that provides efficient spatial data storage and operations. In contrast, time-series data are managed with TimescaleDB, which significantly accelerates time-series queries. Together, these technologies ensure secure storage, high performance, and seamless interaction with the DIGIVOLCAN web interface, enabling rapid visualization of large, complex datasets. 
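
As a concrete illustration of this storage design, the sketch below combines a PostGIS geometry column with a TimescaleDB hypertable; the table and column names are invented for illustration and do not reproduce the actual DIGIVOLCAN schema.

```python
# Illustrative sketch (invented schema, not DIGIVOLCAN's): a monitoring
# table with a PostGIS location column, turned into a TimescaleDB
# hypertable so time-bounded queries over millions of rows stay fast.
import psycopg2

conn = psycopg2.connect("dbname=volcano_monitoring user=ingest")  # assumed DSN
with conn, conn.cursor() as cur:
    cur.execute("""
        CREATE TABLE IF NOT EXISTS gas_flux (
            time     TIMESTAMPTZ NOT NULL,
            station  TEXT        NOT NULL,
            co2_flux DOUBLE PRECISION,        -- e.g. g m-2 d-1
            geom     GEOMETRY(Point, 4326)    -- PostGIS station location
        );
    """)
    # TimescaleDB transparently partitions the table by time.
    cur.execute("SELECT create_hypertable('gas_flux', 'time', if_not_exists => TRUE);")
    # Spatial index for the map views served by the web portal.
    cur.execute("CREATE INDEX IF NOT EXISTS gas_flux_geom_idx ON gas_flux USING GIST (geom);")
```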

This digital infrastructure is designed to serve multiple user communities and operational needs. It provides accessible, high-level information to the general public, more detailed datasets to civil protection authorities, and comprehensive, multiparametric analyses to scientific committees during seismo-volcanic crises. The system functions as both an operational tool for routine daily monitoring and a rapid-response platform during volcanic emergencies, delivering advanced maps and time-series visualizations. In addition, it serves as a scientific research tool by facilitating the integrated analysis and comparison of geophysical and geochemical datasets, including compatibility with advanced AI-based data analysis workflows. 

Access to the database is provided through a web portal that implements role-based access control. At the basic level, intended for the general public, users can explore an interactive, real-time earthquake map of the Canary Islands with customizable filters for intuitive visualization. Higher access levels unlock additional functionality, allowing advanced users to visualize thematic maps and time-series plots of key volcano-monitoring parameters. These include, for example, gravimetry, ground deformation from GNSS and satellite interferometry, self-potential data, discrete and continuous diffuse soil gas fluxes (CO₂, H₂S), and numerous other raw and processed geophysical and geochemical variables. 

Finally, the modular architecture of the infrastructure enables straightforward expansion and long-term evolution, supporting the integration of new monitoring parameters as well as the development of additional map types and graphical representations within the web interface. 

How to cite: López-Díaz, P., D'Auria, L., de Armas-Rillo, S., Álvarez-Hernández, A., M. van Dorth, D., García-Hernández, R., Calderón-Delgado, M., Ortega-Ramos, V., and M. Pérez, N.: DIGIVOLCAN: a multi-parametric database for the volcano monitoring of the Canary Islands (Spain) , EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-7484, https://doi.org/10.5194/egusphere-egu26-7484, 2026.

EGU26-9081 | Posters on site | ESSI3.1

From Data Rocks to FAIR Peaks: With NFDI4Earth’s services towards Harmonized Metadata and User-Centered Tools for Earth System Research  

Christin Henzen, Nadia Aouadi, Anna Brauer, Robert Brylka, Auriol Degbelo, Jonas Grieb, Ralf Klammer, Markus Konkol, Roland Koppe, Kemeng Liu, Tom Niers, Daniel Nüst, and Alexander Wellmann

Research data management (RDM) in the Earth system sciences is complex and can be frustrating. Data come in many shapes and formats—observations, model outputs, samples, derived products—and are spread across a wide range of repositories, services, and software ecosystems. These infrastructures differ greatly in metadata quality, interoperability, and FAIR maturity. For researchers, this often means spending too much time figuring out where to publish data, how to describe it properly, or how to reuse existing datasets and software. At the same time, expectations from funders and journals on data outputs continue to rise with respect to openness, curation, and long-term stewardship.  

Within the German National Research Data Infrastructure (NFDI), the NFDI4Earth consortium tackles these challenges by building practical, community-driven solutions for the Earth system sciences. As developers and designers, our focus is on lowering barriers and making FAIR data and software practices easier to understand and apply in everyday research. In this contribution, we introduce two key building blocks of this effort: the Knowledge Hub and the OneStop4All. 

The Knowledge Hub (https://knowledgehub.nfdi4earth.de) is a knowledge graph that connects heterogeneous Earth system resources, e.g., datasets, repositories, services, software, and educational materials, using a harmonized metadata model. It harvests metadata from multiple providers—ranging from global aggregators to national and domain-specific services (see, for instance, the Helmholtz DataHub: https://earth-data.de/)—and exposes them through a well-defined SPARQL API. This allows both humans and machines to query, explore, and reuse metadata consistently, and enables developers to build custom applications on top of it. 
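
As a sketch of what machine-level access to such a graph can look like, the query below retrieves dataset titles over the standard SPARQL protocol; the endpoint path and the plain DCAT/DCTERMS modelling are assumptions for illustration, and the Knowledge Hub documentation defines the actual endpoint and schema.

```python
# Sketch of querying a knowledge graph via the SPARQL protocol. Endpoint
# path and DCAT/DCTERMS modelling are assumptions for illustration only.
import requests

ENDPOINT = "https://knowledgehub.nfdi4earth.de/sparql"  # assumed path

QUERY = """
PREFIX dcat: <http://www.w3.org/ns/dcat#>
PREFIX dct:  <http://purl.org/dc/terms/>
SELECT ?dataset ?title WHERE {
  ?dataset a dcat:Dataset ;
           dct:title ?title .
} LIMIT 10
"""

resp = requests.get(
    ENDPOINT,
    params={"query": QUERY},
    headers={"Accept": "application/sparql-results+json"},
    timeout=30,
)
resp.raise_for_status()
for row in resp.json()["results"]["bindings"]:
    print(row["dataset"]["value"], "-", row["title"]["value"])
```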

The OneStop4All (https://onestop4all.nfdi4earth.de) builds on the Knowledge Hub to offer a discovery and guidance portal for researchers. It brings together the resources in a single, coherent interface. Cross-domain search and guided navigation help users move along the research data lifecycle without having to know all standards and infrastructures upfront. A central feature is the Repository Wizard, an interactive decision-support tool that helps researchers find suitable repositories for publishing their data based on data type, discipline, and policy constraints. In addition, the NFDI4Earth Label provides a transparent, community-oriented way to communicate repository quality with respect to FAIR principles, sustainability, and relevance for Earth system sciences. 

Beyond discovery, the OneStop4All puts a strong emphasis on learning and cultural change. It provides integrated access to open educational resources, good-practice guides, and showcases that demonstrate the concrete benefits of FAIR and open data and software. A domain-specific chatbot complements these resources by answering practical questions on metadata, licensing, data publication, and software citation. 

We will showcase these services, share lessons learned from development, and highlight opportunities for community contributions. From our perspective as a distributed team of designers and software developers, combining harmonized metadata, user-centered services, and hands-on training is key to making FAIR and open research practices work at scale in the Earth system sciences. 

How to cite: Henzen, C., Aouadi, N., Brauer, A., Brylka, R., Degbelo, A., Grieb, J., Klammer, R., Konkol, M., Koppe, R., Liu, K., Niers, T., Nüst, D., and Wellmann, A.: From Data Rocks to FAIR Peaks: With NFDI4Earth’s services towards Harmonized Metadata and User-Centered Tools for Earth System Research , EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-9081, https://doi.org/10.5194/egusphere-egu26-9081, 2026.

During the FAIR-EASE project, which aims to deliver integrated and FAIR-compliant services for Earth and environmental sciences, a key challenge emerged: enabling interoperable and transparent data processing across disciplines and Virtual Research Environments (VREs). Earth system science increasingly relies on complex workflows combining heterogeneous data, models, and tools that often remain confined within  technical silos and domain-specific environments, limiting cross-disciplinary reuse and collaboration.

Galaxy, a widely adopted open-source platform for FAIR data analysis, plays a central role within FAIR-EASE. It provides strong capabilities for sharing, executing, and reproducing scientific workflows. However, while it excels at sharing and executing scientific processes, Galaxy remains difficult to integrate into the broader geospatial ecosystem. Its native API is not aligned with the standards commonly used by geospatial and Earth Observation communities, creating a significant barrier to interoperability with external tools and platforms.

This limitation directly affects the "I" in FAIR (Interoperability) when connecting Galaxy-based VREs to other environments. While the geospatial community primarily relies on Open Geospatial Consortium (OGC) standards—such as Web Processing Service (WPS) and OGC API Processes—or on community-driven standards like OpenEO, Galaxy exposes its processing through a specific API that is not natively understood outside its ecosystem. As a result, Galaxy workflows, although FAIR within their own environment, remain partially isolated from standards-based geospatial infrastructures.

To address this gap, Geomatys focused during the FAIR-EASE project on exposing Galaxy workflows through geospatial standards widely adopted across the community. Our approach relies on the open-source geospatial platform Examind Community, which acts as a standards-compliant gateway between Galaxy and external clients. By mapping Galaxy workflows to OGC API Processes and WPS, users can discover, configure, and execute Galaxy workflows using familiar and widely used geospatial interfaces. This interoperability layer was subsequently extended to support the OpenEO standard, enabling Earth Observation users to access Galaxy workflows through an API increasingly adopted across the Earth Observation community.

To further simplify this ecosystem, we initiated the development of a lightweight bridge based on FastAPI (Python). This micro-service provides a transparent translation layer between the Galaxy API and the OGC API Processes or OpenEO APIs. Designed to be modular and easy to deploy, it offers a pragmatic solution for institutions wishing to expose their Galaxy instances to the geospatial ecosystem without the overhead of a full-scale infrastructure. 
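
The sketch below conveys the shape of such a bridge: a FastAPI service that presents Galaxy workflows as an OGC API Processes process list. It is a generic illustration, not the Geomatys micro-service; Galaxy's /api/workflows route and response fields follow Galaxy's public REST API and may differ between versions.

```python
# Generic sketch of an OGC API Processes facade over Galaxy (not the
# FAIR-EASE bridge itself). The Galaxy route and response fields are based
# on its public REST API and may vary between Galaxy versions.
import httpx
from fastapi import FastAPI

GALAXY_URL = "https://galaxy.example.org"  # hypothetical Galaxy instance
GALAXY_KEY = "REPLACE_WITH_API_KEY"

app = FastAPI(title="Galaxy-to-OGC bridge (sketch)")

@app.get("/processes")
async def list_processes():
    """Expose Galaxy workflows in the OGC API Processes 'process list' shape."""
    async with httpx.AsyncClient() as client:
        r = await client.get(f"{GALAXY_URL}/api/workflows", params={"key": GALAXY_KEY})
        r.raise_for_status()
    # Map Galaxy workflow summaries onto the OGC 'processes' document shape.
    return {
        "processes": [
            {"id": wf["id"], "title": wf.get("name", wf["id"]), "version": "1.0.0"}
            for wf in r.json()
        ]
    }
```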

By normalizing access to Galaxy workflows through standard interfaces, FAIR-EASE demonstrates how "VRE-powered access" can be achieved in practice by leveraging existing standards to eliminate technical barriers. This work significantly broadens Galaxy's user community and enables researchers to integrate Galaxy-based processing into external tools and workflows. Our experience highlights that advancing interoperable Earth system science does not require creating new platforms but rather building robust bridges between mature existing tools, such as Galaxy, and the ecosystem of existing geospatial standards.

How to cite: Ginane, D. and Bialota, Q.: Bridging Galaxy and Geospatial Standards: Enabling Interoperable VRE Workflows for Earth System Science, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-9280, https://doi.org/10.5194/egusphere-egu26-9280, 2026.

EGU26-9951 | Posters on site | ESSI3.1

Accelerating Data Discovery: Automated, Scalable Harvesting and Indexing of Metadata Across Heterogeneous Storage Backends 

Christopher Kadow, Martin Bergemann, Mostafa Hadizadeh, Manuel Reis, and Etor Lucio Eceiza

The discoverability and accessibility of climate and Earth-system datasets are foundational for effective scientific analysis workflows, yet these datasets are often hosted across diverse storage systems and follow a variety of organisational conventions. Researchers and infrastructure engineers face challenges in ingesting distributed metadata into unified, searchable catalogues without sacrificing interoperability or scalability. Efficient metadata harvesting, normalisation, and ingestion at scale are therefore critical enablers for data discovery and FAIR (Findable, Accessible, Interoperable, and Reusable) data practices.

To address this need, we present the Metadata Crawler, a metadata ingestion tool designed to automate the collection and indexing of climate dataset metadata across heterogeneous storage backends. The Metadata Crawler supports multi-backend discovery, including POSIX file systems, S3/MinIO object stores, and OpenStack Swift, enabling infrastructure administrators to aggregate metadata from local archives, cloud object storage, and institutional repositories.

 

At its core, the Metadata Crawler implements a two-stage pipeline: harvested metadata are first collected into a temporary catalogue, and then indexed into downstream systems such as Apache Solr or MongoDB. Dataset definitions, directory structures, and extraction logic are governed by a flexible TOML configuration that encodes Data Reference Syntax (DRS) dialects for different standards. Users can make use of pre-defined standards or define their own, making the tool extremely flexible and versatile. This schema-driven approach, combined with path and data specifications, conditional rules, and computed fields, ensures consistent representation of key facets such as temporal coverage, geospatial bounds, and variables, among many other metadata fields.
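
The sketch below conveys the general idea of such schema-driven extraction: a DRS-style template declared in TOML maps directory levels onto metadata facets. The configuration keys and DRS layout are invented for illustration and do not reproduce the tool's actual schema.

```python
# Illustration of DRS-style, schema-driven facet extraction. The TOML keys
# and directory layout are invented; the Metadata Crawler's real schema differs.
import io
import tomllib  # Python 3.11+ standard library

CONFIG = b"""
[drs.cmip6]
root = "/archive/cmip6"
parts = ["mip_era", "activity", "institution", "model", "experiment", "variable"]
"""

def parse_path(path: str, root: str, parts: list[str]) -> dict[str, str]:
    """Map the directory levels below `root` onto named facets."""
    levels = path.removeprefix(root).strip("/").split("/")
    return dict(zip(parts, levels))

drs = tomllib.load(io.BytesIO(CONFIG))["drs"]["cmip6"]
facets = parse_path(
    "/archive/cmip6/CMIP6/CMIP/MPI-M/MPI-ESM1-2-HR/historical/tas/tas_file.nc",
    drs["root"],
    drs["parts"],
)
print(facets)  # {'mip_era': 'CMIP6', 'activity': 'CMIP', ..., 'variable': 'tas'}
```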

The tool provides both a command-line interface (CLI) and a Python API, supporting synchronous and asynchronous execution as well as multi-threaded crawling, facilitating integration into operational workflows. By normalising and indexing previously siloed metadata into searchable catalogues, the Metadata Crawler enhances data findability and empowers portals and analysis platforms to deliver efficient discovery services. Its modular design also allows deployment in diverse environments and easy extension to additional backends or indexing targets.

How to cite: Kadow, C., Bergemann, M., Hadizadeh, M., Reis, M., and Lucio Eceiza, E.: Accelerating Data Discovery: Automated, Scalable Harvesting and Indexing of Metadata Across Heterogeneous Storage Backends, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-9951, https://doi.org/10.5194/egusphere-egu26-9951, 2026.

EGU26-12439 | Posters on site | ESSI3.1

From questions to expertise: the cross-disciplinary user support network in NFDI4Earth 

Ivonne Anders, Klaus Getzlaff, Sören Lorenz, and Hela Mehrtens

Interdisciplinary research in the Earth system sciences increasingly raises complex questions related to research data management (RDM) that go beyond the scope of local, discipline-specific support services. While institutional RDM support remains the first point of contact for many researchers, cross-cutting issues often require coordinated expertise across institutions and domains.

Within NFDI4Earth, a User Support Network (USN) is being established to address these challenges. At its core, the USN currently consists of a team of support experts from ten partner institutions, all part of the NFDI4Earth consortium. User requests are handled via a central ticket system, allowing coordinated responses and transparent workflows. For more specialised or domain-specific questions, the core team draws on an emerging expert network within NFDI4Earth.

Rather than aiming for a fully distributed support structure from the outset, the USN follows an incremental approach. Existing expertise is consolidated in a central entry point, while the expert network is gradually expanded. A key objective is to strengthen links to local RDM services at partner institutions and to establish closer connections with user support and helpdesk initiatives across other NFDI consortia. In this way, the USN aims to evolve into a well-connected component of a broader NFDI-wide support landscape.

In this contribution, we present the current structure, workflows and first experiences of the NFDI4Earth User Support Network. We also invite researchers to make use of the service for their RDM-related questions and encourage RDM support teams and NFDI initiatives to engage with us, share expertise and help shape a connected, sustainable support landscape for Earth system sciences.

How to cite: Anders, I., Getzlaff, K., Lorenz, S., and Mehrtens, H.: From questions to expertise: the cross-disciplinary user support network in NFDI4Earth, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-12439, https://doi.org/10.5194/egusphere-egu26-12439, 2026.

EGU26-12693 | Posters on site | ESSI3.1

DAFNI: Building a VRE for National Infrastructure 

Bethan Perkins, Brian Matthews, Tom Kirkham, and Sarah Byrne

The DAFNI Platform (Data and Analytics Facility for National Infrastructure) is a Virtual Research Environment which stores data and software models, provides an execution environment for those models, and supports data visualisation. Formally launched in July 2021, the DAFNI Platform was built to support national infrastructure research. 

Infrastructure systems supplying water and energy, transport networks, communication networks, and waste management provide the backbone of modern societies and play a key role in the development of nations and communities. The infrastructure research community is diverse, and collaboration between domains is complex, with different development teams using different data and programming standards.  Further, differences in data formats, in spatial and temporal resolution, and in data semantics make it complex to work together and combine models for integrated impact assessments. 

Infrastructure systems cannot, however, be considered in isolation. The interactions between them need to be considered to determine their most effective design and operation. Furthermore, the need for resilience and adaptation to climate change must be examined by all domains. 

To support these heterogeneous research communities working together, the DAFNI Platform was built with flexibility at its core. Containerisation, object stores and high-level metadata vocabularies are some of the key technical aspects of this flexibility, along with a domain-agnostic user interface. When uploading data to the platform, users may upload any file format, which is then stored without transformation in an object store. Software models are containerised and uploaded to the DAFNI Platform as Docker images, where they can then be executed on DAFNI as a workflow. Using containerisation, it is possible to combine models in sequence irrespective of the language they were written in or the OS on which they were created, allowing researchers to continue using their established practices. 

The decisions to create services which are domain-agnostic have also necessitated certain trade-offs, however, which may not apply to more specialised platforms. For example, while a high-level metadata schema can be applied to e.g. rail timetables as well as it can to flood extent data, it does not support accepted standards or ontologies in either rail or flooding research. Object stores, Docker containers, and neutral user interfaces also come with their own challenges.  

Despite these challenges, however, the DAFNI Platform offers a unique capability and has successfully supported many projects using complex model and data interactions. One prominent example of this is the OpenCLIM project which used the DAFNI Platform to develop workflows linking different human and infrastructure systems to environmental data - such as linking urban development with climate-driven rainfall changes and flooding - and continues to showcase this research on DAFNI beyond project lifetime. 

This presentation will showcase the DAFNI Platform’s functionality, explain key design decisions, and illustrate its impact through examples of research enabled by the platform. We will also reflect on lessons learned in building a VRE for a multidisciplinary domain and discuss implications for future infrastructure research environments. 

How to cite: Perkins, B., Matthews, B., Kirkham, T., and Byrne, S.: DAFNI: Building a VRE for National Infrastructure, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-12693, https://doi.org/10.5194/egusphere-egu26-12693, 2026.

EGU26-16308 | Orals | ESSI3.1

GeoSciences IR: a geological data infrastructure for the Italian Network of Regional Geological Surveys 

Luca Guerrieri, Maria Grazia Badas, Pietro Battistoni, Valentina Campo, Carlo Cipolloni, Maria Pia Congi, Chiara D'Ambrogi, Claudia Delfini, Claudio De Luca, Barbara Dessì, Fausto Ferraccioli, Fiorenzo Fumanti, Marco Gerardi, Maurizio Guerra, Gabriele Leoni, Alessandro Maria Michetti, Roberto Passaquieti, Marzia Rizzo, Alessandro Trigila, and Roberta Vigni

In Italy, geological tasks such as the monitoring of geological hazards and the management of georesources are entrusted to the Regional Geological Surveys, each responsible for its own territory. In order to improve and strengthen coordination among the various Regional Geological Services in these activities, the Italian Network of Regional Geological Services (RISG) was established, coordinated by ISPRA – the Geological Survey of Italy.

Within the framework of these activities, the Regional Geological Surveys have highlighted the need for a research infrastructure designed to bridge the gap between academic institutions and operational bodies, in terms of data, services, tools and transfer of knowledge.

Funded by the Italian Ministry of Research through the Recovery Funds programme, GeoSciences IR has been built with this goal by a partnership of 13 universities and 3 research institutions, coordinated by ISPRA. Twelve priority themes in the geological domain were identified, including geological mapping, 3D modelling, marine geology, geoheritage, landslides, sinkholes, design of structural works for risk mitigation, satellite monitoring, active tectonics, georesources and land consumption.

The GeoSciences IR data infrastructure is now open, providing access to more than 300 products, including datasets, services, customized viewers, tools, vocabularies, documents, open APIs and training modules. The training modules are available on an e-learning platform built within the research infrastructure.

All these products have been realized based on the needs and expectations of the Regional Geological Surveys, identified as the target users of the infrastructure. Periodic collection of their feedback during the three-year implementation phase has allowed the release of products well aligned with those needs. This shared pathway between the partnership and the target users will extend across the ten-year operational phase of the infrastructure, with a focus on maintaining the data infrastructure and updating the products.

How to cite: Guerrieri, L., Badas, M. G., Battistoni, P., Campo, V., Cipolloni, C., Congi, M. P., D'Ambrogi, C., Delfini, C., De Luca, C., Dessì, B., Ferraccioli, F., Fumanti, F., Gerardi, M., Guerra, M., Leoni, G., Michetti, A. M., Passaquieti, R., Rizzo, M., Trigila, A., and Vigni, R.: GeoSciences IR: a geological data infrastructure for the Italian Network of Regional Geological Surveys, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-16308, https://doi.org/10.5194/egusphere-egu26-16308, 2026.

EGU26-17347 | Posters on site | ESSI3.1

From Architecture to Operation: time.IO a User-Centric Digital Ecosystem for Time Series Data Management in Earth System Science 

David Schäfer, Nils Brinckmann, Florian Gransee, Tobias Kuhnert, Ralf Kunkel, Christof Lorenz, Peter Lünenschloß, Bert Palm, Thomas Schnicke, and Jan Bumberger

Research Data Infrastructures (RDIs) in Earth System Science must balance FAIR-compliant data management, operational requirements, and the practical needs of researchers operating heterogeneous sensor networks at scale. These design goals are not always fully aligned and may even conflict in operational environments. At the EGU General Assembly 2025, we introduced a modular digital ecosystem for time series data management designed to address these challenges. One year later, we report on the transition from prototype deployment to sustained operational use and reflect on how user feedback and operational constraints shaped the system’s evolution.

The ecosystem has since been deployed as a production infrastructure at the Helmholtz Centre for Environmental Research - UFZ, where it currently supports approximately 20 research projects. The system manages around three billion observations from diverse sensor networks, with temporal resolutions as fine as 5 seconds. This operational setting exposed challenges that were not fully apparent at the design and implementation stages, particularly regarding the scalability of data integration workflows, robustness under continuous load, and the interaction between metadata management, data ingestion, and automated quality control.

The ecosystem comprises three modular components: the Sensor Management System – SMS [1] for standardized metadata registration, the time.IO [2] platform for storage, transfer, and visualization of time series data, and the System for Automated Quality Control – SaQC [3] for automated data analysis and quality assurance. While the modular design enabled reuse and interoperability, early operational phases revealed scaling bottlenecks that led to service outages, necessitating substantial refinements of ingestion pipelines, deployment strategies, and monitoring mechanisms.
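
As an illustration of the automated quality control layer, the sketch below applies a simple range test using SaQC's documented pandas-based Python API; the variable name, thresholds, and data are invented, and API details may differ between SaQC versions:

```python
import numpy as np
import pandas as pd
from saqc import SaQC  # System for automated Quality Control [3]

# Synthetic 5-second soil-temperature series with one implausible spike.
idx = pd.date_range("2025-01-01", periods=100, freq="5s")
data = pd.DataFrame({"soil_temp": np.random.uniform(1.0, 4.0, 100)}, index=idx)
data.iloc[42, 0] = 70.0  # the value a range test should catch

# Flag everything outside a plausible physical range; variable name and
# thresholds are illustrative, not taken from the UFZ deployment.
qc = SaQC(data=data)
qc = qc.flagRange("soil_temp", min=-30.0, max=50.0)
print(qc.flags["soil_temp"].iloc[42])  # flagged as bad
```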

User-centric development also played a central role in stabilizing and extending the infrastructure. Continuous feedback from active projects influenced interface design, automation levels, and operational workflows, highlighting the importance of iterative co-design in bridging the gap between conceptual design goals and sustainable, user-accepted operation. We summarize key lessons learned from one year of operational use and discuss implications for building and operating sustainable, interoperable RDIs that effectively support Earth system science across disciplines and scales.

[1] Lorenz, C., Brinckmann, N., Bumberger, J., Hanisch, M., Kuhnert, T., Loup, U., Moorthy, R., Obersteiner, F., Schäfer, D., Schnicke, T. (2025). Sensor Management System (SMS): Open-source software for FAIR sensor metadata management in Earth system sciences. SoftwareX (submitted), https://arxiv.org/abs/2512.17280

[2] Bumberger, J., Abbrent, M., Brinckmann N., Hemmen, J., Kunkel, R., Lorenz, C., Lünenschloß, P., Palm, B., Schnicke, T., Schulz, C., van der Schaaf, H., and Schäfer, D. (2025). Digital Ecosystem for FAIR Time Series Data Management in Environmental System Science. SoftwareX, 102038, https://doi.org/10.1016/j.softx.2025.102038

[3] Schmidt, L., Schäfer, D., Geller, J., Lünenschloss, P., Palm, B., Rinke, K., Rebmann, C., Rode, M., & Bumberger, J. (2023). System for automated Quality Control (SaQC) to enable traceable and reproducible data streams in environmental science. Environmental Modelling & Software, 105809. https://doi.org/10.1016/j.envsoft.2023.105809

How to cite: Schäfer, D., Brinckmann, N., Gransee, F., Kuhnert, T., Kunkel, R., Lorenz, C., Lünenschloß, P., Palm, B., Schnicke, T., and Bumberger, J.: From Architecture to Operation: time.IO a User-Centric Digital Ecosystem for Time Series Data Management in Earth System Science, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-17347, https://doi.org/10.5194/egusphere-egu26-17347, 2026.

EGU26-17776 | Orals | ESSI3.1

Methodology for the Qualitative and Quantitative Analysis of the Environmental Impact of Copernicus Data Space Ecosystem (CDSE) End-Users  

Caroline Ball, Riana Rasoldier, Caroline Vateau, Charlotte Garrel, and Nicolas Estival

The Copernicus Data Space Ecosystem (CDSE) has transformed Earth Observation data access through cloud-based, multi-access services, replacing traditional download-heavy approaches. By 2024, CDSE provided over 78 PB of data to more than 290,000 users spanning scientific, institutional, and commercial sectors. This rapid expansion, however, raises sustainability questions due to the environmental impact of data transmission, storage, and processing. This study evaluates end-user contributions to the CDSE's environmental impact and explores strategies to reduce emissions across the value chain.

The approach combines qualitative and quantitative analysis across three phases:

Phase 1 – Preparation and Validation

The first phase involves defining the scope of the study, validating user typologies (scientists, industry, institutions, start-ups), confirming methodological standards, and developing tools for surveys and interviews.

Phase 2 – Data collection and user insights: Data will be gathered through three complementary channels:

  • A GDPR-compliant questionnaire targeting diverse typologies.
  • In-depth discussions to capture decision-making processes and sustainability trade-offs.
  • Existing data sources: Bibliographic research, statistical reports, and operational data from CDSE platforms to quantify download volumes, compute operations, and storage patterns.

Phase 3 – Impact calculation and modelling

Impacts will be evaluated for each CDSE end-user type by examining three main areas: data transfer (network traffic), data processing (computing tasks), and data storage (including backup and retention). Both cloud hosting and on-premises systems will be analyzed.

The calculation process commences with the collection of key inputs, which include:

  • Cloud resource consumption: Data from CDSE operations and cloud providers (compute, storage, data transfer).
  • Technical specification of instances used: CPU, GPU, memory, storage type, PUE, etc.
  • User behavior data: Collected via surveys and interviews, considering the amount of data, the types of processing performed, how long data is stored, redundancy measures, and involvement of third parties.

Our methodology is adapted from the Cloud Carbon Footprint approach and established standards for assessing the environmental impact of cloud services and data centers, tailored specifically for Copernicus. This evaluation covers two categories of emissions (a worked numerical sketch follows the formulas below):

  • Embodied emissions, related to manufacturing and maintaining servers and storage devices:

Embodied emissions = manufacturing emissions × (usage time ÷ lifespan) × (reserved resources ÷ available resources)

  • Operational emissions, caused by resource use during data processing:

Operational emissions = time × total component power × efficiency × electricity carbon intensity
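
A worked numerical sketch of these two formulas, with entirely hypothetical inputs (real emission factors would come from the databases listed below, and "efficiency" is read here as a PUE-style overhead factor):

```python
# Worked numerical sketch of the two formulas above with hypothetical inputs;
# real emission factors come from databases such as EcoInvent or Negaoctet.

# Embodied: manufacturing footprint amortised over usage and reserved share.
manufacturing_kgco2e = 1200.0            # hypothetical server footprint
usage_h, lifespan_h = 720.0, 4 * 8760.0  # one month of a four-year lifespan
reserved_share = 8 / 64                  # 8 of 64 vCPUs reserved
embodied = manufacturing_kgco2e * (usage_h / lifespan_h) * reserved_share
# 1200 * 0.0205 * 0.125 ≈ 3.1 kgCO2e

# Operational: energy drawn during processing times grid carbon intensity,
# reading "efficiency" as a PUE-style overhead factor.
power_kw = 0.35              # total component power (CPU, GPU, memory, ...)
efficiency = 1.4             # data-centre overhead (PUE)
grid_kgco2e_per_kwh = 0.23   # electricity carbon intensity
operational = usage_h * power_kw * efficiency * grid_kgco2e_per_kwh
# 720 * 0.35 * 1.4 * 0.23 ≈ 81.1 kgCO2e

print(f"embodied ≈ {embodied:.1f} kgCO2e, operational ≈ {operational:.1f} kgCO2e")
```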

The calculation uses quantitative data, standard guidelines, established methodologies, and databases such as EcoInvent, EEA, Negaoctet, and Resilio.

Complementing this, a multicriteria LCA approach, following European guidelines and standards, offers a comprehensive view. Key indicators considered are electricity consumption, GHG emissions, primary energy consumption (to capture total energy demand), water consumption (linked to cooling and infrastructure) and abiotic resource depletion (impact of raw material extraction for hardware).

To achieve representative results, sampled typologies are extrapolated to all users using factors such as average footprint, CDSE registration data, infrastructure location, and storage and processing scenarios.

The study will provide:

  • Carbon footprint for each end-user type
  • Hotspot identification in user types and infrastructure
  • Emission reduction recommendations

How to cite: Ball, C., Rasoldier, R., Vateau, C., Garrel, C., and Estival, N.: Methodology for the Qualitative and Quantitative Analysis of the Environmental Impact of Copernicus Data Space Ecosystem (CDSE) End-Users , EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-17776, https://doi.org/10.5194/egusphere-egu26-17776, 2026.

EGU26-18443 | Orals | ESSI3.1

Operating Open Science Services in Practice - Lessons from the German Specialized Information Service-Infrastructure.  

Melanie Lorenz, Kirsten Elger, Karl Heyer, and Malte Semmler

Open Science is increasingly dependent on collaborative infrastructures that are both discipline-specific and able to interoperate across institutional and national boundaries. These infrastructures require coordination mechanisms to balance disciplinary specificity with cross-domain interoperability. In Germany, the Working Group of Specialized Information Services (Arbeitsgemeinschaft der Fachinformationsdienste, AG FID) provides such a collaborative framework by linking discipline-oriented Specialized Information Services (Fachinformationsdienste, FID) into a structured network for exchange, coordination, and joint development. In the geosciences, the Specialized Information Service for Geosciences (FID GEO) has supported the research community for almost a decade by providing publication services and consultancy, helping researchers navigate a complex and constantly evolving infrastructure landscape.

FID GEO delivers sustainable publication and data services via established domain repositories. At the same time, FID GEO fosters cultural change through training, community engagement, and active participation in policy and infrastructure development. Collaboration is therefore a cornerstone of FID GEO’s work. It operates in close partnership with geoscientific societies, national infrastructures and initiatives such as the German National Research Data Infrastructure (NFDI). Acknowledging the inherently global nature of the geosciences, FID GEO also aligns its activities with international developments, aiming to synchronize national progress with global standards and best practices for data management and distribution. Acting as an interface between scientists, libraries, repositories and the world of digital data management, FID GEO supports the transformation of the publication culture in the geosciences at national and international levels. These activities, embedded within the AG FID network, clearly benefit from cross-disciplinary exchange, the development of shared standards, and coordinated advocacy. Consequently, their impact is amplified beyond a single community. Specific successes include the increased adoption of FAIR-aligned metadata practices, stronger integration with national infrastructures such as the NFDI, and greater visibility and reusability of geoscientific research outputs.

This contribution provides a critical reflection on the structural challenges shared across the FID system. The ongoing need to adjust to competing funding programs, overlapping infrastructure mandates, and the continuing expectation of “one-stop” platforms/systems means that discipline-specific services must continuously realign their portfolios as responsibilities shift to complementary funding instruments, such as dedicated digitization programs and the NFDI. While this differentiation strengthens the overall research infrastructure ecosystem, it increases the demand for coordination and complicates the long-term maintenance of established services. Rather than striving for monolithic solutions, the FID system demonstrates how distributed services based on the close integration of domain-specific communities attempt to collaborate in finding solutions for interoperable services. These solutions are based on persistent identifiers, shared (metadata) standards, and close stakeholder engagement. This contribution discusses these developments and shares the FID GEO project's experiences with regard to the potentials and challenges of operating open science infrastructures in practice.

How to cite: Lorenz, M., Elger, K., Heyer, K., and Semmler, M.: Operating Open Science Services in Practice - Lessons from the German Specialized Information Service-Infrastructure. , EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-18443, https://doi.org/10.5194/egusphere-egu26-18443, 2026.

EGU26-18847 | Orals | ESSI3.1

Developing a Blueprint for Open Science in the AuScope Research Infrastructure by Enabling Vertical Integration Across Multiple Levels of Processing. 

Lesley Wyborn, Rebecca Farrington, Tim Rawling, Angus Nixon, Bryant Ware, Jo Croucher, Hannes Hollmann, Nigel Rees, Andrew Robinson, Jens Klump, Alex Hunt, and Sara Polanco

Open Science mandates set a high bar for reproducibility and transparency, requiring open knowledge of all input and output sample and data artefacts. They also require identification of every actor and tool used to process and model these along the full Research Workflow, starting from acquisition of the Primary Observation Datasets (PODs) and samples, through initial data calibration, and on to subsequent generations of subsamples and digital outputs. At the same time, compliance with the FAIR and CARE principles, and demands for AI-Ready and Decision-Ready data, are also ubiquitous: meeting all of these across multiple levels of processing adds further complexity to the Open Science paradigm.

AuScope is Australia’s national geoscience Research Infrastructure (RI) funded through the National Collaborative Research Infrastructure Strategy (NCRIS). AuScope facilities enable the collection of multiple data types, ranging from drone, geophysical and satellite data collections that can be TBs in volume, down to small-scale (MB) long tail collections in geochemistry and geochronology. As articulated in the AuScope Research Data Systems Strategy 2025-2030 (https://doi.org/10.5281/zenodo.15825498), AuScope is committed to Open Science and ensuring compliance with FAIR and CARE. 

Using examples from two AuScope Opportunity Fund projects in geophysics and geochemistry, this paper demonstrates how AuScope is developing a Blueprint for an Open Science methodology that also establishes compliance with FAIR, CARE, AI-Ready and Decision-Ready requirements at each level of processing. Where possible, all research input and output artefacts are Findable, Accessible, Interoperable and Reusable by machines, and CARE is being implemented to complement FAIR and to document Indigenous interests and governance. AI-Ready data are planned to build on FAIR and CARE with additional metadata on quality, documentation, access and preparation, whilst Decision-Ready guidelines will be implemented through a chain-of-custody approach that allows tracking of any activity, any actor involved and any transformation undertaken.

The first stage in the Blueprint is that, for each data type, definitions are agreed for the various levels of processing, starting with PODs and moving through to derivative data products, models and visualisations. Where possible, the NASA processing levels are followed; however, more specific definitions have been created for geochemistry, hyperspectral and magnetotelluric data, with additional definitions planned for other data types.

Identifiers such as ORCIDs, RORs, RAiD, IGSNs and DOIs are essential at each level of processing to uniquely identify the contributing researchers, research infrastructures, funders, software developers, software etc., and allow connections across each successive level of processing. Identifiers will 1) enhance ways for credit to be given to researchers and funders at each processing level and 2) ensure Indigenous metadata are recorded for each POD and then carried downstream to any derivative product. 
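
Purely as an illustration of such a chain of custody (this is not AuScope's actual schema), a minimal record per processing level might carry these identifiers as sketched below; all names and values are hypothetical:

```python
from dataclasses import dataclass, field
from typing import List

# Illustrative only, not AuScope's actual schema: a per-level record that
# carries persistent identifiers downstream so credit and Indigenous-interest
# metadata survive each processing step.
@dataclass
class ProcessingLevel:
    level: str                # e.g. "L0 POD", "L1 calibrated", "L2 product"
    dataset_doi: str          # DOI of the data at this level
    sample_igsns: List[str]   # IGSNs of the physical samples involved
    creator_orcids: List[str] # researchers to credit at this level
    facility_ror: str         # ROR of the research infrastructure
    derived_from: List[str] = field(default_factory=list)  # upstream DOIs

l0 = ProcessingLevel("L0 POD", "doi:10.xxxx/raw-survey",
                     ["IGSN:XXEXAMPLE1"], ["0000-0002-1825-0097"],
                     "https://ror.org/example")
l1 = ProcessingLevel("L1 calibrated", "doi:10.xxxx/calibrated",
                     l0.sample_igsns, ["0000-0002-1825-0097"],
                     "https://ror.org/example", derived_from=[l0.dataset_doi])
```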

Preliminary results from these AuScope Opportunity Fund projects show that although implementing transparent Open Science is complex, in geophysics it is already allowing researchers to access data at the processing levels most suitable for their research objectives. Ultimately, it is hoped that the transparency enabled at each processing level can contribute to greater trust in solutions proposed for global environmental and social challenges, as outlined in the UN Sustainable Development Goals.

How to cite: Wyborn, L., Farrington, R., Rawling, T., Nixon, A., Ware, B., Croucher, J., Hollmann, H., Rees, N., Robinson, A., Klump, J., Hunt, A., and Polanco, S.: Developing a Blueprint for Open Science in the AuScope Research Infrastructure by Enabling Vertical Integration Across Multiple Levels of Processing., EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-18847, https://doi.org/10.5194/egusphere-egu26-18847, 2026.

EGU26-19120 | Orals | ESSI3.1

GEO In-Situ Data Strategy: understanding and reducing the barriers to re-use of Earth observation data and knowledge 

Helen Glaves, Joan Maso, Leo Chiloane, Paola de Salvo, Kalamkas Yessimkhanova, Felipe Carlos, and Jean-Philippe Aurambout

In-situ data are part of a complementary suite of Earth observations that are vital for monitoring and understanding our planetary system. In contrast to space-based observations, in-situ data are usually direct, ground-based measurements made in specific and often fixed locations. As such, in-situ measurements are likely to be more precise and are widely considered the “ground truth”.

While satellite-based observing systems provide larger-scale systematic coverage of the Earth’s surface, in-situ data are derived from a diverse range of sources that include observing networks, individual sensors, and even citizen scientists, resulting in a highly heterogeneous data landscape. The diverse nature of in-situ data necessitates a considerable investment of resources in its curation and archiving to ensure its usability and accuracy for specific user applications. It also demands significant data management efforts in terms of standardization, harmonization, and interoperability to effectively consolidate different datasets to fulfill the needs of users.

In an effort to address this highly varied landscape of ground-based observations, the Group on Earth Observations (GEO) is launching its In-Situ Data Strategy. Its key objectives are to better understand the in-situ data landscape, including identifying and addressing the barriers to making in-situ data open and accessible for wider reuse. The strategy also aims to foster coordination and sustainability of existing observing networks across different geographical areas and domains, which includes identifying critical gaps in these observing systems and advocating for the development of new monitoring networks where necessary.

The GEO In-Situ Data Strategy emphasizes the need for collaboration on a global scale alongside the adoption of common approaches, standards, and best practices for data management, which are essential for integration, interoperability, and reuse of in-situ data. Through its In-Situ Data Strategy, GEO aims to foster a coordinated approach to in-situ data management that makes the data open and accessible, with the ultimate goal of delivering “Earth Intelligence for All”.

This work has been supported by the GEO-IDEA project, funded by the European Environment Agency (EEA).

How to cite: Glaves, H., Maso, J., Chiloane, L., de Salvo, P., Yessimkhanova, K., Carlos, F., and Aurambout, J.-P.: GEO In-Situ Data Strategy: understanding and reducing the barriers to re-use of Earth observation data and knowledge, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-19120, https://doi.org/10.5194/egusphere-egu26-19120, 2026.

EGU26-20056 | Orals | ESSI3.1

Incentivising open science through powerful free and open tooling 

Jonas Sølvsteen, Aimee Barciauskas, Alex Mandel, Anthony Boyd, Brianna Corremonte, Emmanuel Mathot, Felix Delattre, Kyle Barron, and Pete Gadomski

Development Seed and our partners’ vision is that FAIR (Findable, Accessible, Interoperable, and Reusable) data, a key ingredient of open science, is not an afterthought but how scientists handle their data in the first place.

Our experience is that readily available, appealing, powerful, and free tools for data discovery and access incentivise data science practitioners to organise their data in interoperable formats and to catalogue it as part of their workflows.

This talk gives an overview of recent advances in open source tools that Development Seed and our partners are supporting and our experience with their application in platforms such as those powered by the ESA-funded EOEPCA+ software, sister projects NASA/ESA MAAP and NASA VEDA, and the data platform source.coop.

After more than a decade of success with cloud-optimised data formats such as Cloud Optimised GeoTIFF (and recently also Zarr-based variants) and with dynamic server-side rendering of data, the latest advances focus on client-side access to STAC catalogues in geoparquet format and on GPU-powered rendering of web-optimised datasets in the browser. This makes the benefits available in environments where centralised services are not available, and reduces infrastructure maintenance costs.
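
As a sketch of what such client-side access can look like in practice, the snippet below filters a hypothetical geoparquet file of STAC items with geopandas; the file path is invented and the column names follow the stac-geoparquet convention, which may differ between catalogues:

```python
import geopandas as gpd

# Client-side STAC discovery without a central search API: read a
# (hypothetical) geoparquet file of STAC items and filter it locally.
# Column names follow the stac-geoparquet convention and may differ.
items = gpd.read_parquet("items.parquet")  # could equally sit in object storage

aoi = items.cx[5.0:10.0, 45.0:50.0]            # bounding-box slice on geometry
recent = aoi[aoi["datetime"] >= "2025-01-01"]  # temporal filter
print(len(recent), "matching items, with no server-side service involved")
```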

How to cite: Sølvsteen, J., Barciauskas, A., Mandel, A., Boyd, A., Corremonte, B., Mathot, E., Delattre, F., Barron, K., and Gadomski, P.: Incentivising open science through powerful free and open tooling, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-20056, https://doi.org/10.5194/egusphere-egu26-20056, 2026.

EGU26-20419 | Orals | ESSI3.1

OneWater4all platform and ecosystem: where VRE meets FAIR international standards 

Sylvain Grellet, Hervé Squividant, Mario Adam, Fanny Arnaud, Isabelle Braud, Hélène Bressan, Stéphane Debard, Jérôme Fozzani, Véronique Chaffard, Charly Coussot, Kim Anh Trinh, Yvan Le Bras, Eric Lecaudé, Kenneth Maussang, Frédéric Moine, Stéphane Ollagnier, Anne Puissant, Joël Sudre, Lucas Valarcher, and Alexia Vourch and the other OneWater Data project members and associated Water4All partners

In France, the national research and innovation programme ‘OneWater - Eau Bien Commun’ (2022-2032) addresses a wide range of key scientific issues to help protect and manage water as a common good. It is made up of several projects that will generate a large amount of highly heterogeneous data, ranging from sensor data to social science data, samples and model-based data. It will also draw on data describing the state of water resources produced by observatories, living labs, research infrastructures and national public monitoring services.

Not all of these data are yet available according to FAIR principles. To process, share and re-use these heterogeneous datasets and ultimately generate new knowledge, the ‘OneWater FAIR Water Platform’ aims to go beyond a simple data catalogue by fostering a FAIR water ecosystem, based on international standards and semantic web interoperability, that produces FAIR-compliant data by design.

The OneWater FAIR platform is fully integrated in the national research data ecosystem on earth and its environment, which relies on the DataTerra Research Infrastructure and its data hubs such as Theia/OZCAR. Collaboration with the services supporting national public policy data and associated monitoring networks is being organized. At the international level, connection with the community is established so that the OneWater initiative can contribute to and benefit from the FAIR Water community. This includes the OGC Hydrology Domain Working Group (OGC HydroDWG), WMO, UN bodies (UNEP, UNESCO IGRAC), DANUBIUS, eLTER RIs, TERENO and the Water4all partnership amongst others.

The OneWater Data project does not only address data and information technology needs: it is also committed to supporting the water community through an ecosystem built on open international standards, their open-source implementations, and resource people. It also builds on a national overview of water data use practices, tools, and the obstacles that both researchers and operational stakeholders face in reaching FAIR. This will make it possible to train and support the community so that the tools traditionally used evolve towards FAIR practices.

This communication will present the approach being implemented in the OneWater Data Platform project and its first results.

1/ The definition of how to reach a high FAIRness level within the water community in the light of existing international standards and best practices (OGC, W3C, INSPIRE, RDA), with the goal of producing FAIR Implementation Profiles (FIPs).

2/ The FAIRness analysis templates produced for various numerical resources, and their application to datasets from the community, progressively enriching the FAIRness of the overall ecosystem.

3/ OneWater's FAIR platform, mixing

- well-known and recent interoperability standards and best practices, and their open-source implementations,

- with the community-proven Virtual Research Environments (VREs) Galaxy and Jupyter Notebook

4/ End-to-end use cases that are already implemented and the methodology to include new ones

5/ The training programme being set up

How to cite: Grellet, S., Squividant, H., Adam, M., Arnaud, F., Braud, I., Bressan, H., Debard, S., Fozzani, J., Chaffard, V., Coussot, C., Trinh, K. A., Le Bras, Y., Lecaudé, E., Maussang, K., Moine, F., Ollagnier, S., Puissant, A., Sudre, J., Valarcher, L., and Vourch, A. and the other OneWater Data project members and associated Water4All partners: OneWater4all platform and ecosystem: where VRE meets FAIR international standards, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-20419, https://doi.org/10.5194/egusphere-egu26-20419, 2026.

EGU26-20634 | ECS | Orals | ESSI3.1

Why traceability matters for operational GHG and tracer datasets: lessons for collaborative platforms 

Dafina Kikaj, Matt Rigby, Joe Pitt, Grant Forster, Kieran Stanley, Ed Chung, Chris Rennick, Dickon Young, Angelina Wenger, Penelope Pickers, Emmal Safi, Karina Adcock, Tom Gardiner, and Simon O’Doherty

Achieving global climate goals requires more than scientific insight: it needs trusted, operational evidence on greenhouse gas (GHG) emissions and how they change as mitigation is put in place. That evidence increasingly comes from combining measurements from many sites and networks, often together with models and inventories. The challenge is not only measuring well, but delivering data that stay consistent over time, are comparable between sites, and are ready for routine use by different communities. Climate action therefore needs an operational level of data: regular releases with clear metadata and uncertainty information.

A key part of this is traceability. Traceability means being able to answer simple questions about every value in a dataset: How was it measured? How was it calibrated? What corrections and quality checks were applied? Which software produced it? What does the uncertainty mean? This becomes especially important over time, because instruments, calibrations, and processing methods evolve, and users need to understand what changed and why.

A practical blueprint will be presented for running traceable GHG and related tracer datasets at scale, based on the day-to-day experience of a large team of measurement scientists, data specialists, and modellers. The blueprint is built around tiered data releases, where products are published at different data levels (raw → quality controlled → derived products), each with uncertainty information appropriate to that level and clear links between levels. A recorded history of processing and version changes is maintained for every release, together with harmonised metadata and uncertainty fields so both people and machines can interpret the data in the same way. Practical operational tools are discussed, such as automated checks, written decision rules, routine reprocessing, and release practices that support stable identifiers and proper credit.
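
Purely illustrative of such tiered bookkeeping (this is not the team's actual schema), a minimal release record might look like the sketch below; all names and values are hypothetical:

```python
from dataclasses import dataclass
from typing import Optional

# Illustrative only, not the team's actual schema: the minimum bookkeeping a
# tiered release needs so that every value can answer "how was it measured,
# calibrated and processed, and what does its uncertainty mean?"
@dataclass(frozen=True)
class Release:
    level: str                   # "raw" -> "quality-controlled" -> "derived"
    version: str                 # stable identifier for citation and credit
    calibration_scale: str       # e.g. a WMO reference scale
    software: str                # exact processing software and version
    uncertainty_meaning: str     # what the uncertainty field means at this level
    derived_from: Optional[str]  # link to the previous level, closing the chain

qc = Release(level="quality-controlled", version="v2.1",
             calibration_scale="WMO CH4 X2004A",       # example scale name
             software="ghg-pipeline 1.8.3",            # hypothetical software
             uncertainty_meaning="1-sigma measurement repeatability",
             derived_from="raw v2.1")
```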

Examples of tracer-based diagnostics, including radon, show how good traceability enables routine, reproducible products that can be used directly in modelling and emissions workflows. The contribution closes with lessons learned on how to keep this working in practice, including coordination, shared standards, and training across teams.

How to cite: Kikaj, D., Rigby, M., Pitt, J., Forster, G., Stanley, K., Chung, E., Rennick, C., Young, D., Wenger, A., Pickers, P., Safi, E., Adcock, K., Gardiner, T., and O’Doherty, S.: Why traceability matters for operational GHG and tracer datasets: lessons for collaborative platforms, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-20634, https://doi.org/10.5194/egusphere-egu26-20634, 2026.

EGU26-21771 | Posters on site | ESSI3.1

Reliable and reproducible Earth System Model data analysis with ESMValTool 

Valeriu Predoi, Bouwe Andela, and Birgit Hassler

ESMValTool is a software tool for analyzing data produced by Earth System Models (ESMs) in a reliable and reproducible way. It provides a large and diverse collection of “recipes” that reproduce standard as well as state-of-the-art analyses. ESMValTool can be used for tasks ranging from monitoring continuously running ESM simulations to analysis for scientific publications such as the IPCC reports, including reproducing results from previously published scientific articles and allowing scientists to produce new analysis results.

To make ESMValTool a user-friendly community tool suitable for doing open science, it adheres to the FAIR principles for research software. It is:

- Findable: it is published in community registries, such as https://research-software-directory.org/software/esmvaltool;

- Accessible: it can be installed from Python package community distribution channels such as conda-forge, and the open-source code is available on Zenodo with a DOI, and on GitHub;

- Interoperable: it is based on standards. It works with data that follow the CF Conventions and the Coupled Model Intercomparison Project (CMIP) Data Request, its reusable recipes are written in YAML, and provenance is recorded in the W3C PROV format. It supports diagnostics written in a number of programming languages, with Python and R being best supported, and its source code follows the standards and best practices of the respective languages;

- Reusable: it provides a well-documented recipe format and a Python API that allow reusing previous analyses and building new analyses from previously developed components. The software can be installed from conda-forge and DockerHub, and can be tailored by installing from source from GitHub.

In terms of input data, ESMValTool integrates well with the Earth System Grid Federation (ESGF) infrastructure: it can find, download and access data from across the federation, and has access to large pools of observational datasets.

ESMValTool is built around two key scientific software qualities: scalability and user friendliness, an important aspect of which is reliability. ESMValTool is built on top of the Dask library to allow scalable and distributed computing, and it also uses parallelism at a higher level in the stack, so that jobs can be distributed on any standard High Performance Computing (HPC) facility. Our main strategy to ensure reliability and reproducibility is modular, integrated, and tested design, which recurs at various levels of the tool. We separate commonly used functionality from “one-off” code and make sure that commonly used functionality is covered by unit and integration tests, while relying on regression testing for everything else. We also run comprehensive end-to-end tests of all our “recipes” before we release new versions. Our testing infrastructure ranges from basic unit tests to tools that smartly handle various file formats and use image comparison algorithms to compare figures. This greatly reduces the need for ‘human testing’, providing built-in robustness through modularity and a testing strategy tailored to the technical skills of the tool's contributors.
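
As a sketch of the recipe format, the snippet below writes a minimal YAML recipe and runs it with the esmvaltool command line interface; the dataset, preprocessor and author entries are illustrative, and a real recipe needs author/maintainer tags that exist in ESMValTool's reference lists:

```python
import pathlib
import subprocess

# Minimal recipe sketch (ESMValTool recipes are YAML); the dataset,
# preprocessor and author entries here are illustrative only.
recipe = """\
documentation:
  title: Global mean near-surface temperature
  description: Minimal example recipe.
  authors: [andela_bouwe]

datasets:
  - {dataset: BCC-ESM1, project: CMIP6, exp: historical,
     ensemble: r1i1p1f1, grid: gn, start_year: 1990, end_year: 2000}

preprocessors:
  global_mean:
    area_statistics:
      operator: mean

diagnostics:
  tas_timeseries:
    variables:
      tas:
        mip: Amon
        preprocessor: global_mean
    scripts: null
"""
pathlib.Path("recipe_example.yml").write_text(recipe)
subprocess.run(["esmvaltool", "run", "recipe_example.yml"], check=True)
```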

How to cite: Predoi, V., Andela, B., and Hassler, B.: Reliable and reproducible Earth System Model data analysis with ESMValTool, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-21771, https://doi.org/10.5194/egusphere-egu26-21771, 2026.

EGU26-22048 | Posters on site | ESSI3.1

Virtual Research Environment initiatives as part of ODATIS, the French Ocean data cluster 

Cyril Germineaud, Gwenael Caer, and Jean-François Piollé

As part of the French Ocean data cluster ODATIS (from the Data Terra Research Infrastructure), we will showcase the Virtual Research Environment (VRE) tools and services offered by CNES and Ifremer. In particular, we will present the CNES JupyterHub platform for hosting projects (high computing power with CPU and GPU capacities, very fast and optimized remote access to data products, etc.), together with altimetry-specific Pangeo-based libraries, powerful tools, dedicated tutorials illustrating simple use cases (intercomparison of different satellite data, cyclone monitoring, coastal water quality applications, etc.) and technical support (helpdesk) for smooth sailing on the platform. In addition, the synergy between satellite and in-situ data will also be illustrated for several applications, such as surface currents and comparisons between (BGC-)Argo profiling float observations and satellite matchups.

How to cite: Germineaud, C., Caer, G., and Piollé, J.-F.: Virtual Research Environment initiatives as part of ODATIS, the French Ocean data cluster, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-22048, https://doi.org/10.5194/egusphere-egu26-22048, 2026.

EGU26-22238 | Posters on site | ESSI3.1

SciX: Scaling Research Discovery and Collaboration Across Earth and Space Science Infrastructures 

Suze Kundu, Anna Kelbert, Jennifer Lynn Bartlett, and Alberto Accomazzi

Earth and space scientists increasingly work across disciplinary, institutional, and national boundaries, drawing on diverse data sources, tools, and communities. While research data infrastructures (RDIs) have made significant progress in enabling access to data, researchers still face fragmentation across platforms, uneven user experiences, and barriers to interdisciplinary discovery and collaboration.

SciX is NASA’s evolving research discovery and collaboration platform, building on the long-established success of the Astrophysics Data System (ADS) to serve the full breadth of NASA’s Science Mission Directorate. In this contribution, we describe SciX as a user-centred research infrastructure that connects people, research outputs, funding opportunities, and Open Science resources across Earth, planetary, heliophysics, and astrophysics communities. Rather than replacing domain-specific data centres, SciX aims to complement existing infrastructures by improving discoverability, interoperability at the metadata and knowledge level, and cross-disciplinary navigation of the research landscape.

We reflect on the opportunities and challenges of scaling an infrastructure with a strong disciplinary identity (ADS) into a broader, transdisciplinary platform. This includes balancing community-specific needs with shared services, supporting FAIR and Open Science practices in ways that are meaningful to researchers, and fostering cultural change through community engagement rather than top-down mandates. Drawing on early use cases and community feedback, we discuss how SciX is addressing user needs, sustainability, and governance while enabling new forms of interdisciplinary connection.

We conclude by outlining lessons learned for the design of sustainable RDIs and invite dialogue with the Earth System Science community on how infrastructures like SciX can better support collaborative, open, and societally relevant research across domains.

How to cite: Kundu, S., Kelbert, A., Bartlett, J. L., and Accomazzi, A.: SciX: Scaling Research Discovery and Collaboration Across Earth and Space Science Infrastructures, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-22238, https://doi.org/10.5194/egusphere-egu26-22238, 2026.

EGU26-22886 * | Orals | ESSI3.1 | Highlight

GeoFAIR - All are Welcome! 

Shelley Stall, Danielle Kinkade, and Natalie Raia

Innovation within the scientific enterprise is maximized when researchers are supported with tools, guidance, and infrastructure, including data that are as open and FAIR as possible. In recognition of this fact, funders, institutions, and scholarly publishers are imposing increasing expectations for sharing research data and software. Researchers responding to these requirements face conflicting workflows, timing, and a myriad of data archiving choices; they are unknowingly caught in a “FAIR data crisis”. Additionally, researchers do not yet trust that these same archive choices could be their first stop in finding new and interesting datasets. To unleash the transformative research of the future, vetted discipline-specific science support frameworks are needed for both archive deposition and the discovery of new datasets.

This presentation will introduce a newly funded project aimed at building a sustainable community resource for three disciplines, driving each community toward a shared vision of common research data resources, methods and tools that are grounded in Open Science and FAIR data principles. Each discipline-specific framework will coalesce existing resources and efforts and, through adoption, deliver: 1) consolidated, vetted community resources identified in partnership with the respective members; 2) interoperable data that are machine-actionable, supporting discovery, trust, and reuse; 3) discipline-specific leadership and sustainable governance that intentionally fosters the development of data management skills, Open Science, and FAIR data. Thus, this work will realize the value of Open Science practices by putting community-vetted resources at the heart of where researchers share their research and connect with their colleagues: society communities and meetings.

How to cite: Stall, S., Kinkade, D., and Raia, N.: GeoFAIR - All are Welcome!, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-22886, https://doi.org/10.5194/egusphere-egu26-22886, 2026.

EGU26-1564 | Posters on site | ESSI3.2

Managing your drone data through the data life cycle: RDA guidelines for FAIR and responsible UAV Use 

Alice Fremand, Jens Klump, Sarah Manthorpe, Mari Whitelaw, France Gerard, Wendy Garland, Charles George, and Thabo Semong

The use of Remotely Piloted Aerial Systems (RPAS), also referred to as Uncrewed Aerial Vehicles (UAVs) and more generally as drones, is increasingly prevalent across various scientific disciplines, enabling the collection of large volumes of data for diverse research applications. These technologies are revolutionising data collection by offering higher temporal and spatial resolutions and by reaching hazardous and inaccessible areas. However, the volume of data generated and the absence of standardised workflows to document operations and data processing often complicate data sharing and publication.

As part of the Research Data Alliance (RDA) Small Uncrewed Aircraft and Autonomous Platforms Data Working Group, we have developed guidelines on how best to improve the Findability, Accessibility, Interoperability and Reusability (FAIR, Wilkinson et al. 2016) of these data and processing workflows. The working group compiled use cases showcasing RPAS applications across various research disciplines, documenting best practices and identifying the gaps and challenges researchers face while handling their RPAS-derived data. We paid specific attention to legal, privacy and ethical considerations. Drawing on these insights, the group has now developed guidelines and recommendations to improve RPAS data management throughout the research life cycle, from mission planning to data publication and archiving, linking to existing resources and examples from the scientific community.

How to cite: Fremand, A., Klump, J., Manthorpe, S., Whitelaw, M., Gerard, F., Garland, W., George, C., and Semong, T.: Managing your drone data through the data life cycle: RDA guidelines for FAIR and responsible UAV Use, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-1564, https://doi.org/10.5194/egusphere-egu26-1564, 2026.

EGU26-3136 | Posters on site | ESSI3.2

Data Access Made Easy: flexible, on the fly data standardization and processing 

Mathias Bavay, Patrick Leibersperger, and Øystein Godøy

Automatic Weather Stations (AWS) deployed in the context of research projects provide very valuable point data thanks to the flexibility they offer in terms of measured meteorological parameters and setup. However, this flexibility is a challenge for metadata and data management. Traditional approaches based on networks of standard stations struggle to accommodate these needs, leading to wasted data periods caused by difficult data reuse, low reactivity in identifying potential measurement problems, and a lack of metadata documenting what happened.

The Data Access Made Easy (DAME) effort is our answer to these challenges. At its core, it relies on the mature and flexible open-source MeteoIO meteorological pre-processing library. Originally developed for the needs of numerical models consuming meteorological data, MeteoIO has expanded into a data standardization engine for the Global Cryosphere Watch (GCW) of the World Meteorological Organization (WMO). For each AWS, a single configuration file describes how to read and parse the data, defines a mapping between the available fields and a set of standardized names, and provides the relevant Attribute Conventions Dataset Discovery (ACDD) metadata fields. Low-level data editing is also available, such as excluding a given sensor, swapping sensors, or merging data from another AWS, for any given time period. Moreover, an arbitrary number of filters can be applied to each meteorological parameter, restricted to specific time periods if required. This makes it possible to describe the whole history of an AWS within a single configuration file and to deliver a single, consistent, standardized output file, possibly spanning many years, many input data files, and many changes in both format and available sensors.
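
The abridged sketch below illustrates what such a per-station configuration can look like, following MeteoIO's documented INI style; the station ID, paths, period and thresholds are invented, and the exact key names should be checked against the MeteoIO documentation:

```python
# Abridged sketch of a per-station DAME configuration in MeteoIO's INI style.
# Station ID, paths, period and thresholds are invented; exact key names
# should be checked against the MeteoIO documentation.
station_ini = """\
[Input]
METEO      = CSV              ; how to read and parse the raw AWS files
METEOPATH  = ./raw/aws_042

[InputEditing]
AWS042::edit1        = EXCLUDE          ; drop a failed sensor...
AWS042::arg1::params = TA
AWS042::arg1::when   = 2023-06-01 - 2023-07-15  ; ...for this period only

[Filters]
TA::filter1   = min_max       ; plausible-range check on air temperature
TA::arg1::min = 230
TA::arg1::max = 330
"""
with open("aws_042.ini", "w", encoding="utf-8") as f:
    f.write(station_ini)
```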

Through the EU project Arctic Passion, a web interface has been developed that allows data owners to manage the configuration files for their stations, refresh their data at regular intervals, inspect the data QA log files, receive notification emails, and trigger on-demand data generation. The same interface allows other users to request data on demand for any time period.

How to cite: Bavay, M., Leibersperger, P., and Godøy, Ø.: Data Access Made Easy: flexible, on the fly data standardization and processing, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-3136, https://doi.org/10.5194/egusphere-egu26-3136, 2026.

In recent years, significant progress has been made in digitizing natural history collections using increasingly industrialized workflows involving conveyor belts, digital camera setups, robotics and Artificial Intelligence (AI). New technologies have also become available to analyse the specimens: the analysis of both biodiversity and geodiversity samples has shifted from destructive techniques to non-destructive, high-resolution, and automated ones, accelerating the creation of new information. However, the resulting data is often fragmented across systems and repositories. Efforts to reconnect these data to the original specimen or derived samples frequently fail because identifiers were missing at the time of analysis, are not globally unique, change over time, or are referenced incorrectly. These issues can be solved by maintaining a digital object on the internet that is created at the time of collecting the sample and that contains contextual information and (links to) its derived data as these become available. This is called a Digital Specimen, and different entities (human or machine) who create an analysis can add information to the digital object. A one-to-one relationship between the digital object and the physical sample preserved as a specimen can be kept by giving the physical object a persistent identifier such as an IGSN (International Generic Sample Number). The digital object also gets a persistent identifier: a Digital Specimen identifier in the form of a FAIR Digital Object compliant DOI (Digital Object Identifier).

The Digital Specimen is a citable, machine-actionable proxy for physical specimens that is FAIR by design (FAIR Digital Object compliant) and has a Persistent Identifier (PID) in the form of a DOI to create a self-contained unit of knowledge. This design enables seamless linkage to derived data—such as chemical analysis, digital media, and publications. To implement this, DiSSCo (Distributed System of Scientific Collections) developed the open Digital Specimen (openDS) specification. By integrating community standards like Darwin Core with W3C PROV-O and JSON-LD, openDS provides a common semantic language for global interoperability.
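
The sketch below conveys the general shape of such an object (it is not the actual openDS schema): a JSON-LD document whose PID resolves on the internet, mixing Darwin Core descriptive terms with PROV-O provenance links, and growing as annotations arrive; all identifiers and values are hypothetical:

```python
import json

# The general shape, not the actual openDS schema: a JSON-LD object whose
# PID resolves on the internet, mixing Darwin Core ("dwc:") terms with
# PROV-O ("prov:") links, and growing as annotations arrive.
digital_specimen = {
    "@context": {
        "dwc": "http://rs.tdwg.org/dwc/terms/",
        "prov": "http://www.w3.org/ns/prov#",
    },
    "@id": "https://doi.org/10.xxxx/example-digital-specimen",  # DS PID (DOI)
    "@type": "DigitalSpecimen",
    "physicalSpecimenId": "IGSN:XXEXAMPLE1",  # PID of the physical object
    "dwc:scientificName": "Quercus robur L.",
    "prov:wasDerivedFrom": [],  # filled as analyses and media are linked
    "annotations": [
        {   # e.g. added later by a Machine Annotation Service
            "prov:wasAttributedTo": "MAS:trait-extraction",
            "body": {"leafLengthMm": 92},
        }
    ],
}
print(json.dumps(digital_specimen, indent=2))
```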

DiSSCo is currently in transition from its project phase into an operational European Research Infrastructure. It has already created the first millions of FDO-compliant Digital Specimens and has developed infrastructure allowing these digital objects to be annotated with new data or improvements, by either humans or machines. AI-fuelled Machine Annotation Services (MAS) developed by third parties can operate in the infrastructure for analysis of the data or knowledge extraction from specimen images.

In the presentation we will show how the FDO design supports advanced capabilities such as multiple redirects to different digital representations for either humans or machines, versioning and provenance to allow mutable objects, tooltips in journal systems that show contextual information about a referenced sample in a publication through the PID record, and machine-actionable metadata that enables machines to act on the data.

How to cite: Addink, W. and Islam, S.: DiSSCo's Vision Applied: (Re-)connecting Fragmented Specimen Data through FAIR Digital Objects, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-3293, https://doi.org/10.5194/egusphere-egu26-3293, 2026.

EGU26-4891 | ECS | Posters on site | ESSI3.2

A FAIR Protocol for Hybrid Models and Data in Hydrology 

Akash Koppa, Son Pham-Ba, Felix Bauer, Olivier Bonte, Oscar Baez-Villanueva, Reda El Ghawi, Alexander Winkler, Diego G. Miralles, Fabrizio Fenicia, Charlotte Gisèle Weil, and Sara Bonetti

Hybrid modeling, which integrates physics-based and machine learning (ML) components, is a growing research area in hydrology and the broader Earth Science community. By combining the interpretability of process-based models with the predictive power of data-driven algorithms, these hybrid architectures offer improved accuracy and representation of complex environmental processes. However, their adoption is currently constrained by significant challenges regarding the FAIR principles (Findable, Accessible, Interoperable, Reusable). Unlike traditional physics-based models, the reusability of hybrid systems is frequently hindered by the dynamic nature of ML components, which are inextricably linked to specific training datasets and hyperparameter configurations. Furthermore, existing data and model repositories are rarely designed to host such models.

To address these systemic barriers, we collaboratively designed and implemented a standardized FAIR protocol specifically tailored to hydrological hybrid models. This framework, termed FRAME, consists of three critical components: (a) a set of interoperability coding standards for the physics and ML modules, (b) a unified metadata specification that captures the disparate requirements of both physics-based parameters and ML architectures, and (c) a specialized online repository designed for the persistent hosting and sharing of integrated hybrid assets. To facilitate user adoption, we developed an associated command line interface (CLI) for automated retrieval and setup of these models. To ensure the long-term impact and scalability of this protocol, we are actively soliciting participation from the global hydrologic modeling community. By establishing a community-driven standard, this protocol aims to provide a robust foundation for the transparent, reproducible, and collaborative advancement of hybrid modeling in hydrology.
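
Purely as an illustration of component (b), a unified metadata record for a hybrid model might capture both sides as sketched below; all field names, values, and the CLI invocation are hypothetical:

```python
# Illustrative only: the kind of unified record the metadata specification
# calls for, tying a learned component to its exact training data and
# hyperparameters alongside the physics parameters. All field names, values
# and the CLI call below are hypothetical.
hybrid_model_metadata = {
    "model_id": "hybrid-bucket-lstm-0.3",
    "physics": {
        "scheme": "conceptual bucket storage",
        "parameters": {"max_storage_mm": 250.0},
    },
    "ml": {
        "architecture": "LSTM",
        "hyperparameters": {"hidden_size": 64, "seed": 42},
        "training_data": "doi:10.xxxx/forcing-dataset-v1",
        "weights": "weights/epoch_120.pt",
    },
    "interfaces": {"inputs": ["precip", "pet"], "outputs": ["discharge"]},
}
# Hypothetical retrieval with the companion CLI:
#   $ frame get hybrid-bucket-lstm-0.3
```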

How to cite: Koppa, A., Pham-Ba, S., Bauer, F., Bonte, O., Baez-Villanueva, O., El Ghawi, R., Winkler, A., G. Miralles, D., Fenicia, F., Gisèle Weil, C., and Bonetti, S.: A FAIR Protocol for Hybrid Models and Data in Hydrology, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-4891, https://doi.org/10.5194/egusphere-egu26-4891, 2026.

EGU26-5023 | Orals | ESSI3.2

Using “Data Agreements” in universities to clarify research data rights of use 

María Piquer-Rodríguez, Esther Asef, Sophia Reitzug, and Andreas Hübner

The Earth, Space, and Environmental Sciences are research disciplines in which a large amount of research data is generated and in which the principles of FAIR and open data are now receiving considerable attention.

Both FAIR and open data aim to enable and enhance the reusability of data, but before research data can be made available for broad reuse, it is essential to clarify rights and permissions: who is authorized to share the data and with whom, who may publish it, how credit for data-related work will be attributed, and what arrangements apply if a researcher transfers to another institution.

Concrete regulation of usage rights for research data continues to pose major challenges for researchers and research institutions alike. There are legal uncertainties due to room for interpretation in the general legal requirements, and in many cases, there are no systematised workflows for defining usage rights. To close this gap, a working group at the Department of Earth Sciences at Freie Universität Berlin has developed and implemented a ‘Data Agreement’ that provides clarity on the exercise of usage rights to research data within the group (for students and researchers) and also helps to operationalise FAIR and CARE principles in everyday research practice.

The ‘Data Agreements’ are used as an opportunity to discuss expectations regarding data management and to define and agree on binding rights of use for research data with each new member of the group and for students’ thesis projects. We present the key aspects of the ‘Data Agreements’ and report on practical experiences with their use. We show how they not only facilitate clear agreements but also prevent subsequent disagreements. In addition to legal aspects, practical matters such as backup strategies or storage locations can also be specified within this process, thus improving data management practice within the group.

The ‘Data Agreements’ [1] were developed in the working group together with the Research Data Management team and the university's legal office, and are available under CC0 for reuse by other research groups or institutions. While the agreements were developed within a university context and relate to German academic practice and law, they may be reused or serve as templates for other research institutions, in other national or international contexts, across a wide variety of Earth, Space, and Environmental Sciences disciplines and beyond.

[1] http://dx.doi.org/10.17169/refubium-46356

How to cite: Piquer-Rodríguez, M., Asef, E., Reitzug, S., and Hübner, A.: Using “Data Agreements” in universities to clarify research data rights of use, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-5023, https://doi.org/10.5194/egusphere-egu26-5023, 2026.

AQUARIUS is an ongoing Horizon Europe funded project. An impressive range of 57 research infrastructure services is made available through Transnational Access (TA) Calls, including research vessels, mobile marine observation platforms, fixed marine facilities, experimental research facilities, river & basin supersites, aircraft, drones, satellite services, and sophisticated data infrastructures.

As a result of the TA projects, many new data sets in a large variety of data types are being collected by TA teams, using and combining multiple different observation installations. A major aim of AQUARIUS is supporting the EU Mission to Restore our Ocean and waters by 2030, and other marine initiatives, including contributing to the European Digital Twin of the Ocean and the UN Decade for Ocean Sciences.

There is a strong effort in AQUARIUS to get the maximum return on investment from the TA activities. An open data policy has been adopted, implemented with a dedicated data management approach, to ensure that all gathered metadata and data are managed in line with the FAIR principles. They should become part of the repositories managed and operated by leading European data management infrastructures, such as SeaDataNet, EurOBIS, ELIXIR-ENA, ICOS-Ocean, and Copernicus INSTAC, for quality assurance, long-term stewardship, and wide access and use. These infrastructures in turn feed into EMODnet, Copernicus Marine, Blue-Cloud (EOSC) and Digital Twin of the Ocean (DTO) developments, and globally into e.g. GEOSS and the UN-IOC Ocean Decade programme.

To achieve maximum results, the TA scientific teams are being supported by data centres experienced in marine data management and well connected to the European data management infrastructures; most of them are National Oceanographic Data Centres (NODCs). They provide training and coach the TA teams through the AQUARIUS data management flow scheme, which includes steps from planning through training and deployment to publishing, supported by a number of instruments. One of those is the AQUARIUS TA Data Summary Log App, which is used by the PIs of TA projects to keep an overview and index of the data collection events. It produces a list that lets the data centres know what data to expect, from where and from whom, and that serves as a checklist for the next steps. The AQUARIUS TA Data Summary Log contains only metadata, no data. As a follow-up, the TA teams and assigned data centres will work on elaborating the collected data to prevailing standards and on their inclusion in the European repositories. That progress is made visible through the AQUARIUS Dataflow Dashboard (ADD), integrated in the AQUARIUS website, which follows each awarded TA project from the planning stage through to the publishing of results. The ultimate goal is to give discovery of and public access to the research data sets collected and processed and the data products generated by the TA research teams as part of the AQUARIUS TA projects.

The presentation will provide more background information on the AQUARIUS project and will highlight more details about the data management approach.

How to cite: Ni Chonghaile, B., Schaap, D., and Fitzgerald, A.: AQUARIUS, Integrating Research Infrastructures, Connecting Scientists, and Enabling Transnational Access for Healthy and Sustainable Marine and Freshwater Ecosystems, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-5326, https://doi.org/10.5194/egusphere-egu26-5326, 2026.

SeaDataNet is a major pan-European infrastructure for managing and providing access to marine data sets acquired by European organisations from research cruises and other observational activities in European coastal marine waters, regional seas and the global ocean. Founding partners are National Oceanographic Data Centres (NODCs) and major marine research institutes. The SeaDataNet network of data centres and infrastructure has gradually expanded during a series of dedicated EU RTD projects, and by engaging as core data management infrastructure and network in leading European Commission initiatives such as the European Marine Observation and Data network (EMODnet), Copernicus Marine Service (CMS), and the European Open Science Cloud (EOSC).

SeaDataNet develops, governs and promotes common standards, vocabularies, software tools, and services for marine data management, which are widely adopted. A core service is the CDI data discovery and access service, which provides unified online discovery of and access to vast resources of data sets managed by 115+ connected SeaDataNet data centres from 34 countries around the European seas, from both research and monitoring organisations. Currently, it gives access to more than 3 million data sets, originating from 1000+ organisations in Europe, covering physical, geological, chemical, biological and geophysical data acquired in European waters and the global oceans. Standard metadata and data formats are used, supported by an ever-increasing set of controlled vocabularies, resulting in rich and highly FAIR metadata and data sets. SeaDataNet provides core services in EMODnet Chemistry, Bathymetry, and Physics for bringing together and harmonizing large amounts of marine data sets, which are used by EMODnet groups for generating thematic data products.

EMODnet Bathymetry has been active since 2008 and maintains a Digital Terrain Model (DTM) for the European seas. This is published every 2 years, each time extending coverage and improving quality and precision. The DTMs are produced from surveys and aggregated data sets that are referenced with metadata via the SeaDataNet Catalogue services. Bathymetric survey data sets are gathered and populated by national hydrographic services, marine research institutes, and companies in the SeaDataNet CDI Data Discovery & Access service. Currently, this amounts to more than 45,000 datasets from 78 data providers. A major selection of these datasets was used for preparing the 2024 release of the EMODnet DTM for all European waters and the Caribbean, which has been published on the EMODnet portal. Work is currently ongoing on a new 2026 version.

The EMODnet DTM has a grid resolution of 1/16 × 1/16 arc minutes (circa 115 × 115 m), covering all European seas. It is based upon more than 22,000 in situ datasets. It can be downloaded in tiles and viewed as map layers in the EMODnet portal. The maps are derived from EMODnet Bathymetry OGC WMS, WMTS, and WFS services. The EMODnet Bathymetry products are very popular: in 2024–2025 more than 100,000 EMODnet DTM files were downloaded, and more than 60 million OGC service requests were registered over the two years. EMODnet Bathymetry is also managing the European contribution to the international Seabed 2030 project.

How to cite: Schaap, D. M. A., Scory, S., Piel, S., and Schmitt, T.: SeaDataNet, pan-European infrastructure for marine and ocean data management and major pillar under EMODnet Bathymetry for generating the best Digital Bathymetry for European Seas, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-5581, https://doi.org/10.5194/egusphere-egu26-5581, 2026.

EGU26-5601 | ECS | Orals | ESSI3.2

Providing analysis-ready campaign data via the InterPlanetary File System 

Lukas Kluft and Tobias Kölling

During field campaigns, timely data sharing across distributed teams is essential, yet access to central repositories is often constrained by limited bandwidth. As a result, preliminary datasets are frequently exchanged offline, which commonly leads to confusion about dataset versions once post-campaign releases occur.

We present a proof-of-concept for campaign data dissemination based on content-addressable storage. During the ORCESTRA campaign, observations were converted into analysis-ready Zarr stores and published via the InterPlanetary File System (IPFS). By accessing data through immutable content identifiers (CIDs), teams can use datasets offline in the field while ensuring that the exact same, verifiable data objects remain accessible after the campaign.
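
A minimal sketch of the access pattern this enables, assuming the store is reachable through a public IPFS HTTP gateway and that fsspec/aiohttp are installed for URL-based access (the CID below is a hypothetical placeholder):

```python
# Open an analysis-ready Zarr store by its immutable content identifier (CID).
# Because IPFS is content-addressed, the same CID always resolves to
# byte-for-byte identical data, during and after the campaign.
import xarray as xr

cid = "bafybeihypotheticalexamplecid"      # hypothetical CID of a Zarr store
gateway = "https://ipfs.io/ipfs"           # any public or local IPFS gateway

ds = xr.open_zarr(f"{gateway}/{cid}")
print(ds)
```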

To improve discoverability and usability, we developed the ORCESTRA Data Browser, which dynamically generates dataset landing pages by fetching metadata client-side directly from IPFS. Together, these components demonstrate how decentralized, content-addressed data access can support version clarity, reproducibility, and robust data sharing for field campaigns and beyond.

How to cite: Kluft, L. and Kölling, T.: Providing analysis-ready campaign data via the InterPlanetary File System, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-5601, https://doi.org/10.5194/egusphere-egu26-5601, 2026.

EGU26-5819 | Posters on site | ESSI3.2

Making FAIRness Visible: Practical FAIR Assessment for Earth System Science Data 

Heinrich Widmann, Andrea Lammert, Eileen Hertwig, Beate Krüss, Karsten Peters-von Gehlen, and Hannes Thiemann

The FAIR-by-design approach pursued by most repositories and data services today requires significant and sustained effort in the curation and quality assurance of both data and metadata. Beyond providing research data that complies with the FAIR principles, it is essential that the level of FAIRness is transparently apparent to users from the metadata prior to data access and download. FAIRness indicators benefit both data providers and reusers by rewarding high-quality curation and supporting informed data selection in complex, data-intensive Earth System Science (ESS) workflows.

In practice, making FAIRness levels visible requires repository data managers to perform FAIR evaluation, either through manual assessment or by using established FAIR assessment tools. At the World Data Center for Climate (WDCC), the fully automated F-UJI tool is applied in operational practice to assess and expose FAIRness levels across large collections of climate data.

F-UJI is a web-based service that programmatically assesses the FAIRness of research data objects at the dataset level, based on the FAIRsFAIR Data Object Assessment Metrics. Its automated and machine-aided analytics are well suited to the large number of datasets archived in WDCC and reflect established repository practices such as the assignment of DataCite DOIs and the provision of rich, standardised metadata. At the same time, automated assessment relies on clearly machine-assessable criteria and thus cannot fully capture FAIR aspects that require human interpretation, such as reuse relevance or domain-specific semantics. In addition, FAIRness results depend on the machine-detectability of persistent identifiers resolving directly to datasets, which are not always available at higher levels of data collection hierarchies.
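
To illustrate what this kind of programmatic assessment looks like, the following sketch queries a locally running F-UJI instance for a single identifier; host, port, credentials, and the example DOI are assumptions to be adapted to one's own deployment:

```python
# Ask a local F-UJI service to evaluate the FAIRness of one dataset identifier.
import requests

endpoint = "http://localhost:1071/fuji/api/v1/evaluate"  # assumed local deployment
payload = {
    "object_identifier": "https://doi.org/10.0000/example",  # hypothetical DOI
    "test_debug": True,
    "use_datacite": True,
}
# Demo credentials from the F-UJI sample config; replace them in production.
resp = requests.post(endpoint, json=payload, auth=("marvel", "wonderwoman"))
resp.raise_for_status()
report = resp.json()
print(report["summary"])  # aggregated scores across the FAIR dimensions
```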

Based on our operational experience, we compare F-UJI results with other FAIR assessment approaches, building on findings from a previous comparative study evaluating FAIR assessment methods for WDCC datasets (Peters-von Gehlen et al., 2022). This comparison shows that automated, manual, and hybrid FAIR evaluation approaches each have distinct strengths: automated methods focus on standardised, machine-actionable criteria, while manual assessments capture contextual aspects relevant for data reuse; hybrid approaches combine these advantages and mitigate the limitations of purely automated or manual methods.

This poster shares practical experiences from conducting operational FAIRness assessment at a climate data repository and discusses benefits, limitations, and best practices of automated and hybrid FAIR evaluation approaches in Earth System Science.

How to cite: Widmann, H., Lammert, A., Hertwig, E., Krüss, B., Peters-von Gehlen, K., and Thiemann, H.: Making FAIRness Visible: Practical FAIR Assessment for Earth System Science Data, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-5819, https://doi.org/10.5194/egusphere-egu26-5819, 2026.

EGU26-6870 | ECS | Orals | ESSI3.2

FAIRness and Openness Commitments as a catalyst for cultural change in research organisations 

Daniel Nüst, Anne Sennhenn, Jörg Seegert, Andreas Hübner, Khabat Vahabi, Stephan Hachinger, Markus Möller, Carsten Hoffmann, Lars Bernard, James M. Anderson, Sarah Fischer, Markus Reichstein, Mélanie Weynants, Carsten Keßler, Katharina Koch, Klaus-Peter Wenz, Nicole van Dam, and Babette Regierer

Many research communities and disciplines undergo a transformation towards promoting, facilitating, and recognising FAIRness (Wilkinson et al., 2016; https://doi.org/10.1038/sdata.2016.18) and Openness in Research Data Management (RDM) practices. These transformations require buy-in from stakeholders at multiple levels and warrant many conversations across all roles to be sustainable. One approach to facilitate and document the requested stakeholders’ ownership is the use of so-called commitments, where public endorsements by individuals or organisations serve as a driver to normalize desirable practices and offerings. Commitments can establish a community norm, whose practices may eventually turn into standards, requirements and guarantees.

The Earth System Sciences (ESS) consortium of the German Research Data Infrastructure (NFDI) programme, NFDI4Earth (https://nfdi4earth.de/), and the NFDI consortium for the agrosystems research community, FAIRagro (https://fairagro.net), take deliberate steps to initialize cultural change in the form of commitments. The NFDI4Earth and FAIRagro FAIRness and Openness Commitments (https://doi.org/10.5281/zenodo.10123880, published in September 2024; https://doi.org/10.5281/zenodo.14925202 from February 2025) help to start conversations about changing the way that research data is collected, created, published, used, and recognised, and request institutions to engage in the implementation and operation of FAIR RDM and related services. The signature of members and representatives of the respective communities signals agreement with the goals and values of the Commitments and with the consortia's missions, products, and services. The signatories build a community of practice that takes into account diverse expertises, roles, and user groups for a sustainable shift towards more and diversified FAIR research outputs, and increasing adoption of Open Science and Open Research principles and practices.

The Commitments consist of two matching main statements and twelve supporting statements. The main statements are: (1) We commit to advance FAIRness and Openness in Earth System Science/Agricultural Sciences and beyond. (2) We value data infrastructures and data experts. The supporting statements concretise the engagement and give starting points for the implementation. Changes in the supporting statements enabled FAIRagro to incorporate community-specific aspects in its adoption of the NFDI4Earth Commitment. The NFDI4Earth and FAIRagro Commitments have 8 and 7 institutional signatories and 70 and 54 group or individual signatories, respectively (https://nfdi4earth.de/commitment, https://fairagro.net/en/commitment/).

In this work, we present the two Commitments and recap the process for their creation (cf. https://doi.org/10.5194/egusphere-egu23-14456), their differences, and lessons learned. We report on the interactions sparked by the Commitments with community stakeholders. We focus on the role of organisations and groups, because they are crucial to implement cultural change: they can set requirements, provide incentives for their members, and match these with supporting services and infrastructures. Specifically, we report from an exchange of experiences between representatives of institutional and group signatories from a workshop that connected institutions, created a space for open exchange, and laid a foundation for generalisable approaches.

How to cite: Nüst, D., Sennhenn, A., Seegert, J., Hübner, A., Vahabi, K., Hachinger, S., Möller, M., Hoffmann, C., Bernard, L., Anderson, J. M., Fischer, S., Reichstein, M., Weynants, M., Keßler, C., Koch, K., Wenz, K.-P., van Dam, N., and Regierer, B.: FAIRness and Openness Commitments as a catalyst for cultural change in research organisations, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-6870, https://doi.org/10.5194/egusphere-egu26-6870, 2026.

EGU26-7107 | ECS | Posters on site | ESSI3.2

Automated workflows for ever-growing, analysis-ready datasets at the Barbados Cloud Observatory 

Rowan Orlijan-Rhyne, Lukas Kluft, and Tobias Kölling

The Barbados Cloud Observatory (BCO), in continuous operation by the Max Planck Institute for Meteorology, offers an extensive record of clouds in the trade-wind region since its establishment in 2010. In the form of public, analysis-ready Zarr stores processed with automated workflows, the record can be studied at time scales from seconds to years and serves to drive theoretical and model advancements. As an important geoscientific research asset, data from the BCO are trustworthy, reproducible, and versioned, but also easily available.

BCO data processing employs Apache Airflow's automated workflows, which append to Zarr stores whenever new data arrive. Managing dynamic and growing datasets—as opposed to static (e.g. campaign) datasets—permits many versions, all of which are accurate and can be automatically regenerated. In shepherding the data, we choose our own unique keys, including dataset version numbering, which make up an intake catalog. We also implement quality control of dataset metadata and encodings with in-house tools.
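
A minimal sketch of this append-on-arrival pattern, using Airflow's TaskFlow API; the store path, schedule, and incoming-file location are hypothetical stand-ins rather than the BCO production configuration:

```python
# Append newly arrived observations to an ever-growing Zarr store.
from datetime import datetime

import xarray as xr
from airflow.decorators import dag, task

STORE = "/data/bco/radar.zarr"  # hypothetical target Zarr store

@dag(schedule="@daily", start_date=datetime(2010, 4, 1), catchup=False)
def bco_append():
    @task
    def append_new_data():
        new = xr.open_dataset("/incoming/latest.nc")  # hypothetical raw file
        # Appending along time extends the store without rewriting history.
        new.to_zarr(STORE, mode="a", append_dim="time")

    append_new_data()

bco_append()
```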

By allowing for rolling processing of the data, often at daily intervals, our products can be easily probed for scientific, technical, and other use. For instance, we have developed a JavaScript viewer which allows users to quickly and easily visualize data from many instruments. Additionally, by providing raw (i.e. directly from the instrument, as format permits), time-aggregated, commonly gridded, and sitewide 'best estimate' datasets, we also iterate on levels of processing complexity for a host of needs. These usability advantages are consequences of our technical approach, namely automated workflows and analysis-ready Zarr stores.

How to cite: Orlijan-Rhyne, R., Kluft, L., and Kölling, T.: Automated workflows for ever-growing, analysis-ready datasets at the Barbados Cloud Observatory, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-7107, https://doi.org/10.5194/egusphere-egu26-7107, 2026.

EGU26-7717 | Orals | ESSI3.2

Integrating biodiversity in Situ data, Earth observation and stakeholder engagement - from machine- to policy-actionability 

Claus Weiland, Lena Perzlmaier, Daniel Bauer, Jonas Grieb, Julian Oeser, Taimur Khan, Sharif Islam, and Niels Raes

The EU’s Biodiversity Strategy for 2030, a core part of the European Green Deal, addresses the complex relationship between human society and its environment by prioritizing the restoration of ecosystems and building resilience against climate change, deforestation, and biodiversity loss.

These environmental stressors do more than just degrade ecosystems; they create a pressing need for policymakers, researchers, and society to actively track and mitigate ecological shifts. In order to design effective mitigation strategies, new political frameworks and massive simulation infrastructures are being developed with the aim to establish a common European Green Deal Data Space. The involved initiatives rely on the integration and standardization of diverse, large-scale datasets, ranging from long-term biodiversity records (e.g., eDNA) to real-time IoT sensor data (e.g., camera traps) and global Earth observation (EO) data combined with model-derived reanalysis datasets like ERA5.

‘Biodiversity Meets Data’ (BMD) is a Horizon Europe project delivering a unified access point for AI-assisted biodiversity monitoring and cross-realm (terrestrial, marine, freshwater) analysis tools representing a key contribution to the thematic expansion of the European Green Deal Data Space ecosystem. By providing a robust technical infrastructure, BMD facilitates the quantification of diverse ecological pressures - ranging from climate change to land-use shifts - on biodiversity. The project is strategically focused on the EU Natura 2000 network, equipping stakeholders such as conservation managers and policy makers with the necessary tools to implement and evaluate EU Nature Directives such as the Birds and Habitats Directives.

In this talk, we will present how BMD leverages FAIR Digital Objects (FDOs) and data space concepts around governance, licensing, and provenance tracking to synthesize computational workflows and diverse datasets into actionable knowledge units (“Workflow Run RO-Crate”, Figure 1). We will demonstrate our implementation path for such data-rich, self-contained digital containers, building on web-based technologies such as RO-Crate (lightweight data packages) and FAIR Signposting (a machine-interpretable layer describing resources). These webby FDOs are designed to bridge the gap between the practical needs of conservation stakeholders, such as supporting data-driven decision making, and the technical capabilities of the Green Deal Data Space ecosystem.
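
As a concrete illustration of the Signposting layer, a client can discover the RO-Crate metadata behind a resource purely from typed links in the HTTP Link header; the landing-page URL below is a hypothetical placeholder:

```python
# Follow FAIR Signposting: typed links in the HTTP Link header point machines
# at metadata documents without any HTML scraping.
import requests

resp = requests.head("https://example.org/bmd/dataset/123", allow_redirects=True)
links = requests.utils.parse_header_links(resp.headers.get("Link", ""))
# rel="describedby" conventionally links to the machine-readable metadata,
# e.g. an ro-crate-metadata.json document.
metadata_urls = [l["url"] for l in links if l.get("rel") == "describedby"]
print(metadata_urls)
```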

Integration of targeted feedback from stakeholders, notably Natura 2000 site managers, into our development process ensures that the FAIR-compliant data products and FDO service framework are not only technically robust, but also socially and politically actionable.

Figure 1. Throughout its life cycle in the BMD data space, data is represented as RO-Crate. Initially (left), the data and the computational workflow are bundled as a Workflow RO-Crate. Following processing, this is combined with the results and enriched with retrospective provenance and metadata to form a Workflow Run RO-Crate (right). Finally, these are presented as webby FAIR Digital Objects, incorporating a machine-interpretable layer based on FAIR Signposting (bottom).

How to cite: Weiland, C., Perzlmaier, L., Bauer, D., Grieb, J., Oeser, J., Khan, T., Islam, S., and Raes, N.: Integrating biodiversity in Situ data, Earth observation and stakeholder engagement - from machine- to policy-actionability, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-7717, https://doi.org/10.5194/egusphere-egu26-7717, 2026.

EGU26-7771 | ECS | Orals | ESSI3.2

Standardizing and encouraging best practices in tephra sample and data collection 

Abigail Nalesnik, Kristi Wallace, Andrei Kurbatov, Kerstin Lehnert, and Stephen Kuehn

The tephra research community spans diverse disciplines—from volcanology to archaeology—but faces persistent challenges due to fragmented databases and limited data accessibility. To address these issues, the global tephra community has developed best practices for standardized data collection and reporting, documented in Wallace et al. (2022; zenodo.org/records/6568306). These guidelines and templates for physical and geochemical datasets promote FAIR principles by improving data consistency, discoverability, and interoperability. Implementing these practices can significantly enhance multidisciplinary research and foster collaboration.

To advance data discovery and accessibility, the tephra community has partnered with the Interdisciplinary Earth Data Alliance (IEDA²) to create the Tephra Information Portal (TIP). TIP serves as an integrated framework that connects tephra data from existing cyberinfrastructures—such as EarthChem, PetDB, GeoDIVA, SESAR, TephraBase, and StraboSpot—allowing users to search across tephra platforms using common criteria, enhancing data findability and reuse. Standardized data submissions to these platforms are therefore critical for improving the findability of samples and datasets through TIP, and their adoption is strongly encouraged by the tephra community.

How to cite: Nalesnik, A., Wallace, K., Kurbatov, A., Lehnert, K., and Kuehn, S.: Standardizing and encouraging best practices in tephra sample and data collection, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-7771, https://doi.org/10.5194/egusphere-egu26-7771, 2026.

EGU26-7777 | Posters on site | ESSI3.2

Reproducible, transparent and traceable cleaning of IOC Tide Gauge Data 

Thomas Saillour and Panagiotis Mavrogiorgos

Accurate tide gauge records are essential for coastal monitoring, sea level analysis, and the calibration and validation of numerical models. However, records from global sea level data providers such as the Intergovernmental Oceanographic Commission (IOC) [1] often contain inconsistencies related to vertical datums, step changes, sensor noise, and undocumented interventions, which limit their direct applicability for modelling and validation purposes.

We present ioc_cleanup (github.com/oceanmodeling/ioc_cleanup), an open-source Python repository designed to clean tide gauge time series using a reproducible and transparent workflow defined in structured JSON files. All transformations are traceable and version-controlled using Git, allowing for consistent quality control, peer review, and community-driven improvements. The framework explicitly addresses common data quality issues, including spikes, sensor noise, sensor replacement or substitution, and step changes, as well as the challenge of distinguishing bad data from genuine physical events such as storm-driven sea level extremes or tsunamis.
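
As a hedged sketch of what such a JSON-defined workflow can look like (the field names and correction types below are illustrative, not the actual ioc_cleanup schema):

```python
# Apply a station's JSON correction spec to a tide gauge time series.
import json

import pandas as pd

spec = json.loads("""{
  "station": "example",
  "drop_spans": [["2021-03-01T00:00", "2021-03-01T06:00"]],
  "offsets": [{"start": "2022-01-15T00:00", "value_m": -0.12}]
}""")

def apply_cleanup(series: pd.Series, spec: dict) -> pd.Series:
    out = series.copy()
    for start, end in spec["drop_spans"]:
        out.loc[start:end] = float("nan")         # remove flagged bad data
    for off in spec["offsets"]:
        out.loc[off["start"]:] += off["value_m"]  # undo a documented step change
    return out
```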

The cleaned datasets have been used for the calibration and validation of a global barotropic model, revealing systematic data quality patterns across stations and regions. While the framework is applied here to sea level data, the methodology is provider-agnostic and applicable to other geophysical time series.

By formalising expert-driven flagging and corrections in a transparent manner, ioc_cleanup provides a foundation for future developments, including the potential use of machine learning techniques to assist data flagging, reduce operator subjectivity, and extend spatial and temporal coverage. The framework offers a scalable contribution to other datasets (such as GESLA4 [2]) and supports reproducible coastal data curation.

Citations:
[1] Flanders Marine Institute (VLIZ); Intergovernmental Oceanographic Commission (IOC) (2025): Sea level station monitoring facility. Accessed at https://www.ioc-sealevelmonitoring.org/ on 2025-12-15 at VLIZ. DOI: 10.14284/482

[2] Haigh, I.D., Marcos, M., Talke, S.A., Woodworth, P.L., Hunter, J.R. & Hague, B.S. et al. (2023) GESLA Version 3: A major update to the global higher-frequency sea-level dataset. Geoscience Data Journal, 10, 293–314. Available from: https://doi.org/10.1002/gdj3.174

How to cite: Saillour, T. and Mavrogiorgos, P.: Reproducible, transparent and traceable cleaning of IOC Tide Gauge Data, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-7777, https://doi.org/10.5194/egusphere-egu26-7777, 2026.

EGU26-9152 | Posters on site | ESSI3.2

istSOS4Things - FAIR & Open Source IoT platform for Open Science 

Massimiliano Cannata, Daniele Strigaro, and Claudio Primerano

Sensor-based environmental monitoring is increasingly vital for research and decision-making, yet the current web standards used to share these data streams, such as the OGC SensorThings API (STA), do not fully support scientific reproducibility, data provenance, or data sovereignty. To meet reproducibility requirements, researchers often resort to downloading and archiving static snapshots of evolving time-series datasets, leading to unnecessary data duplication, loss of linkage with live sources, and inefficient data management.

istSOS4Things (www.istsos.org) aims to close this critical gap by extending the STA standard with versioning and time-travel capabilities, enabling data auditing and persistent, immutable access to historical states of sensor observations through persistent URLs. Much like Git allows access to past versions of code, the proposed STA-traveltime extension lets users cite, query and extract the exact dataset used in a study, even years later.
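
A hedged sketch of how such a time-travelled query might look against a SensorThings endpoint; the base URL and the name of the time-travel parameter are illustrative assumptions, and the actual extension syntax may differ:

```python
# Query Observations "as of" a past instant, so the response reproduces
# exactly what the dataset looked like at that moment.
import requests

base = "https://example.org/istsos4/v1.1"   # hypothetical STA deployment
params = {
    "$filter": "phenomenonTime ge 2024-01-01T00:00:00Z",
    "$as_of": "2024-06-30T12:00:00Z",       # assumed time-travel parameter
}
obs = requests.get(f"{base}/Observations", params=params).json()
print(len(obs.get("value", [])), "observations as of mid-2024")
```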

This breakthrough addresses a long-standing limitation of geospatial web services and paves the way for fully FAIR (Findable, Accessible, Interoperable, Reusable) and reproducible research. In parallel, istSOS4Things introduces mechanisms for fine-grained access control embedded within the web service itself, empowering researchers and institutions to share their data in accordance with the principle of “as open as possible, as closed as necessary.” This helps overcome common hesitations about data sharing, ensuring trust, transparency, and legal compliance.

How to cite: Cannata, M., Strigaro, D., and Primerano, C.: istSOS4Things - FAIR & Open Source IoT platform for Open Science, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-9152, https://doi.org/10.5194/egusphere-egu26-9152, 2026.

EGU26-9158 | Orals | ESSI3.2

Making STAC FDO-ready: A Practical Path toward FAIR Digital Objects in Geoscientific Data Spaces 

Hannes Thiemann, Ivonne Anders, Marco Kulueke, Beate Kruess, and Karsten Peters-von Gehlen

FAIR Digital Objects (FDOs) provide an actionable framework for implementing the FAIR principles by combining persistent identifiers with machine-readable metadata, explicit typing, and structured relations. The FDO Forum, as an open, community-driven initiative, develops and coordinates specifications and reference concepts to support interoperable digital objects across infrastructures. A key challenge, however, is demonstrating how these specifications can be applied in practice within existing data ecosystems, where established domain standards and evolving collections must be integrated rather than replaced.

In this contribution, a practical implementation of FDO specifications is presented using the SpatioTemporal Asset Catalog (STAC) as an example. As a widely adopted standard for spatio-temporal data, STAC's modular design makes it an ideal bridge between established community practices and the FDO paradigm. The demonstration shows how STAC objects are transformed into typed FDOs using Handle-based PIDs and registered object types via a Data Type Registry (DTR). This approach enables machine-actionable navigation and interpretation that transcends domain-specific tooling.
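
To make the pattern concrete, here is a hedged sketch of resolving a Handle-based PID through the public Handle proxy REST API and fetching the STAC object it references; the handle value is a hypothetical placeholder, and we assume the record carries a URL-typed entry:

```python
# Resolve a typed FDO's handle, then retrieve the STAC object it points to.
import requests

hdl = "21.T11148/0000-example"  # hypothetical handle
record = requests.get(f"https://hdl.handle.net/api/handles/{hdl}").json()
# A handle record is a list of typed values; we assume one entry of type
# "URL" that resolves directly to the STAC item or collection.
url = next(v["data"]["value"] for v in record["values"] if v["type"] == "URL")
stac_obj = requests.get(url).json()
print(stac_obj["id"], stac_obj.get("type"))
```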

The approach is illustrated using a STAC-based catalog developed at the German Climate Computing Center (DKRZ), reflecting typical characteristics of climate research and climate modelling data, such as evolving and versioned collections and multiple levels of aggregation. The focus is on the practical application of FDO specifications, illustrating how typing, identifiers, and relations can be introduced in a standards-compliant manner without disrupting existing infrastructures, while enabling stable referencing, automated discovery, and seamless integration into data-processing workflows.

The results show that implementing FDO specifications through STAC is a pragmatic and transferable pathway from specification-level concepts to operational adoption. The implementation enables the creation of interoperable, machine-actionable data spaces while building on established standards and tooling, and provides lessons learned for other infrastructures aiming to operationalize FAIR Digital Objects in practice.

How to cite: Thiemann, H., Anders, I., Kulueke, M., Kruess, B., and Peters-von Gehlen, K.: Making STAC FDO-ready: A Practical Path toward FAIR Digital Objects in Geoscientific Data Spaces, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-9158, https://doi.org/10.5194/egusphere-egu26-9158, 2026.

EGU26-9202 | Posters on site | ESSI3.2

PID-Driven Global Access to Flagship km-scale Climate Simulation Data 

Karsten Peters-von Gehlen, Kameswar Rao Modali, Florian Ziemen, Martin Bergemann, Christopher Kadow, Karl-Hermann Wieners, Siddhant Tibrewal, Ivonne Anders, Katharina Berger, Tobias Kölling, Lukas Kluft, Marco Kulüke, and Fabian Wachsmann

The climate science enterprise both produces and depends on extremely large datasets in order to meet the needs of diverse scientific and downstream user communities, especially as climate models are increasingly run at kilometre-scale resolutions, resulting in rapidly growing data volumes that increase demands on data handling infrastructures. Individual flagship simulations are no longer used by a single research group, but are routinely reused by dozens or even hundreds of researchers globally. Consequently, data findability, accessibility and reuse must be straightforward, data provenance must be transparent, and the full heritage of simulation data should be preserved in a machine-actionable manner to ensure scientific rigour, explainability and reproducibility.

In this contribution, we present a conceptual infrastructure-level approach developed within the WarmWorld project based on leveraging the versatility of globally unique persistent identifiers (PIDs) to address these challenges. Specifically, we illustrate that by assigning handles to simulation datasets already at the point of production, simulation data stored locally at an HPC data center can become part of a globally interoperable data ecosystem. In our concept, handle profiles contain a URL at which the dataset can be opened. Further, machine-actionable metadata, such as the detailed provenance information describing the employed model configuration or a data reuse license and citation, would be available from the handle landing page. Thus, the motivation behind the approach we follow here is akin to that of the FDO specifications.

Finalized simulation datasets would be exposed through globally accessible SpatioTemporal Asset Catalogs (STAC), where PIDs serve as the authoritative entry point for discovery and access. Data access would be handled by system libraries that resolve storage locations across heterogeneous storage tiers. Crucially, data access shall be designed to be globally open without the need for credentials, reflecting a strong demand from the climate research community, as clearly demonstrated during the WCRP kilometre-scale hackathon (May 2025).
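
A hedged sketch of the envisaged discovery path using pystac-client; the catalog URL, collection name, and the property carrying the PID are assumptions for illustration:

```python
# Search a global STAC catalog for simulation datasets and read their PIDs.
from pystac_client import Client

catalog = Client.open("https://stac.example.org")  # hypothetical catalog
search = catalog.search(collections=["icon-km-scale"], max_items=5)
for item in search.items():
    # We assume each item exposes its handle in a "pid" property.
    print(item.id, item.properties.get("pid"))
```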

Systematic assignment and pragmatic leveraging of handles assigned to locally stored datasets can thus enable scalable and interoperable access to flagship climate datasets across infrastructures and communities, effectively integrating traditionally closed HPC data environments into the global data space and facilitating interoperability with other large-scale data holdings.

How to cite: Peters-von Gehlen, K., Modali, K. R., Ziemen, F., Bergemann, M., Kadow, C., Wieners, K.-H., Tibrewal, S., Anders, I., Berger, K., Kölling, T., Kluft, L., Kulüke, M., and Wachsmann, F.: PID-Driven Global Access to Flagship km-scale Climate Simulation Data, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-9202, https://doi.org/10.5194/egusphere-egu26-9202, 2026.

EGU26-9425 | ECS | Orals | ESSI3.2

Advancing FAIR Digital Objects for Machine-Actionable Research: Integrating Semantic Enrichment in Research Object Ecosystems 

Adam Rynkiewicz, Raul Palma, Paulina Poniatowska-Rynkiewicz, and Malgorzata Wolniewicz

Achieving higher levels of FAIR-ness for research artefacts demands not only structured packaging but also semantic enrichment that links textual resources to knowledge bases. ROHub, a reference platform implementing the Research Object paradigm, enables scientists to package and share research outputs as structured Research Object Crates (RO-Crates) - combining data, methods, software, and associated metadata into a unified, machine-processable entity. 

While RO-Crates inherently improve metadata richness and FAIR compliance by aggregating diverse resources with persistent identifiers and schema-based annotations, many research outputs still contain unlinked textual artefacts (e.g., reports, questionnaires, narratives) whose contextual semantics remain underutilized. Manual semantic annotation to link these textual elements to external knowledge bases - such as domain ontologies or vocabularies - is time-consuming and error-prone, yet crucial for enhancing findability, semantic interoperability, and machine-actionability. 

To address this gap, we extend ROHub with an automated semantic annotation service that identifies entities within text resources and links them to relevant knowledge bases, producing enriched metadata that feeds back into the RO-Crate structure. This service integrates entity linking techniques to reduce manual curation overhead and systematically increase the FAIRness and discoverability of research objects - making them more amenable to machine discovery, integration, and automated workflows. The result is a FAIR research object ecosystem where textual content, semantic context, and structured metadata co-exist in a machine-processable form, enhancing both human and computational reuse.

How to cite: Rynkiewicz, A., Palma, R., Poniatowska-Rynkiewicz, P., and Wolniewicz, M.: Advancing FAIR Digital Objects for Machine-Actionable Research: Integrating Semantic Enrichment in Research Object Ecosystems, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-9425, https://doi.org/10.5194/egusphere-egu26-9425, 2026.

EGU26-10308 | ECS | Posters on site | ESSI3.2

Interoperable CSV for Environmental Data Archival and Exchange– iCSV 

Patrick Leibersperger, Mathias Bavay, Ionut Iosifescu Enescu, and Chase Núñez

Environmental research relies on seamless data exchange between institutions globally, but inadequate documentation and complex formats hinder collaboration. We introduce iCSV, a self-describing, human-readable format that combines the simplicity of CSV with the metadata richness of NetCDF/CF. iCSV ensures long-term interpretability, interoperability and user accessibility, addressing key challenges in environmental data stewardship. By embedding structured metadata directly in a human-readable text file, iCSV enables automated validation, supports FAIR principles and lowers the barrier to data sharing and reuse, while ensuring data remains interpretable for future users and maintaining broad compatibility with existing software. This work motivates the need for a simple, self-describing tabular format for environmental time series, presents the iCSV specification, positions it within existing binary and human-readable format ecosystems through comparative analysis, and discusses current limitations with directions for future improvements.
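
A hedged sketch of what a minimal iCSV file could look like, based on the concept described above (commented metadata sections preceding plain CSV data); the section and key names are illustrative and may differ from the published specification:

```
# iCSV 1.0 UTF-8
# [METADATA]
# field_delimiter = ,
# station_id = EXAMPLE1
# latitude = 46.81
# longitude = 9.85
# [FIELDS]
# fields = timestamp,air_temperature
# units = ISO8601,degC
# [DATA]
2026-01-01T00:00:00,-7.3
2026-01-01T01:00:00,-7.9
```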

How to cite: Leibersperger, P., Bavay, M., Enescu, I. I., and Núñez, C.: Interoperable CSV for Environmental Data Archival and Exchange– iCSV, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-10308, https://doi.org/10.5194/egusphere-egu26-10308, 2026.

EGU26-10344 | Orals | ESSI3.2

FAIR Assessment in Geo-INQUIRE: Lessons Learned from Two Years of Experience 

Otto Lange, Laurens Samshuijzen, Enoc Martinez, Javier Quinteros, Helle Pedersen, Angelo Strollo, Carine Bruyninx, Florian Haslinger, Marc Urvois, Laurentiu Danciu, and Anna Miglio

The Geo-INQUIRE* project is an initiative in which, in a cross-domain setting, the European ESFRI landmark environmental research infrastructures EPOS, EMSO, and ECCSEL, the Center of Excellence for Exascale Computing ChEESE, and the ARISE infrasound community exploit innovative techniques to meet their FAIR data ambitions. At EGU25 we informed the audience about the project's data management objectives and the strategies applied to translate the abstract concept of FAIRness into practices that could be widely adopted in a large, heterogeneous landscape of data producers. Specifically, we demonstrated how we established a pipeline for the assessment of levels of FAIRness by integrating the F-UJI tool. This Geo-INQUIRE FAIRness Assessment Pipeline (GiFAP) has now been in use for about two years, during which it has proven to be a valuable instrument for the ongoing evaluation of the FAIRness of multiple datasets over time. However, interpreting and comparing snapshots of the value collections is by no means trivial and must be managed and communicated with care.

Because integrating an assessment tool like F-UJI always means adopting a solution that is itself under active development, which can hinder the reproducibility of outcomes, special care must be taken with respect to the versions used of both the tool itself and the underlying metrics framework. It is also essential to understand how choices made during repeated assessment across time affect the FAIR scores and their subsequent interpretation. The practical use of the overall pipeline as a tool to guide improvements in the FAIRness of data, mainly by adapting and improving the metadata, has revealed valuable insights into the subtleties of applying the FAIR data concept in different communities and to different data types.

As an important real-world example of applying the FAIR concept in a complex, dynamic data-lifecycle setting, we will explain how we technically integrated the F-UJI instrument into the existing infrastructure. A special focus will be put on possible pitfalls and their solutions regarding versioning issues that naturally arise when comparisons are made over a longer period of time. The importance of managing expectations, the dependency on data managers, and the interaction with applications for long-tail researchers will be discussed, and we will explain how we addressed these within the project. Finally, we will explain how the Geo-INQUIRE solution could be adopted for comparable scenarios.

* Geo-INQUIRE is funded by the European Union (GA 101058518)



How to cite: Lange, O., Samshuijzen, L., Martinez, E., Quinteros, J., Pedersen, H., Strollo, A., Bruyninx, C., Haslinger, F., Urvois, M., Danciu, L., and Miglio, A.: FAIR Assessment in Geo-INQUIRE: Lessons Learned from Two Years of Experience, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-10344, https://doi.org/10.5194/egusphere-egu26-10344, 2026.

EGU26-11300 | Posters on site | ESSI3.2

FAIR-compliant infrastructure based on istSOS for high-resolution rainfall monitoring and alerting 

Daniele Strigaro, Massimiliano Cannata, Claudio Primerano, and Andrea Salvetti

In Switzerland, short-duration and spatially concentrated rainfall events increasingly affect small catchments, where limited response times can lead to flash floods and debris flows with significant impacts on local infrastructure. These phenomena typically develop at spatial and temporal scales that are not fully captured by conventional meteorological monitoring networks.

Recent events in Southern Switzerland, including in the municipality of Lumino, have shown how localized precipitation can rapidly overload drainage systems and watercourses. Such situations highlight the need for rainfall observations with higher spatial density and minute-scale temporal resolution, able to complement regional forecasting and warning services.

National early warning systems, including those provided by MeteoSwiss, form a key component of flood risk management but may not resolve precipitation variability at local scales. To complement these systems, SUPSI and the Canton Ticino’s Ufficio dei corsi d’acqua (UCA) are testing a denser rainfall monitoring network based on rain gauges delivering one-minute data streams in near real time.

The monitoring infrastructure is designed according to FAIR data principles, ensuring that observations are findable, accessible, interoperable, and reusable. Data are managed through a cloud-based, event-driven architecture built on open geospatial standards, notably the OGC SensorThings API, implemented using the istSOS framework. Incoming data streams are processed on a computing cluster to derive cumulative rainfall indicators at multiple temporal scales (10-minute, hourly, and three-hourly), which are used to support threshold-based alerting mechanisms.
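
A hedged sketch of this derived-indicator step: rolling rainfall sums over several windows of a one-minute series, compared against alert thresholds (the thresholds and the synthetic data are illustrative placeholders, not the operational configuration):

```python
# Compute cumulative rainfall indicators and raise threshold-based alerts.
import numpy as np
import pandas as pd

idx = pd.date_range("2026-06-01", periods=180, freq="min")
rain = pd.Series(np.random.gamma(0.3, 1.0, len(idx)), index=idx)  # synthetic mm/min

thresholds_mm = {"10min": 10.0, "1h": 25.0, "3h": 40.0}  # illustrative thresholds
for window, threshold in thresholds_mm.items():
    cumulative = rain.rolling(window).sum()
    exceeded = cumulative[cumulative >= threshold]
    if not exceeded.empty:
        print(f"ALERT {window}: {exceeded.iloc[-1]:.1f} mm at {exceeded.index[-1]}")
```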

By combining high-resolution observations with open, standards-based data services, the system enables real-time visualization, automated notifications, and seamless integration with existing hydrological and risk management workflows. This approach demonstrates how FAIR-by-design monitoring infrastructures can bridge the gap between regional forecasts and local-scale observations, strengthening early warning capabilities and supporting more resilient flood risk management in a changing climate.

How to cite: Strigaro, D., Cannata, M., Primerano, C., and Salvetti, A.: FAIR-compliant infrastructure based on istSOS for high-resolution rainfall monitoring and alerting, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-11300, https://doi.org/10.5194/egusphere-egu26-11300, 2026.

EGU26-11768 | Posters on site | ESSI3.2

A Cloud-Native GNSS Data Lakehouse for Scalable Ingestion, Processing, and Analysis 

Nils Brinckmann and Markus Bradke

The rapid growth of Global Navigation Satellite System (GNSS) observations, driven by dense station networks, high-rate data streams, and the modernisation of satellite constellations, places increasing demands on data centers in terms of scalability, reliability, and reproducibility. Traditional monolithic GNSS data management systems are often difficult to scale and adapt to evolving processing and analysis workflows. To address these challenges, we are developing a cloud-native GNSS data center architecture based on container orchestration and streaming technologies.

Our system is built on Kubernetes to enable flexible deployment, horizontal scalability, and fault tolerance of GNSS services. Data ingestion is handled through Apache Kafka, which provides a robust, high-throughput messaging backbone for streaming GNSS observations from heterogeneous sources. This approach decouples data producers and consumers, allowing independent scaling of ingestion, processing, and downstream analytics.
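
A hedged sketch of this decoupled ingestion pattern: a producer publishing GNSS observation messages to a Kafka topic (broker address, topic name, and message schema are illustrative assumptions, not the operational system):

```python
# Publish one GNSS observation to a Kafka topic; consumers scale independently.
import json

from kafka import KafkaProducer

producer = KafkaProducer(
    bootstrap_servers="kafka.example.org:9092",   # hypothetical broker
    value_serializer=lambda v: json.dumps(v).encode("utf-8"),
)
observation = {
    "station": "POTS00DEU", "epoch": "2026-01-01T00:00:00Z",
    "sat": "G05", "obs_code": "C1C", "value": 21636418.23,
}
# Keying by station keeps each station's messages ordered within a partition.
producer.send("gnss.observations", key=b"POTS00DEU", value=observation)
producer.flush()
```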

For long-term storage and analytical access, GNSS data are ingested via ETL pipelines into an Apache Iceberg data lakehouse. Iceberg provides schema evolution, partition management, and ACID (Atomicity, Consistency, Isolation, and Durability) guarantees, enabling efficient access to large, time-series GNSS datasets for both batch and interactive analysis.

System performance, data flow, and service health are continuously monitored using Prometheus, with operational and scientific metrics visualized through Grafana dashboards. This monitoring framework facilitates operational stability, performance optimization, and transparent reporting of data latency and availability.

We present the overall system design, implementation details, and initial performance results, and discuss how this architecture improves scalability, resilience, and reproducibility compared to conventional GNSS data centers. The proposed approach provides a flexible foundation for next-generation GNSS services and can be extended to other geodetic and Earth observation data streams.

How to cite: Brinckmann, N. and Bradke, M.: A Cloud-Native GNSS Data Lakehouse for Scalable Ingestion, Processing, and Analysis, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-11768, https://doi.org/10.5194/egusphere-egu26-11768, 2026.

EGU26-13109 | ECS | Orals | ESSI3.2

A harmonized, modular data quality framework facilitating cross-disciplinary usage and time-efficient evaluation of geospatial data 

Barbara Riedler, Sophia Klaußner, Stefan Lang, and Khizer Zakir

The increasing availability of spatial data, coupled with the utilization of artificial intelligence, makes it essential to focus on the evaluation of data quality. At the same time, the fragmentation of existing quality frameworks hinders the attainment of comparable assessment results. We introduce a novel, modular framework for the evaluation of geospatial data quality, with particular emphasis on FAIRness, transferability, reusability and spatial consistency. The framework thereby accommodates data of differing processing levels, types and contexts. The hierarchical structure integrates common quality dimensions (e.g., completeness, accuracy, consistency) with new dimensions emphasizing upstream validity (metadata, traceability of input data, reproducibility) and downstream usability (applicability, transferability). Additionally, the framework enables the evaluation of two interlinked concepts: general data quality (DQ) and data adequacy (DA). The latter incorporates the relevance of data and the fit to use case-specific requirements. DQ and DA are measured through a combination of machine-evaluable metrics and structured expert judgment, aggregated as indicators at dimension and domain level. The assessment protocol is implemented in the form of a spreadsheet and a web-based survey tool. The overall objectives of this development are (1) to achieve harmonization of existing quality concepts to facilitate cross-disciplinary data integration; (2) to support data selection processes in geospatial applications which involve multiple data sources and/or time-critical situations, through the reusability of evaluation results; and (3) to foster reflective data usage and integration into operational workflows through the consideration of spatial uncertainties and the implementation of aspects of FAIRness.

How to cite: Riedler, B., Klaußner, S., Lang, S., and Zakir, K.: A harmonized, modular data quality framework facilitating cross-disciplinary usage and time-efficient evaluation of geospatial data, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-13109, https://doi.org/10.5194/egusphere-egu26-13109, 2026.

EGU26-13278 | Orals | ESSI3.2

Is FAIR Sufficient for Interactive Data Services? Ensuring Sustainability and Reliability of the IPCC WGI Interactive Atlas 

Martina Stockhause, José Manuel Gutiérrez, Ezequiel Cimadevilla Alvarez, Maialen Iturbide, Lina Sitz, and Antonio S. Cofiño

The FAIR principles — Findable, Accessible, Interoperable, and Reusable — underpin Open Science but do not fully ensure the long-term usability of interactive data services like the IPCC WGI Interactive Atlas. Drawing on lessons learned from developing and operating the Interactive Atlas, this presentation explores the challenges of sustaining such services, which rely not only on FAIR-compliant data and software but also on continuous stewardship, infrastructure maintenance, and institutional commitment.

Scientific quality and transparency of the Interactive Atlas are supported through expert assessment by the IPCC authors, provenance documentation, and Complex Citation, which combines the attribution of credit for assessed digital objects with the traceability of digital IPCC results. Yet, sustaining reliability requires ongoing stewardship of both data and software to prevent degradation and preserve reproducibility. Addressing these needs demands joint efforts of the IPCC Data Distribution Centre (DDC) Partners to maintain data, documentation, and interactive components for a diverse user community. FAIR alone is not enough — long-term data preservation and infrastructure maintenance are essential to ensure the sustainability and trustworthiness of interactive data services in Earth system science.

By reflecting on both the successes and limitations of the Interactive Atlas, this contribution offers insights relevant to other Earth system science communities developing interactive or service-oriented data products. These approaches are also applicable to fields beyond Earth system science. 

How to cite: Stockhause, M., Gutiérrez, J. M., Cimadevilla Alvarez, E., Iturbide, M., Sitz, L., and Cofiño, A. S.: Is FAIR Sufficient for Interactive Data Services? Ensuring Sustainability and Reliability of the IPCC WGI Interactive Atlas, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-13278, https://doi.org/10.5194/egusphere-egu26-13278, 2026.

EGU26-13673 | Orals | ESSI3.2

Bridging fragmented terminologies: advancing vocabulary harmonization in Seismology through AI and community co-creation 

Juliano Ramanantsoa, Angelo Strollo, Florian Haslinger, Javier Quinteros, Daniele Bailo, Otto Lange, Laurens Samshuijzen, Sven Peter Naesholm, and Mathilde B. Sørensen

The conceptual clarity of any scientific field depends fundamentally on the precision and standardisation of its terminology. Prior studies have shown that an absence of standardized terminologies can lead to interpretive ambiguity, imprecise outputs, and divergent interpretations across research communities. In seismology, terminologies remain scattered across institutional glossaries, impeding data FAIRness (Findability, Accessibility, Interoperability, and Reusability), metadata consistency, and collaboration with adjacent fields such as transdisciplinary research and AI engineering.

This work, carried out within the Geo-INQUIRE* project, introduces a vocabulary generation framework and a prototype database implementing three integrated innovations that consolidate the sparse seismological terminologies into a structured, machine-readable format: i) authority-first retrieval, ii) AI-mediated semantic triangulation, and iii) participatory expert governance.

The authority-first pathway performs weighted, priority-ranked extraction from eight expert-curated data centre sources (including FDSN, USGS, EarthScope, EPOS, and other relevant community documents), ensuring that definitions originate from trusted references. The AI fallback pathway is activated only when authoritative retrieval fails, employing a semantic triangulation method in which three large language models - such as OpenAI's GPT-5.2, Anthropic's Claude Opus 4.5, and Google's Gemini 3 - independently generate candidate definitions. Embedding-based similarity analysis determines synthesis eligibility; if cross-model agreement falls below 50 percent, an expert flag is raised to prevent semantic uncertainty. When synthesis proceeds, a transparent concept-merging process extracts common and unique contributions from each model, recording all reasoning steps and preserving full provenance, thereby overcoming a critical limitation of black-box AI knowledge generation.
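
A hedged sketch of this agreement gate; the embedding model, the pairwise-minimum aggregation, and the 0.5 threshold are illustrative choices rather than the operational configuration:

```python
# Gate AI-synthesized definitions on cross-model embedding agreement.
from itertools import combinations

from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")
candidates = [  # hypothetical definitions from three different LLMs
    "Hypocentre: the point within the Earth where an earthquake rupture starts.",
    "Hypocentre: the subsurface location at which seismic rupture initiates.",
    "Hypocentre: the focus of an earthquake, the rupture nucleation point at depth.",
]
emb = model.encode(candidates, normalize_embeddings=True)
# With normalized embeddings, the dot product equals cosine similarity.
agreement = min(float(a @ b) for a, b in combinations(emb, 2))
needs_expert_review = agreement < 0.5
print(f"min pairwise agreement = {agreement:.2f}, flag = {needs_expert_review}")
```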

Beyond technical generation, this work embeds vocabulary development within a participatory framework that transforms terminology from static definitions into community-validated knowledge. Through structured digital deliberation involving more than ten domain experts via a GitHub-based workflow, the approach delivers transparency, auditability, and collective ownership. Experts validate AI-retrieved content, resolve edge cases, and steward terminology evolution through documented discussion threads, ensuring definitions reflect both institutional authority and practitioner consensus while fostering public trust in seismology.

The system produces vocabulary encoding scheme-compliant entries with dual definitions: an authoritative version weighted by source priority, and an AI-synthesized alternative with full provenance. The source-weighting mechanism is fully flexible ensuring the reusability of the framework. Applied to over 500 terms across 4 thematic clusters, this framework demonstrates that AI can systematically extend vocabulary completeness while participatory governance safeguards epistemic integrity. By coupling algorithmic precision with community oversight, this framework strengthens data discovery, metadata coherence, and research infrastructure interoperability across European and international seismological networks that advance transparent, reproducible, and interoperable seismological science.

*Geo-INQUIRE (Geosphere INfrastructures for QUestions into Integrated REsearch) is funded by the European Union (GA 101058518).

How to cite: Ramanantsoa, J., Strollo, A., Haslinger, F., Quinteros, J., Bailo, D., Lange, O., Samshuijzen, L., Naesholm, S. P., and Sørensen, M. B.: Bridging fragmented terminologies: advancing vocabulary harmonization in Seismology through AI and community co-creation, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-13673, https://doi.org/10.5194/egusphere-egu26-13673, 2026.

EGU26-14101 | ECS | Orals | ESSI3.2

Embedding Indigenous Data Governance in Research Data Infrastructures through Local Contexts 

Sarvenaz Ghafourian, Sean Tippett, and Chantel Ridsdale

Indigenous Data Sovereignty reflects the inherent rights of Indigenous Peoples to govern data relating to their communities, lands, and knowledge, while Indigenous Data Governance concerns how these rights are enacted within data systems. Translating this into practice within large-scale environmental data infrastructures remains a challenge.

Ocean Networks Canada (ONC) hosts long-term, near real-time coastal and oceanographic datasets that are widely reused across research, operational, and increasingly automated and machine-assisted workflows. In this context, ensuring that Indigenous governance expectations are clearly communicated and respected throughout the data lifecycle is critical. This work presents ONC’s ongoing efforts to implement Local Contexts Traditional Knowledge and Biocultural Labels and Notices as part of its research data management infrastructure, bridging ethical principles with operational practice.

We describe how Local Contexts information is being integrated into ONC’s metadata profiles, dataset landing pages, and persistent identifier workflows using established standards such as ISO 19115 and DataCite, making the metadata human- and machine-readable. This approach ensures that governance signals, including community-defined use expectations and restrictions, remain visible and interpretable to both human and machine users as data moves through downstream discovery platforms and reuse pathways.

This work is being undertaken as a pilot project and proof of concept, using ONC-owned datasets within the Local Contexts Test Hub. Due to capacity constraints faced by many Indigenous communities, full implementation with community-generated labels is not yet in place. Instead, this pilot allows ONC to explore technical integration pathways, identify challenges related to metadata standardization and machine-readability, and develop documentation, guidance, and technical support in advance. This approach is intentionally designed to ensure that, when communities are ready to engage, they are provided with clear resources and meaningful options for participation without undue technical burden.

This case study demonstrates how Indigenous Data Sovereignty can be meaningfully embedded into existing Earth science data infrastructures without compromising FAIR principles or interoperability. By operationalizing CARE-aligned governance within metadata and identifier systems, this work offers a practical, scalable model for repositories seeking to support ethical, transparent, and community-centred data reuse in the Earth and environmental sciences.

How to cite: Ghafourian, S., Tippett, S., and Ridsdale, C.: Embedding Indigenous Data Governance in Research Data Infrastructures through Local Contexts, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-14101, https://doi.org/10.5194/egusphere-egu26-14101, 2026.

EGU26-14565 | Posters on site | ESSI3.2

From Assessment to Action: ODATIS's Progressive Journey Toward FAIR Implementation in Ocean Sciences 

Erwann Quimbert and the ODATIS team

ODATIS, the ocean data hub within France's Data Terra research infrastructure, demonstrates how systematic progression from assessment through certification to innovation translates FAIR principles into sustainable community practices. Through three interconnected initiatives, ODATIS provides a replicable model for implementing FAIR while respecting domain-specific requirements.

Infrastructure Foundation

ODATIS operates through ten specialized Data and Service Centers (DSC) serving 130+ French research entities in physical oceanography, biogeochemistry, coastal observations, seafloor mapping, and marine ecosystems. This territorial network connecting national research infrastructure with local researchers provides the organizational foundation for systematic FAIR adoption. Two platforms anchor the infrastructure: SEANOE, a certified repository providing DOIs and preservation, and Sextant, a geographic catalog implementing ISO 19115 and OGC standards.

Assessment: The COPILOTE Project

Before imposing solutions, ODATIS assessed current capabilities through COPILOTE using the FAIR Data Maturity Model (FDMM). Evaluations revealed heterogeneous maturity levels and identified barriers: insufficient metadata, limited controlled vocabularies, unclear licensing, and inadequate provenance tracking. Participatory assessment engaged data managers and researchers in structured dialogue, transforming abstract FAIR concepts into concrete criteria. COPILOTE produced tailored improvement roadmaps demonstrating how standardized frameworks can respect institutional diversity while driving collective progress.

Certification: CoreTrustSeal Achievement

Building on assessment findings, ODATIS DSC pursued CoreTrustSeal certification, documenting organizational infrastructure, digital object management, and preservation capabilities. Successfully certified repositories including SEANOE achieved formal recognition of their trustworthiness, providing researchers with confidence in long-term data preservation and accessibility.

Innovation: The SO'Odatis Project

Funded by France’s National Fund for Open Science, SO'Odatis develops integrated services making FAIR intrinsic to workflows. The four initiatives are: launching a diamond open-access journal linking publications with datasets and software; extending Sextant to catalog software with DOIs and Software Heritage integration; developing automated data paper generation from metadata; and implementing comprehensive training through the correspondent network.

Cross-Disciplinary Lessons

ODATIS's journey demonstrates critical principles. Assessment before intervention reveals actual barriers and capabilities, preventing misdirected effort. Formal certification embeds FAIR into organizational culture beyond projects. Sustainable adoption requires reducing researcher burden through automation and workflow integration, not adding compliance tasks. Territorial networks enable bidirectional knowledge flow between infrastructure and communities. Critically, FAIR implementation is iterative: each phase builds on previous achievements while identifying new opportunities.

ODATIS offers a concrete roadmap: rigorous assessment identifies gaps; certification drives organizational maturity; innovation develops enabling tools; community engagement ensures relevance. This progression provides a replicable model for infrastructures translating FAIR principles into community-supported practices across Earth and environmental sciences.

How to cite: Quimbert, E. and the ODATIS team: From Assessment to Action: ODATIS's Progressive Journey Toward FAIR Implementation in Ocean Sciences, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-14565, https://doi.org/10.5194/egusphere-egu26-14565, 2026.

EGU26-15996 | Orals | ESSI3.2

Enhancing discoverability and impact of dispersed data through persistent identifiers in Australia 

Julia Martin, Kerry Levett, and Hamish Holewa

Australian environmental, biodiversity and climate research generates vast and diverse datasets from a wide variety of organisations across the research, government, public and private sectors, all with significant potential to inform research, management and policy. However, these data are frequently stored across multiple institutional and government repositories that lack consistent governance, sufficiently rich metadata and consistent application of externally agreed community standards that are fundamental to machine-to-machine discovery and interoperability. As a result, valuable long-tail data remain difficult to find, access and reuse, limiting their impact and hindering translation into decision-making and environmental management. National consultation led by the Australian Research Data Commons (ARDC) confirmed that poor discoverability of domain-specific data is a major barrier to research progress and evidence-based decision-making.

The Domain Data Portals (DDP) program, delivered through the ARDC Planet Research Data Commons, addresses this challenge by improving access to FAIR (Findable, Accessible, Interoperable and Reusable) environmental and climate data held in distributed repositories. The program equips data stewards with tools and capabilities to make long-tail datasets FAIR for knowledge creation. This program partners with the National Environmental Science Program (NESP), Australia’s longest-running environmental research initiative, and the Australian Plant Phenomics Network (APPN). NESP is led by the Australian Government Department of Climate Change, Energy, the Environment and Water (DCCEEW) and has 29 research partner organisations. NESP has four hubs in different environmental disciplines: 1) marine and coastal, 2) terrestrial ecology, 3) waste and sustainability, and 4) climate systems. APPN is an Australian National Collaborative Research Infrastructure Strategy (NCRIS) Facility with nine research nodes. The DDP program is working with data managers across the nodes and disciplines to harmonise data formats and workflows while respecting domain-specific requirements.

The program is delivering cohesive, domain-level discovery of NESP and APPN research outputs through a dedicated portal within ARDC Research Data Australia, which is a metadata aggregation service that enables findability, accessibility, and reuse of data for research from over one hundred Australian research organisations, government agencies, and cultural institutions. To enable Research Data Australia to programmatically harvest the NESP and APPN metadata into the relevant portal, ARDC and the DDP project leads have worked with the institutions and repositories in scope to develop guidelines on how to include relevant Persistent Identifiers in the metadata for their funded research outputs and ensure rich FAIR-compliant metadata. By developing rich, standardised metadata for all project outputs and leveraging national infrastructure, including persistent identifiers, controlled vocabularies and data publishing services, the DDP program enables robust, efficient aggregation and national discoverability of datasets.
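The sketch below illustrates the kind of machine-to-machine lookup that PID-bearing metadata enables, resolving a DOI's record through the public DataCite REST API; the DOI is a placeholder, and this generic pattern is not the program's actual harvesting code.

```python
# Sketch of PID-driven metadata retrieval via the public DataCite REST API.
# Aggregators such as Research Data Australia harvest comparable records at
# scale; the DOI below is a hypothetical placeholder.
import requests

def fetch_datacite_record(doi: str) -> dict:
    """Return the DataCite metadata attributes registered for a DOI."""
    resp = requests.get(f"https://api.datacite.org/dois/{doi}", timeout=30)
    resp.raise_for_status()
    return resp.json()["data"]["attributes"]

record = fetch_datacite_record("10.xxxx/example")  # placeholder DOI
print(record.get("titles"), record.get("relatedIdentifiers"))
```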

This approach supports consistent adoption of community standards and enhances data visibility, integration and reuse. The Domain Data Portals approach can be applied to other research communities in Australia to make their data FAIR, leveraging components of ARDC’s national information infrastructure.

How to cite: Martin, J., Levett, K., and Holewa, H.: Enhancing discoverability and impact of dispersed data through persistent identifiers in Australia, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-15996, https://doi.org/10.5194/egusphere-egu26-15996, 2026.

EGU26-16338 | Orals | ESSI3.2

Today’s research for tomorrow’s challenges – building national research infrastructure across the full data life cycle 

Tim Rawling, Angus Nixon, Bryant Ware, Alex Hunt, Jens Klump, Anusuriya Devaraju, Rebecca Farrington, and Lesley Wyborn

As the volume and complexity of Earth science data continue to grow, driven by the availability of advanced instrumentation and the requirement for new approaches to address geoscience questions and challenges, there is an increasing need for robust, end-to-end approaches to data management across the full data life cycle. Earth science datasets are, however, notoriously heterogeneous, spanning disciplines from geochemistry to geophysics and Earth observation, at observation levels from nanoscale to global, and amassing data volumes from megabytes to multi-petabyte collections. Yet for the vast majority of these datasets, the ‘raw’ observations collected by instrumentation, or Primary Observational Datasets (PODs), are not routinely reported or associated with the downstream, analysis-ready data products used to inform scientific or policy decisions. To enable reproducible and repurposable science, particularly in a context where technical advances continue to push the data requirements upstream towards the primary observations, these PODs must be preserved for potential future applications and linked with the outputs they underpin.

AuScope, Australia’s national geoscience research infrastructure funded through the National Collaborative Research Infrastructure Strategy (NCRIS), supports the geoscience community by providing data, data products, and software that align with the FAIR and CARE principles. Recognising that a single, monolithic repository cannot serve all disciplines, data types, or user communities, AuScope is developing an Earth Science Data Ecosystem that enables seamless access to PODs hosted across high-performance compute–data (HPC-D) and cloud environments, and provides pathways to connect raw observational data with curated, analysis-ready products delivered through distributed platforms and portals. A critical component of this ecosystem is strengthening digital infrastructure at the point of data generation and associating each primary observation with the published output. To address persistent challenges associated with manual data transfer, incomplete metadata capture, and limited long-term reuse, AuScope has embarked on the scoping and implementation of an Australian-first repository and capture system for PODs in geochemistry. By strengthening digital infrastructure at the point of data generation and embedding standards throughout the data life cycle, this work supports more efficient, interoperable, and collaborative Earth science research, maximising the long-term value of publicly funded data.

How to cite: Rawling, T., Nixon, A., Ware, B., Hunt, A., Klump, J., Devaraju, A., Farrington, R., and Wyborn, L.: Today’s research for tomorrow’s challenges – building national research infrastructure across the full data life cycle, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-16338, https://doi.org/10.5194/egusphere-egu26-16338, 2026.

EGU26-16876 | Orals | ESSI3.2

What happens when FAIR is built in from the start? Insights from the GOYAS Project 

Fernando Aguilar Gómez, Daniel García Díaz, Antonio López, Aina García-Espriu, and Cristina González-Haro

The Geospatial Open Science Yielding Applications (GOYAS) project, developed under the Horizon Europe OSCARS framework, demonstrates a comprehensive pathway from FAIR principles to operational practice for Earth observation (EO) data products. While the FAIR principles (Findable, Accessible, Interoperable, Reusable) are widely endorsed by research data communities, translating them into reproducible and scalable workflows across heterogeneous data providers remains challenging. This contribution presents concrete results and lessons learned from the GOYAS project, which has developed and implemented a FAIR-by-design system that supports community adoption and cross-disciplinary data reuse.

At its core, GOYAS comprises a set of customized software components, including an automated data production pipeline, a georeferenced data repository, and an OGC-standard API endpoint. The data ingestion pipeline integrates automation that reduces the initial effort required from data producers to generate FAIR data, by automatically producing standardized metadata, provenance information, and quality metrics as a by-product of routine processing. This approach enables transparency, consistency, and long-term reuse across all stages of the data lifecycle. To enforce the “F” of Findability, persistent identifiers (PIDs) are minted for mature data products using EOSC-Beyond services, ensuring persistent, machine-actionable references and reliable data product traceability.

A key outcome of GOYAS is the implementation of a validation framework that acts as a prerequisite for the publication of final data products, whereby persistent identifiers are assigned only to validated outputs. Each product undergoes:

  • Metadata standard validation, ensuring compliance with agreed schemas and machine-readability requirements (ISO 19139);

  • INSPIRE alignment, verifying that spatial data components meet European geospatial interoperability standards;

  • FAIRness evaluation using FAIR EVA (Evaluator, Validator and Advisor), assessing the degree to which products comply with FAIR principles through automated tests.

Only when all validation checks are successfully passed is a product considered mature for publication and assigned a persistent identifier (PID), thereby guaranteeing discoverability and long-term referenceability within EOSC and beyond.
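The sketch below illustrates this publication gate in miniature; all three check functions are hypothetical stand-ins for the real ISO 19139 schema validation, INSPIRE conformance testing, and FAIR EVA evaluation used in GOYAS.

```python
# Minimal sketch of the publication gate: a PID is assigned only after
# every validation step passes. All check functions are hypothetical
# stand-ins for the ISO 19139, INSPIRE, and FAIR EVA services.

def validate_iso19139(metadata: dict) -> bool:
    # Stand-in: require the fields an ISO 19139 record must carry.
    return all(k in metadata for k in ("title", "abstract", "extent"))

def validate_inspire(metadata: dict) -> bool:
    # Stand-in: real INSPIRE conformance is checked against the EU validator.
    return metadata.get("inspire_conformant", False)

def evaluate_fairness(metadata: dict) -> bool:
    # Stand-in: FAIR EVA returns per-principle scores; require a pass mark.
    return metadata.get("fair_score", 0.0) >= 0.75

def publish_if_mature(metadata: dict) -> str | None:
    """Return a (hypothetical) PID only if the product passes all checks."""
    checks = (validate_iso19139, validate_inspire, evaluate_fairness)
    if all(check(metadata) for check in checks):
        return "pid:10.xxxx/" + metadata["title"].replace(" ", "-").lower()
    return None  # product remains unpublished until the issues are fixed
```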

We discuss how FAIR-by-design principles were embedded at key architectural layers, including metadata generation, PID minting, and automated quality assessment, and how these design choices support not only technical interoperability but also community adoption. Lessons learned highlight the importance of early integration of FAIR requirements into workflow design, the practical challenges of harmonizing cross-domain standards (FAIR and INSPIRE), and the role of automation in enabling scalable FAIR implementations without imposing additional effort on data producers.

By providing a documented and operational model that combines FAIR principles, persistent identification, standards compliance, and automated validation, GOYAS advances the practical implementation of FAIR and open data management in environmental sciences and offers transferable insights for related research communities.

How to cite: Aguilar Gómez, F., García Díaz, D., López, A., García-Espriu, A., and González-Haro, C.: What happens when FAIR is built in from the start? Insights from the GOYAS Project, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-16876, https://doi.org/10.5194/egusphere-egu26-16876, 2026.

EGU26-18803 | Posters on site | ESSI3.2

Operationalizing Data Fitness-for-Purpose Through Standardized Metrics, Local Uncertainty, and LLM-Extracted Quality Reasoning  

Markus Möller, Mahdi Hedayat Mahmoudi, and Paul Peschel

Making geospatial data FAIR requires more than metadata standardization - it demands transparent, structured reporting of data quality and uncertainty that allows researchers to assess fitness-for-purpose across diverse applications. Yet most FAIR implementations still treat quality as a generic metadata field, while uncertainty and fitness‑for‑purpose remain buried in narrative documentation and disciplinary tacit knowledge.

In the FAIRagro consortium, we operationalize an application‑oriented quality framework using the example of a Germany‑wide phenology time series (1 km, 1993-2022) by combining three components: (1) standardized producer‑side quality metrics (global R² and RMSE following ISO 19157‑1 for each crop, phase, and year), (2) spatially explicit local uncertainty layers, and (3) a machine‑actionable, application‑specific data quality matrix (AS‑DQM) that captures documented use contexts, validation strategies, limitations, and fitness‑for‑purpose statements from existing publications and workflow descriptions. Large Language Models (LLMs) are central to this workflow: after structure‑preserving conversion of PDFs to enriched Markdown, multimodal LLMs extract quality‑relevant concepts from text, tables, and figures, normalize them against a formal schema, and generate provenance‑linked AS‑DQM JSON profiles that can be queried and reused across applications.
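A minimal sketch of what such a provenance-linked profile could look like follows; all field names and values are illustrative assumptions, not the project's actual schema.

```python
# Illustrative sketch of a provenance-linked AS-DQM profile of the kind the
# LLM pipeline is described as emitting. Identifiers, metric values, and
# field names are assumed for demonstration only.
import json

as_dqm_profile = {
    "dataset": "germany-phenology-1km-1993-2022",   # assumed identifier
    "application_context": "regional crop-phase modelling",
    "producer_metrics": {"global_r2": 0.82, "rmse_days": 5.1},  # illustrative values
    "limitations": ["sparse station coverage in upland areas"],
    "fitness_statement": "suitable for regional-scale analyses; "
                         "not validated for field-level decisions",
    "provenance": {
        "source_document": "doi:10.xxxx/placeholder",  # publication it was extracted from
        "extraction_model": "multimodal-llm",          # assumed pipeline component
        "source_span": "Sect. 3.2, Table 2",
    },
}

print(json.dumps(as_dqm_profile, indent=2))
```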

These quality, uncertainty, and fitness profiles are then packaged as FAIR Digital Objects using interoperable containers (ARCs) for version‑controlled, reproducible workflows and RO‑CRATE standards for structured research object metadata - enabling seamless integration with research data management infrastructure and discovery systems. This approach ensures that quality reasoning, local uncertainty estimates, and application contexts travel together with phenology data through the research lifecycle, preserving provenance and enabling automated quality‑aware dataset selection.
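For orientation, a minimal ro-crate-metadata.json descriptor of the kind used for such packaging might look as follows; identifiers and file names are placeholders, and the real crates would additionally describe the ARC layout.

```python
# Minimal RO-Crate 1.1 descriptor bundling a quality profile with a dataset.
# All identifiers and file names are hypothetical placeholders.
import json

ro_crate = {
    "@context": "https://w3id.org/ro/crate/1.1/context",
    "@graph": [
        {
            "@id": "ro-crate-metadata.json",
            "@type": "CreativeWork",
            "conformsTo": {"@id": "https://w3id.org/ro/crate/1.1"},
            "about": {"@id": "./"},
        },
        {
            "@id": "./",
            "@type": "Dataset",
            "name": "Phenology time series with AS-DQM quality profile",
            "hasPart": [{"@id": "as_dqm_profile.json"}],  # placeholder file
        },
    ],
}
print(json.dumps(ro_crate, indent=2))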

This poster represents a transferable template for domain-specific FAIR implementation, demonstrating that structured uncertainty reporting, ISO-compliant quality metrics, LLM-assisted formalization of fitness-for-purpose information, and user-centered fitness-for-purpose assessments are essential bridges between abstract FAIR principles and practical, cross-disciplinary data reuse. For application, users can query not only "where are data FAIR?" but "where are data sufficiently accurate, well‑validated, and uncertainty‑constrained for this specific decision context?". By embedding LLM‑derived quality knowledge, uncertainty products, and an application matrix into machine‑actionable FAIR Digital Objects, we move from static compliance towards dynamic, evidence‑based fitness‑for‑purpose assessment - thereby strengthening trust in public data sets.

How to cite: Möller, M., Hedayat Mahmoudi, M., and Peschel, P.: Operationalizing Data Fitness-for-Purpose Through Standardized Metrics, Local Uncertainty, and LLM-Extracted Quality Reasoning , EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-18803, https://doi.org/10.5194/egusphere-egu26-18803, 2026.

EGU26-18884 | Orals | ESSI3.2

Scaling FAIR Data Practices in Climate Modelling  

Kelsey Druken, Joshua Torrance, Romain Beucher, Martin Dix, Aidan Heerdegen, Paige Martin, Charles Turner, and Spencer Wong

Making research data Findable, Accessible, Interoperable and Reusable (FAIR) is now widely recognised as essential for open and reproducible science. In practice, however, translating FAIR principles into everyday data management remains challenging, particularly in climate modelling, which involves large data volumes and complex software and data environments on high-performance computing (HPC) platforms. Research rarely follows a simple path from data generation to publication, and FAIR is still often treated as a final, optional step rather than as a set of practices embedded and maintained throughout scientific workflows. 

We present a case study from Australia’s Climate Simulator (ACCESS-NRI) that examines how FAIR principles can be advanced through two complementary approaches applied in parallel. One focuses on the social and practical aspects of FAIR, supporting researchers to apply FAIR practices as part of their everyday research activities. The other centres on embedding FAIR directly into tools and processes, thereby reducing reliance on manual effort and helping to minimise the errors and inconsistencies that naturally arise in complex, collaborative environments. 

Through an open, merit-allocation based approach, ACCESS-NRI provides multiple data sharing pathways, from shorter-term spaces that support active development and collaboration to more curated, publication-ready datasets for longer-term access. This staged model supports the progressive application and uplift of FAIR practices as data are generated, shared, and refined over time, substantially streamlining later curation. Alongside this, we have also focused on improving the consistency and standardisation of ACCESS model outputs by embedding established community conventions and defined data specifications directly in the ACCESS software and release processes. This helps reduce variation across model outputs, supports reuse across tools and researchers, and shifts FAIR from a largely manual effort towards standard practice. 
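As an illustration of the kind of embedded output check such standardisation enables, the sketch below verifies CF-style attributes on a toy dataset; CF conventions are an assumption here, since the abstract names only "established community conventions".

```python
# Toy check that model output variables carry basic CF-style attributes.
# CF conventions are assumed for illustration; the attribute list is minimal.
import numpy as np
import xarray as xr

ds = xr.Dataset(
    {"tas": (("time",), np.arange(3.0),
             {"units": "K", "standard_name": "air_temperature"})}
)

REQUIRED = ("units", "standard_name")

def missing_attrs(ds: xr.Dataset) -> dict:
    """Map each variable to the required attributes it lacks."""
    return {
        name: [a for a in REQUIRED if a not in var.attrs]
        for name, var in ds.data_vars.items()
        if any(a not in var.attrs for a in REQUIRED)
    }

print(missing_attrs(ds) or "all variables carry the required attributes")
```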

This case study demonstrates how FAIR principles can be advanced through practical, community-aligned approaches that fit within real research contexts. For ACCESS-NRI, these efforts provide a foundation for tackling deeper FAIR data challenges, with lessons that are relevant to other Earth and environmental science domains facing similar constraints. 

How to cite: Druken, K., Torrance, J., Beucher, R., Dix, M., Heerdegen, A., Martin, P., Turner, C., and Wong, S.: Scaling FAIR Data Practices in Climate Modelling , EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-18884, https://doi.org/10.5194/egusphere-egu26-18884, 2026.

EGU26-18954 | ECS | Posters on site | ESSI3.2

EOPF Toolkit: Engaging the Sentinel community to adopt the EOPF Zarr data format 

Gisela Romero Candanedo, Julia Wagemann, Sabrina H. Szeto, Emmanuel Mathot, Felix Delattre, Ciaran Sweet, James Banting, Sharla Gelfand, and Tom Christian

The European Space Agency (ESA), through the Earth Observation Processor Framework (EOPF), is reprocessing Sentinel-1, -2, and -3 archives into the cloud-optimised format Zarr. Through the EOPF Sentinel Zarr Samples Service, Sentinel data users can get early access to sample data in the new EOPF Zarr format.

The ESA-funded EOPF Toolkit project supports users transitioning from the legacy .SAFE Sentinel format to the cloud-optimised EOPF Zarr standard. The core development is EOPF 101, a comprehensive online resource designed to help users explore EOPF Sentinel Zarr data in the cloud. Through step-by-step and hands-on tutorials, Sentinel data users learn how to effectively use EOPF Sentinel Zarr products and build Earth Observation workflows that scale.

Chapter 1 - About EOPF provides a high-level, easy-to-understand overview of the EOPF project by ESA. Chapter 2 - About EOPF Zarr provides a practical introduction to the cloud-optimised Zarr data format. It shows the benefits of the format, gives an overview of the data structure and includes performance comparisons with other formats. Chapter 3 - About Chunking provides an introduction to the chunking paradigm and lets users explore how to optimise their workflow. Chapter 4 - About EOPF STAC gives easy-to-understand practical examples on how to discover and access data with the EOPF STAC catalog. Chapter 5 - Tools to work with Zarr provides a collection of practical examples of languages, libraries and plug-ins that support users in working with data from the EOPF Samples Service. Chapter 6 - EOPF in Action is a collection of hands-on, practical end-to-end workflows featuring the use of EOPF Zarr data in different application areas.
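To make the chunking paradigm concrete, the sketch below writes and lazily reads a small local Zarr store standing in for an EOPF sample product; the variable name, grid size, and chunk sizes are toy assumptions.

```python
# Toy demonstration of lazy, chunk-aware Zarr access with xarray.
# A small local store stands in for an EOPF sample product; "b04" is an
# assumed variable name, not a confirmed EOPF group layout.
import numpy as np
import xarray as xr

src = xr.Dataset({"b04": (("y", "x"), np.random.rand(512, 512))})
src.chunk({"y": 256, "x": 256}).to_zarr("sample.zarr", mode="w")

ds = xr.open_zarr("sample.zarr")   # lazy: reads metadata only at this point
print(ds["b04"].chunks)            # chunk layout drives how much data one read touches
subset = ds["b04"].isel(x=slice(0, 256), y=slice(0, 256)).mean().compute()
print(float(subset))               # only the touched chunk is actually loaded
```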

Besides EOPF 101, the project included additional community engagement activities, such as a notebook competition and a collaboration with Champion Users. The notebook competition took place between October 2025 and January 2026. During this period, the Sentinel data community was invited to try out the new EOPF Zarr data format themselves and share their workflows in the form of Jupyter Notebooks. The project further engaged with five organisations (Champion Users) to develop end-to-end workflows in different application domains.

The EOPF Toolkit bridges the gap between data provision and practical application through three pillars of engagement: structured learning, expert guidance, and competitive innovation. While EOPF 101 provides the foundational roadmap, Champion Users offer expert-level insights, and the notebook competition builds a library of community-sourced examples. Together, these initiatives create a feedback loop that transforms new adopters into active contributors, reducing the time-to-insight for the EOPF Zarr data format.

In this presentation, we will provide an overview of the community resources developed under the EOPF Toolkit and will share lessons learned from the community engagement activities.

How to cite: Romero Candanedo, G., Wagemann, J., H. Szeto, S., Mathot, E., Delattre, F., Sweet, C., Banting, J., Gelfand, S., and Christian, T.: EOPF Toolkit: Engaging the Sentinel community to adopt the EOPF Zarr data format, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-18954, https://doi.org/10.5194/egusphere-egu26-18954, 2026.

EGU26-19834 | Posters on site | ESSI3.2

EMODnet Chemistry and FAIR principles; evaluating and updating vocabularies 

Megan Anne French, Blakeman Samantha, Alessandra Giorgetti, Hans Mose Hansen, Marina Lipizer, Maria Eugenia Molina Jack, Gwenaelle Moncoiffe, Anna Osypchuk, and Matteo Vinci

The European Marine Observation and Data Network (EMODnet) was established in 2009 and is the proposed in situ marine data service of the European Commission (EC) Directorate-General for Maritime Affairs and Fisheries (DG MARE). EMODnet represents a network of organisations providing free access to European marine data available as interoperable data layers and data products for seven themes: Bathymetry, Geology, Physics, Chemistry, Biology, Seabed habitats, and Human activities. EMODnet Chemistry makes aggregated data collections and products available for contaminants, eutrophication, and marine litter following the Findable, Accessible, Interoperable, and Reusable (FAIR) principles (Wilkinson et al., 2016); for instance, the use of standardised vocabularies supports findability, interoperability, and reuse. EMODnet Chemistry uses the standardised, hierarchically mapped vocabularies of the Natural Environment Research Council (NERC) Vocabulary Server (NVS, managed by the British Oceanographic Data Centre (BODC)) for indexing and annotating (meta)data. For example, the BODC Parameter Usage Vocabulary (P01, https://vocab.nerc.ac.uk/search_nvs/P01/) is used to describe variables by providing detailed information on the target chemical object (S27 vocabulary) or property and the matrix/medium including phase, while the SeaDataNet Parameter Discovery Vocabulary (P02, https://vocab.nerc.ac.uk/search_nvs/P02/) and EMODnet Chemistry chemical groups (P36, https://vocab.nerc.ac.uk/search_nvs/P36/) are used to group P01s. Recently, working group activities evaluated EMODnet Chemistry vocabulary issues and needs and proposed improvements; for example, deprecating and replacing the P36 for polychlorinated biphenyls with a new P36 for organohalogens. Thus, some new P36 vocabularies were created/deprecated and the names and definitions of other P36 chemical groups were revised for correctness and to ensure that lower-level vocabularies could be mapped. This work resolved numerous mapping issues for EMODnet Chemistry, allowing all chemical substances to be mapped, making more data findable and interoperable in EMODnet. It also increased alignment with the vocabularies of the International Council for the Exploration of the Sea (ICES). Overall, these efforts improve EU marine data management and support alignment with other EU frameworks.
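As an illustration of how such vocabularies can be consumed programmatically, the sketch below fetches a P02 concept from the NVS; the concept ID is a placeholder, and JSON-LD content negotiation is assumed to be supported for this route.

```python
# Sketch of a machine-actionable NVS vocabulary lookup: fetch one concept
# from the P02 discovery vocabulary and print a preferred label. The concept
# ID is a placeholder and JSON-LD content negotiation is an assumption.
import requests

url = "https://vocab.nerc.ac.uk/collection/P02/current/TEMP/"  # placeholder concept
resp = requests.get(url, headers={"Accept": "application/ld+json"}, timeout=30)
resp.raise_for_status()
doc = resp.json()

# JSON-LD layout varies; scan the graph defensively for a prefLabel-like key.
graph = doc.get("@graph", [doc]) if isinstance(doc, dict) else doc
for node in graph:
    label = node.get("skos:prefLabel") or node.get("prefLabel")
    if label:
        print(label)
```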

 

Reference

Wilkinson et al., 2016. The FAIR Guiding Principles for scientific data management and stewardship. 10.1038/sdata.2016.18

How to cite: French, M. A., Samantha, B., Giorgetti, A., Hansen, H. M., Lipizer, M., Molina Jack, M. E., Moncoiffe, G., Osypchuk, A., and Vinci, M.: EMODnet Chemistry and FAIR principles; evaluating and updating vocabularies, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-19834, https://doi.org/10.5194/egusphere-egu26-19834, 2026.

EGU26-20023 | Orals | ESSI3.2

Paving the Road to FAIR – Strategies and Considerations to activate PIDs in a large Organization 

Emanuel Soeding, Dorothee Kottmeier, Andrea Poersch, Stanislav Malinovschii, Johann Wurz, and Sören Lorenz

At the Helmholtz Association, we aim to establish a harmonized data space that connects information across distributed infrastructures. Ideally, this should work within and beyond our organization. Achieving this requires standardizing dataset descriptions using suitable metadata. A handy strategy is to use persistent identifiers (PIDs) and their metadata records to harmonize central parts of the metadata. This will ensure a first level of interoperability and machine actionability even between discipline-unrelated datasets.

While harmonizing PID metadata is a key step, practical implementation depends on a number of factors:

  • leadership to support the necessary change processes;

  • a general awareness of roles and responsibilities across the whole research organization;

  • an implementation plan that prioritizes tasks, identifies the right people and interfaces, and specifies the tools and services required to record metadata;

  • an implementation group comprising people with the relevant expertise to implement and communicate the change process;

  • informational material and training to onboard those affected by the change;

  • an organization's management that supports the upcoming change; and

  • funding to overcome the initial obstacles and get everything up and running.

For example, ORCID identifies research contributors. While often associated with publishing scientists, other contributors—such as technicians, data managers, and administrative staff—also play vital roles. Their contributions are often overlooked or not systematically recorded. To change this, PID workflows should begin early, ideally at the hiring stage, to ensure people's roles are captured and linked to datasets.

Similarly, the PIDINST system—developed by an RDA working group—provides unique identifiers for scientific instruments. It includes a simple schema for recording key metadata about instruments, enabling the reliable identification of measurements made with specific devices. Here, workflows should begin with instrument acquisition and include responsibilities for updating metadata, typically assigned to technicians.
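A minimal instrument record in the spirit of the PIDINST schema might look like the sketch below; the field names are paraphrased, all values are hypothetical, and the authoritative property list should be taken from the RDA schema itself.

```python
# Illustrative instrument metadata record in the spirit of PIDINST.
# Field names are paraphrased, not the schema's exact properties, and every
# value (handle, ORCID, names) is a hypothetical placeholder.
instrument_record = {
    "identifier": "https://hdl.handle.net/21.xxxx/instrument-example",  # placeholder handle
    "name": "CTD profiler SN-0042",
    "owner": "Example Helmholtz Centre",
    "manufacturer": "Example Instruments GmbH",
    "model": "CTD-1000",
    "measured_variables": ["sea water temperature", "conductivity", "pressure"],
    # Responsibility for keeping this record current would sit with a technician:
    "responsible_technician_orcid": "https://orcid.org/0000-0000-0000-0000",  # placeholder
}
```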

In this presentation, we propose tailored PID workflows involving key stakeholder groups within Helmholtz. We outline strategies for implementing ORCID, ROR, PIDINST, IGSN, DataCite and CrossRef DOIs, and assign responsibilities for metadata curation. Our goal is to embed PID usage in day-to-day research processes across all centers of our organization and to clarify stakeholder roles, thereby strengthening the quality and interoperability of our metadata.

How to cite: Soeding, E., Kottmeier, D., Poersch, A., Malinovschii, S., Wurz, J., and Lorenz, S.: Paving the Road to FAIR – Strategies and Considerations to activate PIDs in a large Organization, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-20023, https://doi.org/10.5194/egusphere-egu26-20023, 2026.

EGU26-20356 | Posters on site | ESSI3.2

The role of domain repositories in sustaining high-quality data publications: researcher-oriented tools and strategies under limited resources  

Kirsten Elger, Alexander Brauser, Holger Ehrmann, Ali Mohammed, and Melanie Lorenz

In the geosciences, most research results are supported by data. These data are measured, collected, generated or compiled by humans or machines (including numerical modelling), and they represent an increasingly important part of the research outcome. They should be made openly available and shared in a reusable format wherever possible, while fully acknowledging the contributions of the individual researchers and institutions that collected or generated the data.

Research data repositories are permanent archives that provide access to data, to metadata for related physical samples, and to scientific software. An increasing number of repositories are assigning digital object identifiers (DOIs) to the data stored in their archives. The range of services offered extends from fully self-service DOIs at large generic repositories, through institutional repositories that are open to institutional members only, to curated data publications by domain repositories specialising in data from a specific scientific field.

The involvement of skilled data curators, who are often also domain researchers, makes domain repositories the preferred destination for the publication of well-documented and reusable data. The generic metadata required for DOI registration is complemented by extensive, domain-specific metadata properties, such as information on the temporal and geospatial domains, mineral or rock names, instruments and analytical methods. Ideally, this information derives from embedded controlled vocabularies or ontologies, which increase the discoverability of the data for humans and machines. During curation, author information is also supplemented with ORCID and ROR identifiers, and the published data is digitally connected to related research articles, datasets, software, and the physical samples from which the data were obtained. However, domain repositories face challenges due to insufficient staff to uphold these high publication standards. Unfortunately, the resulting delay in processing requests directs many researchers to generic repositories offering self-service DOIs that do not provide any data curation.

To address these challenges, GFZ Data Services provides intuitive tools such as metadata editors for collecting rich metadata, data description templates with extensive explanations, and online instructions on recommended file formats. These tools enable researchers to provide high-quality metadata from the outset, thereby reducing the workload and time required for data curation.

In November 2025, GFZ Data Services launched ELMO, the fully revised and modernised version of our metadata editor. ELMO is not only a new web interface, but also contains many new features that improve the quality of metadata and the FAIRness of the data it describes, while simplifying the entry of information for researchers. For example, authors' names and institutions can be filled in automatically by entering the author's ORCID iD; affiliations can be selected from a drop-down menu linked to the Research Organization Registry (ROR); and the controlled, linked-data vocabularies already in use (e.g., GCMD and geosciML) are directly connected to the vocabulary services' APIs, thus ensuring they are always up to date.
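The sketch below shows the kind of public ORCID API lookup that such autofill can build on; it is a generic illustration, not ELMO's actual implementation.

```python
# Sketch of ORCID-based name autofill via the public ORCID API.
# Generic pattern only; ELMO's actual field mapping is not reproduced here.
import requests

def orcid_name(orcid_id: str) -> str:
    """Fetch a researcher's name from the public ORCID record."""
    url = f"https://pub.orcid.org/v3.0/{orcid_id}/person"
    resp = requests.get(url, headers={"Accept": "application/json"}, timeout=30)
    resp.raise_for_status()
    name = resp.json()["name"]
    return f'{name["given-names"]["value"]} {name["family-name"]["value"]}'

print(orcid_name("0000-0002-1825-0097"))  # ORCID's documented example iD
```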

This presentation will outline the advantages and disadvantages of domain repositories, and introduce our new metadata editor ELMO.

How to cite: Elger, K., Brauser, A., Ehrmann, H., Mohammed, A., and Lorenz, M.: The role of domain repositories in sustaining high-quality data publications: researcher-oriented tools and strategies under limited resources , EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-20356, https://doi.org/10.5194/egusphere-egu26-20356, 2026.

Making research data Findable, Accessible, Interoperable, and Reusable (FAIR) is widely recognised as essential for open and reproducible science. However, researchers often face a gap between FAIR-compliant datasets and data that are actually fit for specific scientific or operational applications. This gap arises because data quality is inherently application-dependent, while critical assumptions, limitations, and uncertainty characteristics are frequently documented only implicitly across publications, dataset metadata, and workflow descriptions. 

We present a document-driven, application-oriented approach to data quality assessment developed within the FAIRagro initiative. The method uses the Application-Specific Data Quality Matrix (AS-DQM), which systematically captures reasoning linking documented data characteristics—such as spatial and temporal resolution, validation strategies, and known limitations—to application requirements and explicit fitness-for-purpose statements (FAIRagro resources: https://zenodo.org/records/17981173). Rather than computing new quality metrics, the AS-DQM formalizes existing knowledge already generated by research communities, reduces barriers to adoption, and supports responsible data reuse.

The approach is illustrated using a Germany-wide phenology time series as a pilot example. By analysing dataset documentation together with concrete phenology-based scientific studies, the AS-DQM demonstrates how application-specific quality requirements—such as acceptable temporal uncertainty, spatial aggregation assumptions, and suitability for regional-scale analyses—can be systematically extracted and made explicit. Comparing the resulting application-level quality profile with the dataset-level documentation shows how fitness-for-purpose emerges from the interaction between data characteristics and application context, highlighting cases where datasets are conditionally suitable or explicitly unsuitable for specific analyses.

We discuss strengths, limitations, and adoption challenges of document-driven, application-oriented data quality reasoning, emphasizing its broad relevance across Earth and environmental sciences and its role in fostering sustainable, community-driven FAIR data practices.

How to cite: Hedayat Mahmoudi, M. and Möller, M.: From FAIR Principles to Fitness-for-Purpose: Document-Driven, Application-Oriented Data Quality in Agrosystem Research, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-21042, https://doi.org/10.5194/egusphere-egu26-21042, 2026.

EGU26-21662 | Posters on site | ESSI3.2

Integration of TERENO into the DataHub Digital Ecosystem 

Ralf Kunkel, Marc Hanisch, Christof Lorenz, Ulrich Loup, David Schäfer, Thomas Schnicke, and Jürgen Sorg

In Earth sciences, there is an increasing demand for long-term observation data related to the hydrosphere, pedosphere, biosphere, and lower atmosphere across multiple spatial and temporal scales. In parallel, standardized methods have been developed to manage these data and to make them findable, accessible, interoperable, and reusable (FAIR). Numerous centralized or distributed data infrastructures (thematic silos) exist, often with similar architectures but with a diversity of access methods, vocabularies for description, and frameworks for handling data and data flows.

DataHub is an initiative of the German Helmholtz Research Field Earth and Environment (E&U) with the aim of developing and operating a scalable, FAIR, and distributed digital research infrastructure to link research data from all compartments of the Earth system. By coordinating vocabularies, persistent identifiers (PIDs), and a common nomenclature across centres, DataHub ensures interoperability with national and international systems. The goal is the transition from isolated silos to interdisciplinary infrastructures. This is achieved by creating a community-driven digital research data ecosystem characterized by collaborative software development; the provision and use of products under a common open-source license model; a harmonized architecture of data management systems; connectivity of data via standardized interfaces (e.g., OGC STA, CSW, WMS); and, most importantly, the harmonization of data descriptions and data flows. As a first step, existing data infrastructures are integrated into the jointly developed DataHub environment.
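As an example of this standardized access layer, the sketch below queries a hypothetical OGC SensorThings API (STA) endpoint; only the base URL is invented, while the query syntax is standard STA.

```python
# Sketch of standardized data access via the OGC SensorThings API (STA):
# list a few datastreams and their units. The base URL is a hypothetical
# stand-in for a DataHub/TERENO endpoint; the query syntax is standard STA.
import requests

base = "https://example.org/sta/v1.1"  # hypothetical STA endpoint
resp = requests.get(
    f"{base}/Datastreams",
    params={"$select": "name,unitOfMeasurement", "$top": "5"},
    timeout=30,
)
resp.raise_for_status()
for ds in resp.json()["value"]:
    print(ds["name"], ds["unitOfMeasurement"].get("symbol"))
```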

TERENO (TERrestrial ENvironmental Observatories) is used as a reference implementation for the integration of an existing distributed data infrastructure into DataHub. TERENO is an interdisciplinary, long-term research program involving five centres of the German Helmholtz Association (FZJ, GFZ, UFZ, KIT, DLR). Running since 2008, it comprises an Earth observation network across Germany and provides long-term environmental data at multiple spatial and temporal scales to study the long-term impacts of land-use and climate change. The network offers more than 3.3 billion observations from over 900 sites.

During the last decade, several drawbacks have been identified in the operation of TERENO, such as inhomogeneities in metadata describing measurement instrumentation and the observed data themselves. Moreover, different data quality routines and assessment schemes are applied.

How to cite: Kunkel, R., Hanisch, M., Lorenz, C., Loup, U., Schäfer, D., Schnicke, T., and Sorg, J.: Integration of TERENO into the DataHub Digital Ecosystem, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-21662, https://doi.org/10.5194/egusphere-egu26-21662, 2026.

EGU26-21754 | Orals | ESSI3.2

Collaborative Governance Solutions for NASA Science Mission Directorate (SMD) Data, Information, and Software 

Kaylin Bugbee, Deborah Smith, Emma Koontz, Rhea Bridgeland, Emily Foshee, Jaclyn Stursma, Dhanur Sharma, Rishab Dey, and Fred Kepner

Effective data governance requires a collective approach rather than isolated efforts. To achieve this, the NASA Science Mission Directorate (SMD) governance team—part of the Data and Analysis Services Project (DASP)—is implementing a strategy to support the Chief Science Data Officer’s vision for interdisciplinary, interoperable open science. The DASP governance team focuses on several key functions. First, the governance team has developed a framework to create governance and guidance for the data, information, and software used across the SMD community to ensure compliance with agency and government policies. The current governance model employs a rapid-response approach, using focused initiatives to identify high-priority needs and develop practical solutions. Second, the DASP governance team works to streamline operations and reduce friction for scientists and data stewards by utilizing automation and targeted training. Third, the DASP governance team is building a robust community of data repositories to empower open science and foster collaboration between divisions. To enhance these efforts, DASP has launched a centralized online hub designed to strengthen connections between SMD data stewards. This centralized platform allows for governance initiative reviews, community updates and sharing of relevant resources. This presentation will share the high-level SMD governance process, the development of the centralized governance community platform, and lessons learned from the first initiatives developed via the governance process. 

How to cite: Bugbee, K., Smith, D., Koontz, E., Bridgeland, R., Foshee, E., Stursma, J., Sharma, D., Dey, R., and Kepner, F.: Collaborative Governance Solutions for NASA Science Mission Directorate (SMD) Data, Information, and Software, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-21754, https://doi.org/10.5194/egusphere-egu26-21754, 2026.

Making marine geospatial data Findable, Accessible, Interoperable, and Reusable (FAIR) remains challenging for researchers and policy implementors, particularly in integrating geological and biological datasets for Special Areas of Conservation (SACs) management. This contribution shares experiences developing domain-specific FAIR workflows for west coast Ireland SACs (Porcupine Seabight, Belgica Mound, Inisheer Island), harmonizing INFOMAR multibeam data, EMODnet Geology, OBIS biodiversity, and Copernicus currents via the European Digital Twin Ocean (EDITO) and Destination Earth (DestinE) platforms (and others).

Seabed integrity metrics (e.g., Bedrock Suitability Index (BSI) information) and substrate maps (85% accuracy, Random Forest classification) will be processed on available platforms, e.g., EDITO and DestinE HPC, after quality control for the best possible, valid geometries and INSPIRE compliance. Biodiversity connectivity matrices (previously published work and code from the coastalNet R package will be cited and explored) and pairwise probabilities (e.g., 0.35 Belgica-to-Porcupine) will overlay oceanographic simulations (e.g., ESRI EMUs), deposited as interoperable WMS layers on Figshare DOIs with plain-language metadata and APIs.
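For readers unfamiliar with the classification step, the sketch below trains a Random Forest substrate classifier on synthetic stand-in features; the 85% figure above refers to the authors' maps, not to this toy example.

```python
# Toy Random Forest substrate classification on synthetic features.
# Feature and class meanings are assumptions for demonstration only.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
# Synthetic stand-ins for multibeam-derived predictors
# (e.g., bathymetry, backscatter, slope):
X = rng.normal(size=(500, 3))
y = rng.integers(0, 3, size=500)  # three substrate classes (e.g., rock, sand, mud)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
clf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_train, y_train)
print("toy accuracy:", accuracy_score(y_test, clf.predict(X_test)))
```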

Specific challenges include integrating "dark" datasets and bridging technical-policy gaps; solutions involve AI-driven summarization, automated versioning, and user-centric pilots (e.g., co-design workshops, tracking download rates and policy citations). Additional challenges include alignment with MSFD thresholds (>25% degraded seabeds) and OSPAR goals to foster adoption, with sensitivity analyses (a low BSI reduces connectivity by 20-40%) potentially useful for informing trawling vignettes and conservation and restoration efforts (e.g., reefs where BSI > 0.7).

This approach respects ocean science needs while promoting cross-disciplinary understanding and reuse (e.g., hydrology via sediment mobility), demonstrating cultural shifts through stakeholder panels and GDPR-compliant training toolkits. Outcomes advance RDA ESES goals by scaling FAIR practices for real-time AI dashboards, inviting dialogue on community-driven refinement.

How to cite: Auerbach, J. and Crowley, Q.: FAIR Marine Data Workflows for Policy: Unifying Seabed Integrity and Connectivity in Irish SACs via EDITO and DestinE, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-21811, https://doi.org/10.5194/egusphere-egu26-21811, 2026.

EGU26-22121 | Posters on site | ESSI3.2

Translating FAIR Principles into Practice: Lessons from Four Decades of Cryospheric Data Stewardship 

Donna Scott, Siri Jodha Singh Khalsa, Shannon Leslie, Amanda Leon, Amy Steiker, and Ann Windnagel

Applying the FAIR (Findable, Accessible, Interoperable, and Reusable) principles to enable open and reproducible science is now a core goal across research communities. Yet, for well-established data centers and specialized domains, translating these principles into everyday, sustainable practice remains a significant challenge. Using the National Snow and Ice Data Center (NSIDC) as a case study—founded in 1976 as the World Data Center for Glaciology—we examine how legacy data holdings, evolving research practices, and emerging standards converge in the pursuit of FAIR-aligned stewardship.

This presentation highlights both progress and hurdles in modernizing four decades of passive microwave snow and ice data records from SMMR, SSM/I, and SSMIS sensors managed by the NSIDC Distributed Active Archive Center (DAAC) and NOAA@NSIDC data programs. Many of these data products predate mature standards for metadata, provenance, and interoperability, and were originally distributed in basic binary formats with limited documentation and access options. We describe efforts to migrate these legacy products to self-describing formats, enhance provenance, improve transparency, broaden accessibility and services, and align repository operations with contemporary expectations for FAIR and Open Science.
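The sketch below illustrates the kind of migration described, wrapping a flat array (standing in for a legacy binary swath) in a self-describing, CF-style NetCDF file; the grid size, variable names, and attribute values are assumptions for illustration.

```python
# Sketch of migrating a flat binary product to a self-describing format.
# The synthetic array stands in for data that would really come from, e.g.,
# np.fromfile("legacy_swath.bin", dtype=">i2").reshape(448, 304).
# Variable names and attributes are illustrative assumptions.
import numpy as np
import xarray as xr

raw = np.random.default_rng(0).integers(1500, 3000, size=(448, 304)).astype("int16")

da = xr.DataArray(
    raw,
    dims=("y", "x"),
    attrs={"long_name": "brightness temperature", "units": "K",
           "scale_factor": 0.1},
)
ds = xr.Dataset(
    {"tb": da},
    attrs={"source": "SSM/I legacy binary (illustrative)",
           "Conventions": "CF-1.8"},
)
ds.to_netcdf("migrated_swath.nc")  # metadata now travels with the data
```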

 

Equally important are the cultural and organizational shifts needed to foster engagement among researchers, data producers, and data managers in adopting and refining best practices that serve the cryospheric community’s specific needs. We share strategies for balancing standardization with domain-specific requirements, and reflect on how lessons learned from cryospheric data stewardship may inform broader FAIR implementation across the Earth sciences. By sharing these experiences, we hope to contribute to interdisciplinary dialogue on building sustainable, community-driven data ecosystems that support open and reproducible scientific research.

How to cite: Scott, D., Khalsa, S. J. S., Leslie, S., Leon, A., Steiker, A., and Windnagel, A.: Translating FAIR Principles into Practice: Lessons from Four Decades of Cryospheric Data Stewardship, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-22121, https://doi.org/10.5194/egusphere-egu26-22121, 2026.

EGU26-22747 | Orals | ESSI3.2

Fostering cross-disciplinary dialogue and credit attribution practices through Science Explorer, a digital library that tracks impact of literature, software and data 

Anna Kelbert, Alberto Accomazzi, Edwin Henneken, Kelly Lockhart, Jennifer Bartlett, and Michael Kurtz

The NASA-funded Science Explorer (SciX) is an open, curated information discovery platform for Earth and space science providing trusted access to interdisciplinary scientific resources. Developed as an extension of the Astrophysics Data System (ADS), a cornerstone of scholarly communication in astrophysics, SciX is designed to serve a broader scientific community, with a strong focus on supporting Earth science research, applications, and societal decision-making.

At the heart of SciX is a carefully curated database, where all indexed content (literature, datasets, and software) is sourced from reputable, authoritative providers. This ensures that users engage only with credible scientific information, making SciX a trusted environment for discovery and decision support. The system integrates peer-reviewed research, preprints, conference and meeting abstracts, funded projects, mission and archival datasets, and software tools across domains, fostering connections between Earth and space sciences. This multidisciplinarity is essential for addressing complex societal challenges such as climate adaptation and disaster resilience, as well as larger research questions such as the origin of the solar system and the presence of life in the universe. The key ingredient SciX provides is a unified, precise full-text search across these curated resources. We discuss our efforts to enrich these resources with common disciplinary and cross-disciplinary controlled vocabularies to enhance findability and cross-disciplinary dialogue.

We also discuss our efforts to build a knowledge graph at SciX that connects the literature and the data and software resources, exposing the use of data and software in research and tracking the impact of these resources. In doing so, we hope to facilitate a cultural shift in the Earth and space science communities to streamline adoption of data and software citations, and to better align academic incentives with FAIR practices that have broad societal impact, such as metadata transparency, and resource accessibility and reuse.

How to cite: Kelbert, A., Accomazzi, A., Henneken, E., Lockhart, K., Bartlett, J., and Kurtz, M.: Fostering cross-disciplinary dialogue and credit attribution practices through Science Explorer, a digital library that tracks impact of literature, software and data, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-22747, https://doi.org/10.5194/egusphere-egu26-22747, 2026.

EGU26-1499 | ECS | Posters on site | ESSI3.4

Navigating legacy Earth System Model software 

Lakshmi Aparna Devulapalli

As a Research Software Engineer in the natESM project, you have the opportunity to work with a wide range of Earth System Models (ESMs) developed by the German scientific community. Many of these models, originating in the 1990s, were predominantly written in Fortran. While the broader scientific software world has since transitioned toward languages such as C/C++ and Python, the ESM community is still in the process of catching up. As a result, legacy Fortran code—often 20 years old or more—presents unique and sometimes amusing challenges when attempting to adapt or port to modern technologies.

This talk offers a humorous look at these challenges through the eyes of an RSE navigating outdated code in order to accomplish present-day tasks. Topics will include unsustainable methods of structuring software, relic configuration files used for input, ambiguous naming conventions, unused or nonfunctional code that has never been removed, version control practices that can be improved, and other long-standing programming habits that need to evolve. The session will also highlight more modern and maintainable alternatives to these practices, offering a lighthearted yet constructive perspective on bringing legacy ESM code into the future.

How to cite: Devulapalli, L. A.: Navigating legacy Earth System Model software, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-1499, https://doi.org/10.5194/egusphere-egu26-1499, 2026.

natESM is a project that brings together German resources to develop a seamless, multiscale Earth System Modelling framework that can serve multiple purposes. This system is composed of several independent and diverse software models from the community, each addressing different parts of the Earth system. Given the variety of programming languages, model sizes and software architectures involved, as well as the differing levels of experience among the responsible model developers, challenges arise in portability, performance and software quality.

A key part of the natESM approach is the technical support to model developers provided by Research Software Engineers (RSEs). Their work focuses not only on integration, portability and performance, but also on systematically improving software quality within and across model components. This talk will outline the progress made so far, highlight lessons learned from the RSE-scientist collaborations, and present our future plans for assessing and enhancing software quality. The experiences and methods developed in natESM might serve as an example for improving software sustainability in Earth System Modeling more broadly.

How to cite: Loch, W. J.: The natESM Journey for Improving Software Quality in Earth System Modelling, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-1645, https://doi.org/10.5194/egusphere-egu26-1645, 2026.

Scientific software often begins as an internal research tool developed by scientists rather than trained software engineers, resulting in limited usability, documentation, and maintainability. emiproc, a tool for processing emission inventories for atmospheric chemistry and transport models, originally followed this trajectory: it grew organically within our laboratory, offered only a command-line interface, and lacked a clear structure, extensibility, and user-oriented documentation. We recently undertook a full modernization of emiproc following best practices in scientific software development: a redesign of the code base into modular components, a consistent object-oriented Python API, automated testing with continuous integration, extensive documentation for both users and developers, and publication in the Journal of Open Source Software. The updated software now supports some of the most widely used emission inventories such as EDGAR and CAMS, and more specific ones like the City of Zurich inventory, and produces output for various transport models like ICON-ART, WRF, or GRAL. We will highlight our approaches for transforming emiproc into a sustainable and user-friendly tool and reflect on the challenges we encountered along the way. By sharing our experience, we aim both to contribute to the discussion on improving scientific software development and to learn from the approaches used by others.
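The sketch below is a schematic of the kind of modular, object-oriented API such a redesign yields; the class and method names are hypothetical illustrations, not emiproc's actual interface.

```python
# Schematic of a modular, object-oriented inventory-processing API.
# All names here are hypothetical illustrations, not emiproc's real API.
from dataclasses import dataclass

@dataclass
class Inventory:
    """An emission inventory: per-substance, per-sector gridded totals."""
    name: str
    emissions: dict  # e.g. {("CO2", "traffic"): gridded_array}

    def filter_sector(self, sector: str) -> "Inventory":
        """Return a new inventory containing only one emission sector."""
        kept = {k: v for k, v in self.emissions.items() if k[1] == sector}
        return Inventory(self.name, kept)

def export_for_model(inv: Inventory, model: str) -> None:
    """One entry point per target model (ICON-ART, WRF, GRAL, ...)."""
    print(f"writing {len(inv.emissions)} fields of {inv.name} for {model}")

inv = Inventory("demo", {("CO2", "traffic"): None, ("NOx", "industry"): None})
export_for_model(inv.filter_sector("traffic"), "ICON-ART")
```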

How to cite: Constantin, L. and Brunner, D.: Scientific Software Developement: Lessons from our Emission inventory processing software emiproc  , EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-3484, https://doi.org/10.5194/egusphere-egu26-3484, 2026.

Geochemistry π is an open-source automated machine learning Python framework. Geochemists need only provide tabulated data (e.g. an Excel spreadsheet) and select the desired options to clean data and run machine learning algorithms. The process operates in a question-and-answer format, and thus does not require that users have coding experience. Version 0.7.0 includes machine learning algorithms for regression, classification, clustering, dimension reduction and anomaly detection. After either automatic or manual parameter tuning, the automated Python framework provides users with performance and prediction results for the trained machine learning model. Based on the scikit-learn library, Geochemistry π has established a customized automated process for implementing machine learning. The Python framework enables extensibility and portability by constructing a hierarchical pipeline architecture that separates data transmission from algorithm application. The AutoML module is constructed using the Cost-Frugal Optimization and Blended Search Strategy hyperparameter search methods from FLAML (A Fast and Lightweight AutoML Library), and the model parameter optimization process is accelerated by the Ray distributed computing framework. The MLflow library is integrated into machine learning lifecycle management, which allows users to compare multiple trained models at different scales and manage the data and diagrams generated. In addition, the front-end and back-end frameworks are separated to build the web portal, which demonstrates the machine learning model and data science workflow through a user-friendly web interface. In summary, Geochemistry π provides a Python framework for users and developers to accelerate their data mining efficiency with both online and offline operation options. All source code is available on GitHub (https://github.com/ZJUEarthData/geochemistrypi), with a detailed operational manual catering to both users and developers (https://geochemistrypi.readthedocs.io/en/latest/).
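The sketch below shows the generic scikit-learn-plus-MLflow pattern that such lifecycle management builds on; it is a minimal illustration of the underlying recipe, not Geochemistry π's actual code.

```python
# Generic pattern underlying MLflow-based lifecycle management: train a
# scikit-learn model and log parameters, metrics, and the model itself for
# later comparison. Not Geochemistry π's actual implementation.
import mlflow
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=300, n_features=8, random_state=0)

with mlflow.start_run(run_name="rf-demo"):
    model = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)
    mlflow.log_param("n_estimators", 100)
    mlflow.log_metric("train_accuracy", model.score(X, y))
    mlflow.sklearn.log_model(model, "model")  # stored for later comparison
```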

How to cite: ZhangZhou, J. Z.: Geochemistry π: Machine Learning for Geochemists Who Don’t Want to Code, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-5192, https://doi.org/10.5194/egusphere-egu26-5192, 2026.

 

Advances in computing, statistics, and machine learning (ML) techniques have significantly changed research practices across disciplines. Despite Fortran’s continued importance in scientific computing and long history in data-driven prediction, its statistics and ML ecosystem remains thin. FSML (Fortran Statistics and Machine Learning) is developed to address this gap and make data-driven research with Fortran more accessible. 

The following points are considered carefully in its development, and each comes with its own challenges, solutions, and successes: 

  • Good sustainable software development practices: FSML is developed openly, conforms to language standards and paradigms, uses a consistent coding and comment style, and includes examples, tests, and documentation. A contributor’s guide ensures consistency for future contributions. 
  • Accessibility: FSML keeps the code clean and simple, avoids overengineering, and has minimal requirements. Additionally, example-rich HTML documentation and tutorials are generated automatically with the FORtran Documenter (FORD) from code, comments, and simple markdown documents. Furthermore, it is developed to support compilation with LFortran (in addition to GFortran), so it can be used interactively like popular packages for interpreted languages. 
  • Community: FSML integrates community efforts and feedback. It uses the linear algebra interfaces of Fortran’s new de-facto standard library (stdlib) and the Fortran package manager (fpm) for easy building and distribution. Its permissive licence (MIT) allows developers to integrate FSML into their projects without the restrictions often imposed by other licences. Its simplicity, documentation, contributor’s guide, and GitHub templates remove barriers for new contributors and users. 
  • Communication: FSML updates are shared through a variety of methods with different communities. This includes a journal article (https://doi.org/10.21105/joss.09058) for visibility among academic colleagues, frequently updated online documentation (https://fsml.mutz.science/), social media updates, as well as a blog and Fortran Discourse posts to keep Fortran’s new and thriving online community updated. 

Early successes of FSML’s approach and design include: 1) students with little coding experience were able to learn the language and use the library with only Fortran-lang’s tutorials and FSML’s documentation; 2) early career researchers with no prior experience in Fortran used FSML’s functions to conduct research predicting future climate extremes; 3) FSML gained a new contributor and received a pull request only days after its first publicised release. 

The development of FSML demonstrates the merits of using good and open software development practices for academic software, as well as the potential of using the new Fortran development ecosystem and building bridges to the wider (non-academic) developer community. 

How to cite: Mutz, S. G.: Developing a modern Fortran statistics and machine learning library (FSML), EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-5393, https://doi.org/10.5194/egusphere-egu26-5393, 2026.

EGU26-6222 | ECS | Posters on site | ESSI3.4

Preparing for an Operational Environment: Software Development Standards in the Integrated Greenhouse Gas Monitoring System for Germany 

Diego Jiménez de la Cuesta Otero and Andrea Kaiser-Weiss

Modern scientific projects typically rely on software, e.g., for implementing numerical models, performing data pre- and postprocessing, solving inverse problems, or assimilating observations. Consequently, the reliability and reproducibility of scientific results critically depend on software quality. Scientific results are also intended to be shared or reused, and so is the software that produces them, especially in operational settings, where traceability and maintainability are essential. Therefore, a sustainable software development strategy becomes key to a project's success. Nevertheless, software standards are often treated as a secondary concern. This can lead to difficulties when introducing new features, delays in users' projects, limited reproducibility, strained collaborations, and ultimately a lack of suitability for operational use.
 
We present the case of the German Weather Service (DWD) contributions within the Integrated Greenhouse Gas Monitoring System for Germany (ITMS). The primary objective of ITMS is the verification of greenhouse gas emissions, which imposes particularly high requirements on the results' traceability and reproducibility. Accordingly, most if not all software-based components of our system should adhere to software development standards that ensure these requirements. We provide an overview of our software development standards and their application, and discuss lessons learned that are transferable to both legacy and newly developed scientific software projects.

How to cite: Jiménez de la Cuesta Otero, D. and Kaiser-Weiss, A.: Preparing for an Operational Environment: Software Development Standards in the Integrated Greenhouse Gas Monitoring System for Germany, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-6222, https://doi.org/10.5194/egusphere-egu26-6222, 2026.

EGU26-7565 | Orals | ESSI3.4

The Modular Earth Submodel System (MESSy): lessons learned from 20+ years of continuous development 

Patrick Jöckel, Astrid Kerkweg, Kerstin Hartung, and Bastian Kern

Earth System Models (ESMs) aim at replicating the essence of the Earth Climate System in numerical simulations on high performance computing (HPC) systems. The underlying software is often rather complex, comprising several source code entities (modules and libraries, sometimes combining different programming languages), and has in many cases grown over decades. ESMs are usually structured as “multi-compartment” models, i.e. disassembled into a set of different components, each of which describes a different compartment in the Earth System, such as the atmosphere, the land surface, the ocean, the cryosphere, the biosphere, etc. Each compartment model, in turn, comprises a series of algorithms (numerical solvers, parametrizations), each of which represents a specific physical, chemical or socio-economic process. The behaviour of the “system as a whole” (i.e., the development of its state over time, its response to perturbations) is characterized by non-linear interactions and feedbacks between the different compartments and processes.

The implementation of such numerical models representing these inter-compartment and inter-process connections (i.e., the coupling) poses a challenging task for software development, in particular given the need for (scalable) continuous further development and integration of new components, aiming at keeping pace with our knowledge about the real Earth System. Common requirements for such software are maintainability, sustainability (e.g., for new HPC architectures), and resource efficiency (performance at run-time), but also development scalability.

More than twenty years ago (in 2005) we proposed the Modular Earth Submodel System (MESSy) as a potential new approach to Earth System modelling. Here, we present how we started as an “atmospheric chemistry add-on” to a specific General Circulation Model, but already with a wider range of applications in mind. We further show how we went through our 2nd development cycle, finally arriving at our current state, the MESSy Integrated Framework, which is soon to be released Open Source. Although our 4 major software design principles (which will be presented!) did not change significantly from the early stage, we had to undergo several implementation revisions to reach the current state. Despite the continuous development, MESSy was always “state-of-the-art” and “in operation”, i.e. used for scientific research. Thus, in retrospect, we present some of the milestones achieved by “pragmatic” software engineering in practice.

How to cite: Jöckel, P., Kerkweg, A., Hartung, K., and Kern, B.: The Modular Earth Submodel System (MESSy): lessons learned from 20+ years of continuous development, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-7565, https://doi.org/10.5194/egusphere-egu26-7565, 2026.

EGU26-7637 | Posters on site | ESSI3.4

Insights and tips for maintainability, robustness, usability, and reproducibility of geo-scientific models 

Konstantin Gregor, Benjamin Meyer, Joao Darela-Filho, and Anja Rammig

The complexity of geoscientific models, from pre-processing through model execution to post-processing, poses major challenges to maintainability, reproducibility, and accessibility, even when FAIR data principles are followed.

Based on a survey of the 20 dynamic global vegetation models participating in the Global Carbon Project, we present the current state of, and potential improvements in, practices of software engineering and reproducibility within the community.
We also share notable successful practices from the community that could be helpful for all geo-scientists, including
- version control
- workflow management systems
- containerization
- automated documentation
- continuous integration
- automated visualizations

These approaches enable reproducible, portable, and automated workflows, improve code reliability, and enhance access to scientific results.

We conclude with a showcase of a fully reproducible and portable workflow implemented for one model, illustrating how these practices can be implemented by other modeling communities. This example can serve as a practical resource for improving reproducibility, accessibility, and software engineering standards across the geosciences.

How to cite: Gregor, K., Meyer, B., Darela-Filho, J., and Rammig, A.: Insights and tips for maintainability, robustness, usability, and reproducibility of geo-scientific models, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-7637, https://doi.org/10.5194/egusphere-egu26-7637, 2026.

EGU26-8659 | Orals | ESSI3.4

Improving long-term maintainability of the ACCESS models while transitioning to new architectures: challenges and opportunities 

Micael J. T. Oliveira, Edward Yang, Manodeep Sinha, and Kelsey Druken

Australia’s Climate Simulator, ACCESS-NRI, is Australia’s National Research Infrastructure (NRI) for climate modelling, supporting the development and community use of the Australian Community Climate and Earth System Simulator (ACCESS). 

As the ACCESS modelling system evolves to meet user requirements, so does the basic infrastructure that underpins our ability to efficiently run the models, with HPC architectures rapidly shifting towards GPUs and new developments in Machine Learning disrupting how new models are developed and used. Under such circumstances, it is easy for scientists and software engineers to focus on more pressing matters and spend less time worrying about software maintainability. Although this type of "tactical" programming might bring benefits in the short term, long-term software maintainability and sustainability require a more strategic approach. 

Using ACCESS-NRI as a case study, this presentation argues that addressing these challenges is not about any single tool or practice, but about adopting an integrated and coordinated strategy for scientific software development. I will describe how ACCESS-NRI is tackling these challenges by bridging skills and training gaps between scientists and software engineers, adopting well-established industry standards where appropriate (e.g. CMake, Git), and embedding software engineering best practices across development workflows. Alongside these technical efforts, addressing the social challenges of collaboratively developing large, open-source software is a key part of our approach, ensuring contributors can work effectively towards shared goals. 

A concrete example is GPU porting within the ACCESS modelling system. Successfully porting code to GPUs has required close collaboration with existing code owners, careful consideration of scientific and performance constraints, and a strong emphasis on avoiding divergent code paths that are difficult to maintain. This experience highlights the importance of the social dimensions of software development: changes cannot simply be imposed, but must be developed collaboratively to balance reliability, performance, portability, and long-term sustainability. 

By reflecting on what has worked—and what has not—this talk aims to share practical lessons that are transferable to other scientific software projects as they grow beyond small research teams into widely used, community-supported systems.

How to cite: Oliveira, M. J. T., Yang, E., Sinha, M., and Druken, K.: Improving long-term maintainability of the ACCESS models while transitioning to new architectures: challenges and opportunities, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-8659, https://doi.org/10.5194/egusphere-egu26-8659, 2026.

EGU26-8712 | Orals | ESSI3.4

 Modern tools to scale the compilation, testing and deployment of scientific software  

Aidan Heerdegen, Tommy Gatti, Harshula Jayasuriya, Thomas McAdam, Johanna Basevi, and Kelsey Druken

Modern software development practices such as continuous integration (compilation, testing, and deployment) are a requirement for robust and trusted climate model development. However, this can be very challenging to achieve with climate models, which often include legacy code requiring very specific versions of scientific libraries and must run on complex HPC systems. In addition, climate models have very long support timeframes (5+ years), with a requirement for absolute bitwise reproducibility, which demands precise control and provenance of the entire software stack. 

Australia’s Climate Simulator (ACCESS-NRI) is a national research infrastructure tasked with supporting the development and use of the Australian Community Climate and Earth System Simulator (ACCESS) model suite for the research community. At ACCESS-NRI we use Spack, a build-from-source package manager targeting HPC, to create infrastructure that makes it easy to build ACCESS climate models and their supporting software stacks with full provenance and build reproducibility.  

Now the challenge for us at ACCESS-NRI, as an infrastructure supporting a wide range of user needs, is to scale this effort to multiple models, with many permutations of components and versions, without creating a very large support burden for our software engineers.  

We do this by focusing on modularity and generic workflows to achieve our desired scale efficiently. Spack's modular design has meant ACCESS-NRI has been able to create entirely generic GitHub workflows for building, testing and deploying many climate models on our target HPC, Australia’s National Computational Infrastructure (NCI), as well as run test builds on standard Linux virtual machines.  

As a result, the support burden is dramatically reduced, as the CI/CD code is centralised, maintained in one location, and reused in many places. It is also extremely simple to add CI testing for new model components with just a few lines of GitHub Actions code. 

The choice of tools allowing a focus on a modular approach and generic workflows has been validated: we currently support seven models, with nineteen discrete components, and have grown from one deployment in 2023 to eleven in 2024 and twenty-nine in 2025, as well as many thousands of pre-release test builds in the last quarter alone. This gives us confidence that we can continue to scale efficiently, without a large support burden requiring onerous resourcing that might otherwise place a technical limit on future activities. 

How to cite: Heerdegen, A., Gatti, T., Jayasuriya, H., McAdam, T., Basevi, J., and Druken, K.:  Modern tools to scale the compilation, testing and deployment of scientific software , EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-8712, https://doi.org/10.5194/egusphere-egu26-8712, 2026.

EGU26-9441 | Posters on site | ESSI3.4

The Data-to-Knowledge Package - A Framework for publishing reproducible and reusable analysis workflows in Earth System Science 

Markus Konkol, Simon Jirka, Sami Domisch, Merret Buurman, Vanessa Bremerich, and Astra Labuce

More and more funders, reviewers, and publishers ask researchers to follow Open Science principles and make their research results publicly accessible. In the case of a computational analysis workflow, this means providing access to the data and code that produced the figures, tables, and numbers reported in a paper. However, doing so, even in consideration of the FAIR Principles, does not mean others can easily reuse the materials and continue the research. It still requires effort to understand an analysis script (e.g., written in R or Python) and extract those parts of a workflow (i.e., the code snippets) that generate, for instance, a particular figure.

In this contribution, we demonstrate the concept and realization of the Data-to-Knowledge Package (D2K-Package), a collection of digital assets that facilitates the reuse of computational research results [1]. The heart of a D2K-Package is the reproducible basis, composed of the data and code underlying, for instance, a statistical analysis. Instead of simply providing access to the analysis script as a whole, the idea is to structure the code into self-contained and containerized functions, making the workflow steps more reusable. Each function follows an input-processing-output logic and fulfills a certain task such as data processing, analysis, or visualization. Creating such a reproducible basis allows inferring the following components, which are also part of the D2K-Package:

A virtual lab is a web application, for example, in the form of a JupyterLab environment provided with the help of MyBinder. Users can access it via the browser and obtain a computational environment with all dependencies and the runtime pre-installed. Creating such a virtual lab is possible since all code is containerized and the image is built based on a specification of the used libraries, runtime, and their versions. A virtual lab can help users with programming expertise to engage with the code in a ready-to-use programming environment.

A web API service exposes the encapsulated and self-contained functions such that every function has a dedicated URL endpoint. Users can send requests from their analysis script to that endpoint and obtain the results via HTTP. Hence, they can reuse the functions without copying the code snippets or struggling with dependencies. Such a service can be realized using OGC API Processes and pygeoapi.
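A minimal sketch of such a request, assuming a pygeoapi-style deployment of OGC API Processes; the endpoint URL and process identifier below are hypothetical:

```python
# Hypothetical example: the URL and process id are placeholders for a
# D2K-Package function exposed via OGC API - Processes (e.g. with pygeoapi).
import json
import urllib.request

endpoint = "https://example.org/processes/compute-figure-2/execution"
payload = {"inputs": {"dataset": "https://example.org/data/obs.csv"}}

req = urllib.request.Request(
    endpoint,
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:  # returns the function's output
    result = json.load(resp)
print(result)
```

The caller never installs the workflow's dependencies; the containerized function runs server-side and only the result travels back over HTTP.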

The computational workflow connects the functions to an executable analysis pipeline and acts as an entry point to a complex analysis. Such a workflow can help users obtain a better understanding of the functions and relevant input parameters. By using workflow tools such as the Galaxy platform, users without programming experience also get the chance to change the parameter configuration and see how the new settings affect the final output.

Besides the concepts as outlined above, this contribution will also report on real demonstrators which showcase the idea of a D2K-Package.

This project has received funding from the European Commission’s Horizon Europe Research and Innovation programme. Grant agreement No 101094434.

[1] Konkol et al. (2025): https://doi.org/10.12688/openreseurope.20221.3

How to cite: Konkol, M., Jirka, S., Domisch, S., Buurman, M., Bremerich, V., and Labuce, A.: The Data-to-Knowledge Package - A Framework for publishing reproducible and reusable analysis workflows in Earth System Science, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-9441, https://doi.org/10.5194/egusphere-egu26-9441, 2026.

EGU26-10884 | Posters on site | ESSI3.4

Teaming up as domain scientists and research software engineers for a sustainable HELIOS++ scientific software 

Dominic Kempf, Hannah Weiser, Dmitrii Kapitan, and Bernhard Höfle

The Heidelberg LiDAR Operations Simulator (HELIOS) is a scientific software for high-fidelity general-purpose virtual laser scanning (VLS) [1]. Using models for virtual scenes, scanner devices, and platforms, HELIOS makes it possible to reproduce diverse scan scenarios over various geographical environments (forests, cities, mountains) and laser scanning systems (airborne and UAV-borne, mobile, terrestrial). Used for algorithm development, data acquisition planning, and method training for supervised machine learning, HELIOS has been successfully integrated into research workflows across the international laser scanning community.

HELIOS was initially developed in a research-driven environment in Java and released as open-source software [2]. Motivated by growing interest in the scientific community, the codebase was re-implemented in C++ to improve its memory footprint, runtime performance, and functionality [3]. Since then, we have been actively developing new features. Recent additions include support for dynamic scenes [4], new deflector mechanisms, and plug-ins for other open-source software such as Blender. Considering the continually growing user community, current software development specifically prioritizes quality assurance, reliability, long-term maintainability, and user-friendliness.

Supported by the DFG under the program "Research Software - Quality Assured and Re-usable" [5], the HELIOS++ developer team partnered with the Scientific Software Center (SSC), a research software engineering service department at Heidelberg University. Combining the expertise of the domain scientists from the HELIOS team and the research software engineers (RSEs) of the SSC, we are strengthening the sustainability and usability of HELIOS. Measures presented in our talk include: improving testing strategies and continuous integration, rewriting the CMake build system, packaging HELIOS as a Conda package, creating standalone installers, introducing a new Python API, and developing new strategies for sharing and reproducing HELIOS simulations. Additionally, we will reflect on the benefits as well as key challenges in fostering fruitful collaborations between domain scientists and RSEs. To this end, we will present as a domain scientist/RSE tandem.
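As one hedged illustration of the last point, a simulation-sharing record might pair input fingerprints with the software version; the fields and file names below are invented for illustration and are not part of the actual HELIOS++ tooling:

```python
# Sketch of one possible reproducibility record for a simulation; the
# manifest fields are hypothetical, not HELIOS++'s actual format.
import hashlib
import json
import pathlib

def fingerprint(path: str) -> str:
    """SHA-256 of an input file (e.g. a survey or scene XML)."""
    return hashlib.sha256(pathlib.Path(path).read_bytes()).hexdigest()

def simulation_record(survey_xml: str, software_version: str) -> dict:
    """Bundle the inputs' hashes with the software version used to run them."""
    return {
        "survey": survey_xml,
        "survey_sha256": fingerprint(survey_xml),
        "software_version": software_version,
    }

if __name__ == "__main__":
    pathlib.Path("survey.xml").write_text("<survey/>")  # stand-in input file
    print(json.dumps(simulation_record("survey.xml", "2.x"), indent=2))
```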

References:

[1] HELIOS++: https://github.com/3dgeo-heidelberg/helios

[2] Bechtold, S., & Höfle, B. (2016): https://doi.org/10.5194/isprs-annals-III-3-161-2016

[3] Winiwarter, L et al. (2022): https://doi.org/10.1016/j.rse.2021.112772

[4] Weiser, H., & Höfle, B. (2026): https://doi.org/10.1111/2041-210x.70189

[5] Project website: https://www.geog.uni-heidelberg.de/en/3dgeo/projects-of-the-3dgeo-research-group/fostering-a-community-driven-and-sustainable-helios-scientific-software

How to cite: Kempf, D., Weiser, H., Kapitan, D., and Höfle, B.: Teaming up as domain scientists and research software engineers for a sustainable HELIOS++ scientific software, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-10884, https://doi.org/10.5194/egusphere-egu26-10884, 2026.

EGU26-12310 | Posters on site | ESSI3.4

WRF-Chem-Polar: an open, collaborative, and reproducible framework for modeling the polar atmosphere 

Jennie L. Thomas, Lucas Bastien, Ruth Price, Rémy Lapere, Ian Hough, Erfan Jahangir, Lucas Giboni, and Louis Marelle

Over the past 15 years, substantial developments have been made to adapt the regional chemistry-climate model WRF-Chem for applications in polar environments, with a main focus on the Arctic. These developments address key processes that are either absent from, or insufficiently represented in, the standard WRF-Chem distribution, particularly those controlling aerosol-cloud interactions, boundary layer chemistry, and surface-atmosphere coupling over snow, sea ice, and the polar ocean. However, until now, these advances have been distributed across multiple publications, code branches, and project-specific implementations, limiting transparency, reproducibility, and community use.

Here we present WRF-Chem-Polar, a consolidated and openly available modeling framework that integrates our polar-specific model developments into a single, traceable code base. The framework is hosted on GitHub and is structured around two tightly linked components: (i) a unified WRF-Chem-Polar model code that incorporates developments for polar aerosol and cloud processes and (ii) a dedicated infrastructure for compiling, running, and analyzing simulations.

A key objective of WRF-Chem-Polar (including the model code and infrastructure) is to enable transparent model evolution. All developments are tracked through version control, with automated test cases designed to systematically compare model behavior across code versions. This approach allows scientific changes to be evaluated quantitatively, supports regression testing, and facilitates controlled experimentation when introducing new parameterizations or process representations. The infrastructure also provides transparent workflows for simulation setup, post-processing, and diagnostics, improving reproducibility across users and platforms. Code quality, readability, and consistency are improved via coding style guides and modern software tools, including unit testing and automatic enforcement of linting rules.
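A generic sketch of such a cross-version comparison (not the project's actual test code) might look as follows, with illustrative variable names and tolerance:

```python
# Generic sketch of an automated regression check: compare a new model run
# against a reference within a relative tolerance. Variable names and the
# tolerance value are illustrative only.
import numpy as np

def regression_check(reference: dict, candidate: dict, rtol: float = 1e-6) -> list:
    """Return the names of variables whose fields drifted beyond rtol."""
    failed = []
    for name, ref in reference.items():
        if not np.allclose(candidate[name], ref, rtol=rtol):
            failed.append(name)
    return failed

ref = {"aod_550nm": np.array([0.12, 0.30]), "cdnc": np.array([45.0, 80.0])}
new = {"aod_550nm": np.array([0.12, 0.30]), "cdnc": np.array([45.0, 80.2])}
print(regression_check(ref, new))  # -> ['cdnc']
```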

By making these developments openly accessible and actively maintained, WRF-Chem-Polar lowers the barrier for the community to apply advanced polar chemistry–aerosol–cloud representations, while providing a robust framework for continued development and evaluation. This effort supports both fundamental process studies and applied research, contributes to broader open-science and FAIR modeling practices, and furthers the uptake of our work within the Earth system modeling community.

How to cite: Thomas, J. L., Bastien, L., Price, R., Lapere, R., Hough, I., Jahangir, E., Giboni, L., and Marelle, L.: WRF-Chem-Polar: an open, collaborative, and reproducible framework for modeling the polar atmosphere, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-12310, https://doi.org/10.5194/egusphere-egu26-12310, 2026.

EGU26-13932 | Orals | ESSI3.4

Software as Scientific Infrastructure: CIG’s Role in Computational Geodynamics and Lessons from Developing ASPECT 

Rene Gassmöller, Wolfgang Bangerth, Juliane Dannberg, Daniel Douglas, Menno Fraters, Anne Glerum, Timo Heister, Lorraine Hwang, Robert Myhill, John Naliboff, Arushi Saxena, and Cedric Thieulot

Modeling software is integral to computational geodynamics, enabling quantitative investigation of planetary mantle, lithosphere and core dynamics across a wide range of spatial and temporal scales. Over the past two decades, the field’s software ecosystem has shifted significantly: codes that were once developed and maintained within single research groups have increasingly evolved into large, modular packages sustained by multi-institutional and often international collaborations. One important factor in this transition has been the establishment of community organizations like the Computational Infrastructure for Geodynamics (CIG), which has provided coordination and shared capacity that individual groups typically cannot sustain on their own.
In this contribution, I highlight benefits and lessons learned from work within CIG and from the development of the geodynamic modeling software ASPECT (Advanced Solver for Planetary Evolution, Convection, and Tectonics). Community organizations can accelerate scientific software development in several ways. Shared infrastructure (project landing pages, established user forums) improves discoverability and supports software adoption by the community. Targeted support, including seed funding, helps projects invest in feature development and maintenance. By streamlining software release and distribution and promoting robust development and testing workflows, community organizations improve software quality and reliability. Training the next generation of computational geoscientists through workshops, tutorials, and user support, builds shared expertise and makes community software more sustainable. Collectively, these activities reduce duplicated effort, lower barriers to entry for new users and contributors, and create pathways for software to evolve in step with scientific and numerical-method advances.
ASPECT provides a concrete example of this community-driven model. Designed to simulate thermal convection with a primary emphasis on Earth’s mantle, it has now been used for a broad range of applications including crustal deformation, magma dynamics, and fluid flow, convection on icy satellites, deformation of the inner core, and digital twins of mineral physics experiments. This widening scope has been possible because ASPECT prioritizes usability and extensibility, to accommodate evolving model complexity, and leverages modern numerical methods such as adaptive mesh refinement and robust linear/nonlinear solvers. From the start, ASPECT has been designed for large-scale parallel simulations required for problems with small-scale features embedded in mantle-scale domains.  It also strategically builds on established external libraries (e.g., deal.II, Trilinos, p4est) rather than re-implementing core algorithms. ASPECT’s success has been enabled by a well-tested framework, extensive documentation, a plugin architecture that simplifies customization, and active encouragement of community contributions through support and recognition. Together, these elements illustrate how organizational infrastructure and software design choices support long-term development and continued methodological innovation in geodynamic modeling, enabling robust simulations that address increasingly complex scientific questions.

How to cite: Gassmöller, R., Bangerth, W., Dannberg, J., Douglas, D., Fraters, M., Glerum, A., Heister, T., Hwang, L., Myhill, R., Naliboff, J., Saxena, A., and Thieulot, C.: Software as Scientific Infrastructure: CIG’s Role in Computational Geodynamics and Lessons from Developing ASPECT, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-13932, https://doi.org/10.5194/egusphere-egu26-13932, 2026.

Process‑based models that explicitly couple soil water and heat transport, canopy radiative transfer, photosynthesis, and surface–atmosphere exchange are increasingly used to connect in‑situ observations with remote‑sensing–relevant land‑surface processes. However, their practical adoption—particularly in heterogeneous urban environments—remains challenging due to complex software dependencies, fragmented preprocessing pipelines, and limited transparency in model configuration. These challenges are exacerbated when such models are accessed through low‑level implementations that are difficult to adapt, reproduce, or extend by domain scientists.

We present rSTEMMUS‑SCOPE, an open‑source R interface to the coupled STEMMUS‑SCOPE modelling framework, designed to apply good practices in scientific software development to a hybrid soil–canopy model that is frequently used by practitioners and researchers interested in ecohydrology, urban climate, and remote sensing. The interface lowers barriers to reproducible experimentation by providing a modular, script‑based workflow that integrates eddy‑covariance forcing, in‑situ soil measurements, vegetation parameters, and multilayer soil discretisation within a transparent R‑based environment supporting everything from data pre‑processing to the visualization of results.

From a software‑engineering perspective, rSTEMMUS‑SCOPE adopts a modular, script‑based architecture that separates data inputs, model settings, execution, and post‑processing. The package provides reproducible pipelines for preprocessing eddy‑covariance meteorological forcing, precipitation, vegetation parameters, and multilayer soil discretisation (>50 layers), enabling fully scripted end‑to‑end simulations within R. Version‑controlled configuration files, consistent function interfaces, and documented defaults are used to support transparency and extensibility, while example workflows and vignettes lower the entry barrier for users who are domain scientists rather than trained software developers. The design follows a “user‑turned‑developer” paradigm, allowing advanced users to adapt parameterisations and forcing strategies while preserving a stable core interface.

We demonstrate these design choices using an urban case study in a temperate green space in Berlin, where hourly simulations were performed for 2019–2020. Observations from an eddy‑covariance tower and in‑situ soil moisture sensors are used as a software stress test rather than as the primary scientific result. Volumetric soil water content at 60 cm depth was reproduced well (Kling–Gupta Efficiency = 0.82; r = 0.88; α = 1.01), while simulated evapotranspiration captured diurnal and seasonal dynamics (r ≈ 0.67), with systematic biases during low‑energy conditions. Sensitivity experiments illustrate how differences in input data sources and parameter choices propagate through the modelling workflow, highlighting the importance of transparent, reproducible pipelines for diagnosing model behaviour.
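For reference, the Kling–Gupta Efficiency reported above combines correlation, a variability ratio, and a bias ratio (Gupta et al., 2009); a minimal sketch with placeholder data:

```python
# Standard Kling-Gupta Efficiency (Gupta et al., 2009); the sample arrays
# are placeholders, not the study's data.
import numpy as np

def kge(sim: np.ndarray, obs: np.ndarray) -> float:
    r = np.corrcoef(sim, obs)[0, 1]      # linear correlation
    alpha = np.std(sim) / np.std(obs)    # variability ratio
    beta = np.mean(sim) / np.mean(obs)   # bias ratio
    return 1.0 - np.sqrt((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2)

obs = np.array([0.21, 0.24, 0.26, 0.25, 0.22])  # volumetric soil water content
sim = np.array([0.20, 0.25, 0.27, 0.24, 0.22])
print(round(kge(sim, obs), 3))
```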

We conclude by discussing practical lessons learned in wrapping complex process‑based models in high‑level languages: trade‑offs between modularity and performance, documenting urban‑specific parameter choices without constraining expert use, and testing strategies when upstream physics models are computationally expensive. rSTEMMUS‑SCOPE demonstrates how applying robust software practices enables meaningful, reproducible results and supports early‑career researchers working at the interface of modelling, data, and urban environmental science.

Software availability

rSTEMMUS‑SCOPE (open source): https://github.com/EcoExtreML/rSTEMMUS_SCOPE

How to cite: Duarte Rocha, A. and Aljoumani, B.: rSTEMMUS‑SCOPE: a user‑friendly open‑source R package wrapping a coupled soil–canopy process-based model for urban soil‑moisture and ET — good practices and lessons learned, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-15058, https://doi.org/10.5194/egusphere-egu26-15058, 2026.

EGU26-16877 | Orals | ESSI3.4

Beyond Good Practices: Designing Scientific Software for Contribution and Reuse 

Eric Hutton, Gregory Tucker, Mark Piper, and Tian Gan

Lowering the barrier to scientific contribution requires more than adopting good software practices; it requires software structures and standards that make contribution and reuse safe, scoped, and sustainable. We describe how the Community Surface Dynamics Modeling System (CSDMS) addresses these challenges through two complementary efforts: the Landlab modeling framework and the Basic Model Interface (BMI).

Landlab is a Python package designed as a platform for building Earth-surface process models. Over time, we discovered its architecture also promoted the user-turned-developer pathway, which has been critical to its success. While good software practices such as automated testing, continuous integration, documentation, and linting provide a foundation of reliability, Landlab’s component-based architecture has been central to enabling contribution. This design offers contributors clearly scoped and isolated entry points for adding new process models without needing to understand or modify the entire codebase. By enabling contributions from a growing set of domain experts and supporting them through shared maintenance infrastructure, this model expands the pool of invested contributors and reduces reliance on a small number of core developers, strengthening the prospects for long-term project sustainability.
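The component pattern can be illustrated with a minimal example in the spirit of the Landlab documentation; exact argument names may vary between Landlab versions:

```python
# Minimal illustration of Landlab's component pattern (cf. the Landlab
# documentation); exact arguments may differ between versions.
import numpy as np
from landlab import RasterModelGrid
from landlab.components import LinearDiffuser

grid = RasterModelGrid((4, 5), xy_spacing=10.0)          # 4 x 5 node grid
z = grid.add_zeros("topographic__elevation", at="node")  # shared grid field
z += np.random.rand(grid.number_of_nodes)                # noisy initial surface

# A contributor adds a process as one self-contained component like this one,
# without touching the rest of the framework.
diffuser = LinearDiffuser(grid, linear_diffusivity=0.01)
for _ in range(10):
    diffuser.run_one_step(100.0)  # every component exposes run_one_step(dt)
print(z.mean())
```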

The Basic Model Interface (BMI) complements this approach by providing a lightweight, language-agnostic interface standard that defines how models expose their variables, parameters, and time-stepping controls to the outside world. By separating scientific algorithms from model orchestration, BMI enables models to be reused, coupled, and tested across different frameworks without requiring changes to their internal implementations. Ongoing, community-guided work toward BMI 3.0 aims to extend these capabilities by improving support for parallel execution, clearer state management, and optional interface extensions.
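The control pattern that BMI standardizes looks roughly as follows; the sketch assumes the BmiHeat toy model from CSDMS's bmi-example-python repository, and the config file name is illustrative:

```python
# Sketch of the BMI control pattern: any BMI-compliant model exposes the
# same initialize / update / get_value / finalize calls, so the driver
# below needs no knowledge of the model's internals.
import numpy as np
from heat import BmiHeat  # toy model from csdms/bmi-example-python (assumed installed)

model = BmiHeat()
model.initialize("heat.yaml")  # config file name is illustrative

# Size a destination buffer from the variable's grid metadata.
grid_id = model.get_var_grid("plate_surface__temperature")
dest = np.empty(model.get_grid_size(grid_id))

while model.get_current_time() < model.get_end_time():
    model.update()  # advance one model time step

model.get_value("plate_surface__temperature", dest)  # copy state out
model.finalize()
print(dest.mean())
```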

Together, Landlab and BMI illustrate how framework design and community-driven standards can reduce technical debt and enable researchers to contribute reusable and interoperable software without requiring them to become full-time software engineers.

How to cite: Hutton, E., Tucker, G., Piper, M., and Gan, T.: Beyond Good Practices: Designing Scientific Software for Contribution and Reuse, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-16877, https://doi.org/10.5194/egusphere-egu26-16877, 2026.

EGU26-17128 | Posters on site | ESSI3.4

A modularized workflow for processing heterogeneous agricultural land use data 

Antonia Degen, Yi-Chen Pao, and Andrea Ackermann

In Germany, each federal state is required to collect information on funding, farming practices, and land use through the Integrated Administration and Control System (IACS) (Deutscher Bundestag 2014).

Based on the land parcel identification system (LPIS), one of the core elements of IACS (European Commission, 2025), georeferenced data along with ancillary data have been collected annually since 2005. Mandatory requirements for checks and on-site validations ensure high data quality, which makes IACS data very suitable for research purposes (Leonhardt et al. 2024). Our goal is to create a nation-wide time series based on IACS data that contains detailed information on land use, animal husbandry, and farm statistics and can be used for comprehensive land use, soil, agricultural-policy, and biodiversity research. Despite this, IACS data remain underused for scientific research due to the following challenges:

  • Data protection: Obtaining and handling IACS data requires a legal agreement between the research project and the respective federal state, including Data Usage Agreements.
  • Data heterogeneity: All federal states have unique data processing workflows and historical changes in processing practices, resulting in different data types, formats, structures, keys, encodings, etc.
  • Data volume: Large storage volumes, processing capacities, and back-up systems with high security levels are required. Efficiency and data minimization are important principles in the design of the processing workflows.

 

In this contribution, we, as user-turned-developers, show how we use our toolbox of open-source software (Linux, Bash, R, PostgreSQL/PostGIS, Python, GitLab) to build a modularized workflow that meets these challenges.

The first module is tailored to pre-process the data according to the specific characteristics of each federal state. Modules two and three contain more general functions that ensure machine readability. All data are then processed in a data cleaning workflow and imported into our PostgreSQL/PostGIS database.
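Conceptually, the modules chain as in the sketch below; the function names are invented for illustration, and the production workflow is built from the R/Bash/SQL toolbox described above rather than from this Python stub:

```python
# Conceptual sketch of the modular chaining described above; function names
# are illustrative, not the project's actual code.
def preprocess_state(raw, state):
    """Module 1: state-specific fixes to the raw delivery."""
    return {"state": state, "records": raw}

def ensure_machine_readable(data):
    """Modules 2-3: normalize encodings, formats, and keys."""
    return data

def clean(data):
    """Data cleaning workflow before database import."""
    return data

def run_pipeline(raw, state):
    """Chain the modules ahead of import into PostgreSQL/PostGIS."""
    return clean(ensure_machine_readable(preprocess_state(raw, state)))

print(run_pipeline(["parcel_1", "parcel_2"], "Lower Saxony"))
```

Because each module only consumes the previous module's output, state-specific quirks stay isolated in module one while the downstream steps remain generic.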

We use our database for data harmonization by implementing modularized functions to handle different use cases.

The resulting harmonized datasets are provided to research teams with data protection clearance for the respective federal state and year. Harmonized tables are versioned as releases, both to ensure reproducibility and to provide necessary updates.

Figure 1: Modularized workflow for IACS data processing towards a nation-wide harmonized time series

Reproducibility is ensured by script-based procedures that are stored and versioned in GitLab, as well as by extensive code documentation and automated file-based processing documentation.

Our modularization process lays the foundation for sustainable handling of complex administrative agricultural data and is a first step towards a software development approach.

Literature

European Commission (2025): Integrated Administration and Control System (IACS). Online available: https://agriculture.ec.europa.eu/common-agricultural-policy/financing-cap/assurance-and-audit/managing-payments_en

Deutscher Bundestag (2014): Gesetz über die Verarbeitung von Daten im Rahmen des Integrierten Verwaltungs- und Kontrollsystems nach den unionsrechtlichen Vorschriften für Agrarzahlungen. InVeKoS-Daten-Gesetz (InVeKoSDG), vom 5 (2019). Online available: https://www.gesetze-im-internet.de/invekosdg_2015/

Leonhardt, H., Wesemeyer, M., Eder, A., Hüttel, S., Lakes, T., Schaak, H., Seifert, S., and Wolff, S. (2024): Use cases and scientific potential of land use data from the EU’s Integrated Administration and Control System: A systematic mapping review, Ecological Indicators, Volume 167, ISSN 1470-160X, https://doi.org/10.1016/j.ecolind.2024.112709.

How to cite: Degen, A., Pao, Y.-C., and Ackermann, A.: A modularized workflow for processing heterogeneous agricultural land use data, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-17128, https://doi.org/10.5194/egusphere-egu26-17128, 2026.

EGU26-17569 | Orals | ESSI3.4

Latest Developments in Probtest: Probabilistic Testing for Robust CPU/GPU Validation of Scientific Models 

Annika Lauber, Chiara Ghielmini, Daniel Hupp, and Claire Merker

Porting large numerical models to heterogeneous computing architectures introduces significant challenges for software validation and testing, as results from CPU- and GPU-based executions are typically not bit-identical. These differences arise from variations in floating-point arithmetic, execution order, and the use of architecture-specific mathematical libraries. Traditional regression testing approaches based on exact reproducibility therefore become inadequate, particularly in continuous integration (CI) workflows.

Probtest is a lightweight testing framework developed to address this problem in the ICON numerical weather and climate model. It implements a probabilistic, tolerance-based testing strategy that enables robust numerical consistency checks between CPU and GPU runs while remaining fast and resource-efficient. Tolerances are derived from ensembles generated by perturbing prognostic variables in the initial conditions. From a larger ensemble of CPU reference runs, a representative subset is selected to compute variable-specific tolerance ranges that define acceptable numerical deviations. This approach allows reliable validation across architectures without constraining model development or optimization.
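The following schematic (not Probtest's actual implementation) illustrates the idea of deriving per-variable tolerances from a perturbed ensemble and applying them to CPU/GPU differences:

```python
# Schematic of the tolerance strategy described above, not Probtest's code:
# per-variable tolerances come from the spread of a perturbed CPU ensemble
# and are then applied to CPU-vs-GPU differences.
import numpy as np

rng = np.random.default_rng(0)
ensemble = {"ta": rng.normal(280.0, 0.05, size=(10, 100))}  # 10 perturbed runs

# Tolerance per variable: maximum spread across the ensemble members.
tolerances = {v: np.abs(m - m.mean(axis=0)).max() for v, m in ensemble.items()}

cpu = {"ta": ensemble["ta"][0]}
gpu = {"ta": ensemble["ta"][0] + rng.normal(0.0, 0.01, size=100)}  # mock GPU run

for var, tol in tolerances.items():
    diff = np.abs(cpu[var] - gpu[var]).max()
    print(var, "PASS" if diff <= tol else "FAIL", f"(diff={diff:.3g}, tol={tol:.3g})")
```

The key property is that differences within the ensemble's natural spread pass, while deviations larger than initial-condition uncertainty fail, without ever requiring bit-identical results.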

Recent developments focus on improving extensibility, usability, and reproducibility. Support for Feedback Output Files (FOF) has been added, enabling consistency checks for observation-based diagnostics in addition to model state variables. Furthermore, Probtest has been fully containerized, with each release published on Docker Hub. This removes local installation barriers, ensures reproducible testing environments, and simplifies integration into CI pipelines and collaborative development workflows. These developments strengthen Probtest as a practical and portable tool for validating ICON across heterogeneous computing platforms.

How to cite: Lauber, A., Ghielmini, C., Hupp, D., and Merker, C.: Latest Developments in Probtest: Probabilistic Testing for Robust CPU/GPU Validation of Scientific Models, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-17569, https://doi.org/10.5194/egusphere-egu26-17569, 2026.

EGU26-17829 | Posters on site | ESSI3.4

Evolution of the EPOS Platform Open Source 

Marco Salvi, Valerio Vinciarelli, Rossana Paciello, Daniele Bailo, Alessandro Crocetta, Kety Giuliacci, Manuela Sbarra, Alessandro Turco, Mario Malitesta, Jean-Baptiste Roquencourt, Martin Carrere, Jan Michalek, Baptiste Roy, and Christopher Card

The development of sustainable and reusable scientific software infrastructures remains a significant challenge in geosciences, particularly when transitioning from single-purpose systems to platforms intended for broader community adoption. This presentation shares experiences and lessons learned from developing the EPOS Platform as an open-source, reusable data integration and visualization system, demonstrating how intentional architectural decisions and tooling investments can transform research infrastructure software into widely adoptable solutions.

The EPOS Platform (European Plate Observing System) initially served as the technical backbone for EPOS ERIC (https://www.epos-eu.org/epos-eric), providing integrated access to solid Earth science data across ten thematic domains. Built on a choreography architecture using Docker and Kubernetes, the system successfully fulfilled its original mandate. However, as other research infrastructures expressed interest in similar capabilities, we recognized the potential for broader impact and initiated a strategic shift toward creating a genuinely reusable open-source platform.

The transition required addressing fundamental challenges in software reusability. Initially, deployment necessitated manual configuration and deep infrastructure knowledge, creating significant adoption barriers. To overcome this, we developed the epos-opensource CLI tool (https://github.com/EPOS-ERIC/epos-opensource), a command-line interface with an integrated terminal user interface (TUI) that reduces deployment from a complex manual process to a single command. This tool enables researchers and developers to deploy fully functional instances locally using either Docker Compose or Kubernetes, significantly accelerating both external adoption and internal development workflows.

We released the complete platform under GPL v3 license, ensuring that all code, including that powering the production EPOS Platform (https://www.ics-c.epos-eu.org/), remains open and community-accessible. Within EPOS ERIC, the open-source release and deployment tooling facilitate rapid provisioning of testing environments for developers and metadata contributors. Comprehensive documentation was developed using Docusaurus, following standard open-source practices to provide installation guides, system architecture references, and user tutorials. The EPOS Platform Open Source has been leveraged to enhance data sharing by multiple research initiatives, including ENVRI-Hub NEXT (https://envri.eu/envri-hub-next/), DT-GEO (https://dtgeo.eu/), IPSES (https://www.ipses-ri.it), and Geo-INQUIRE (https://www.geo-inquire.eu/), demonstrating the platform's versatility across different research contexts.

Our experience demonstrates that developing reusable scientific software requires deliberate investment beyond initial functionality. Key factors include comprehensive documentation following community standards, simplified deployment through user-friendly tooling, architectural flexibility for diverse use cases, and genuine open-source practices where production and community code remain unified. These principles, while resource-intensive, are essential for scientific software to achieve meaningful impact and contribute to a more sustainable, collaborative research infrastructure ecosystem.

This presentation will explore the evolution of the EPOS Platform Open Source, demonstrating how strategic investments in deployment tooling, comprehensive documentation, and architectural flexibility enabled the transformation from a single-purpose infrastructure to a widely adoptable community resource.

How to cite: Salvi, M., Vinciarelli, V., Paciello, R., Bailo, D., Crocetta, A., Giuliacci, K., Sbarra, M., Turco, A., Malitesta, M., Roquencourt, J.-B., Carrere, M., Michalek, J., Roy, B., and Card, C.: Evolution of the EPOS Platform Open Source, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-17829, https://doi.org/10.5194/egusphere-egu26-17829, 2026.

EGU26-20382 | Posters on site | ESSI3.4

User-turned-developer: Scientific software development for a national nutrient policy impact monitoring in Germany 

Max Eysholdt, Maximilian Zinnbauer, and Elke Brandes

Many countries in the EU fail to adequately protect their waters from nitrogen and phosphorus inputs (European Environment Agency 2024), often originating from agricultural sources (Sutton 2011). Germany was found guilty by the European Court of Justice of insufficient implementation of the EU Nitrates Directive, which protects waters from nutrient pollution from agriculture (European Court of Justice 2018). In response, Germany introduced a monitoring system for assessing the impact of the recently updated application ordinance, which implements the EU Nitrates Directive. This monitoring creates time series of pollution-related spatial indicators, ranging from land use to modelled nutrient budgets. Input data on land use are sourced from the Integrated Administration and Control System. The results are used by German authorities for reporting to the EU as well as for national and regional water protection policy.

We present the technical concept, infrastructure and workflows established for this data-intensive, long-term project and discuss challenges and limitations when operating in the science-policy nexus. We aim to share good practices in modularization, automation, and reproducibility, and discuss strategies for efficient maintenance of scientific software development in context of long-term, policy-relevant monitoring projects.

Our system is designed to handle heterogeneous data with different levels of data protection requirements related to the General Data Protection Regulation (GDPR). A modular structure was chosen to enhance usability and maintenance. Reproducibility is ensured through version-controlled, script-based software development. For efficiency, consistency, and streamlined workflows, reporting is automated, and an ever-growing set of user-facing functions is bundled into a package. To allow for advances in data preparation and modelling, a submission-based approach was chosen, recalculating all indicator time series each reporting year. This requires robust data management, reproducibility, and resilient workflows to accommodate evolving input data.

We still face challenges in reconciling Open Science principles, political stakeholder interests, and the GDPR. Similarly, scientific advances lead to updated results, which may conflict with the authorities' need for clear and unambiguous outcomes. Regular deadlines and stakeholder needs have resulted in an organically grown code base and sometimes cause quality checks and unit testing to be neglected. Additionally, the interaction between reproducible, script-based solutions and "traditional" workflows based on Microsoft Word is inefficient. The changing structure of the yearly gathered data hinders the automation of data processing. Due to this, and to the annual advances in the processing of the input data, maintaining the database is also challenging. We would like to share and discuss these issues with other teams facing similar problems.

Our system is tailored to handle heterogeneous and sensitive data of different sources producing reliable results and accommodating advances in data preparation and modelling in the long run. However, navigating technical limitations, good scientific practice and policymakers’ interests is challenging for us.

Literature

European Court of Justice (2018). European Commission against Federal Republic of Germany. Infringement Proceedings ‐ Directive 91/676/EEC.

European Environment Agency (2024): Europe's state of water 2024: the need for improved water resilience. Publications Office.

Sutton, M. A. (Ed.) (2011): The European nitrogen assessment. Sources, effects and policy perspectives. Cambridge University Press, Cambridge.

 

How to cite: Eysholdt, M., Zinnbauer, M., and Brandes, E.: User-turned-developer: Scientific software development for a national nutrient policy impact monitoring in Germany, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-20382, https://doi.org/10.5194/egusphere-egu26-20382, 2026.

EGU26-21175 | Orals | ESSI3.4

A Python Dynamical Core for Numerical Weather Prediction 

Daniel Hupp, Mauro Bianco, Anurag Dipankar, Till Ehrengruber, Nicoletta Farabullini, Abishek Gopal, Enrique Gonzalez Paredes, Samuel Kellerhals, Xavier Lapillonne, Magdalena Luz, Christoph Müller, Carlos Osuna, Christina Schnadt, William Sawyer, Hannes Vogt, and Yilu Chen

MeteoSwiss uses the ICON model to produce high-resolution weather forecasts at kilometre scale, with GPU support enabled through an OpenACC-based Fortran implementation. While effective, this approach limits portability, maintainability, and development flexibility. Within the EXCLAIM project, we focus on the dynamical core of the model—responsible for approximately 55% of the total runtime—and explore alternatives based on a domain-specific Python framework. In particular, we reimplemented the computational stencils using GT4Py and integrated them into the existing Fortran codebase, enabling the partial replacement of key components. This hybrid approach aims to improve developer productivity and code adaptability while preserving performance. In this contribution, we present our strategy for developing software for a weather and climate model involving multiple institutions and stakeholders. We present several optimisation techniques and compare the performance of the new implementation with the original OpenACC version. Our results show improved computational efficiency alongside a substantial improvement in the development workflow. Finally, we discuss the practical challenges of integrating Python components into operational numerical weather prediction systems.
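For orientation, the kind of stencil at the heart of a dynamical core can be written in a few lines of plain NumPy, as below; GT4Py instead expresses such stencils declaratively and generates optimized CPU and GPU backends from them, so this is a generic illustration rather than GT4Py code:

```python
# Generic illustration of a stencil computation (a five-point Laplacian in
# plain NumPy). Frameworks like GT4Py express such stencils declaratively
# and generate optimized CPU/GPU code from the high-level definition.
import numpy as np

def laplacian(field: np.ndarray) -> np.ndarray:
    """Five-point Laplacian on the interior of a 2-D field."""
    return (
        field[:-2, 1:-1] + field[2:, 1:-1]
        + field[1:-1, :-2] + field[1:-1, 2:]
        - 4.0 * field[1:-1, 1:-1]
    )

phi = np.random.rand(6, 6)
print(laplacian(phi).shape)  # (4, 4): the halo points are consumed
```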

How to cite: Hupp, D., Bianco, M., Dipankar, A., Ehrengruber, T., Farabullini, N., Gopal, A., Gonzalez Paredes, E., Kellerhals, S., Lapillonne, X., Luz, M., Müller, C., Osuna, C., Schnadt, C., Sawyer, W., Vogt, H., and Chen, Y.: A Python Dynamical Core for Numerical Weather Prediction, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-21175, https://doi.org/10.5194/egusphere-egu26-21175, 2026.

EGU26-21181 | ECS | Posters on site | ESSI3.4

Automating Data Quality Checks for Heterogenous Datasets: A scalable approach for IACS data 

Yi-Chen Pao and Boineelo Moyo

The Integrated Administration and Control System (IACS) is a key instrument of the European Union's (EU) Common Agricultural Policy to monitor agricultural subsidies and support evidence-based policy. IACS provides the most comprehensive EU-wide dataset that combines detailed geospatial data with thematic attributes related to land use, livestock, and measures, making it highly valuable for research on agri-environmental policies and agrobiodiversity (Leonhardt et al., 2024). In Germany, these data are collected independently by 14 federal states, resulting in substantial heterogeneity across datasets in terms of file format, encoding, data structure, and level of completeness. These inconsistencies present major challenges for efficient data management, scientific assessments, reproducibility, and the long-term reuse of the data.

This contribution presents an automated framework, currently under development, designed to standardise and validate raw IACS datasets across our data management pipeline, from data collection and harmonisation to data import and long-term management. Our main goal is to reduce redundancy and manual effort in the data quality check process, while enabling scalable and reproducible data quality assurance. The objective is therefore to develop an optimised, non-redundant data check system that captures structural, semantic, and geospatial metadata from heterogeneous datasets using a single-pass folder scan. To achieve this objective, we focus on the following approaches:

  • Develop an inventory-based data pipeline / architecture: A lightweight inventory object containing metadata for each file in the delivery folder
  • Automate routine and error-prone data quality scripts: Replace manual checks with modular and reusable automated components that draw on a central inventory system
  • Enable reproducible execution and reporting: Implement a Quarto-based framework (an open-source system for reproducible computational documents combining code, results, and narrative) that produces human-readable visualisations for technical and non-technical users

Our system leverages a diverse set of programming tools, including R, Quarto, Bash, Python, and SQL, from data delivery or collection to data management in the database. The approach is based on an inventory-first architecture: a lightweight yet expressive data structure generated from a single scan of the raw input folder containing different data formats. The inventory captures essential metadata of each file, such as file types, attribute schemas, geospatial extents, and identifier patterns (e.g., farm identifier, land parcel identifier). A consolidated framework of all data check scripts then enables all subsequent quality-check modules to operate efficiently without repeated file access. Executing the consolidated framework performs a range of automated data quality checks, such as file integrity verification, cross-file joinability analysis, schema consistency assessment, and geospatial coherence analysis.
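A minimal sketch of such a single-pass, inventory-first scan, with illustrative metadata fields:

```python
# Sketch of a single-pass, inventory-first folder scan as described above;
# the metadata fields and format whitelist are illustrative.
from dataclasses import dataclass
from pathlib import Path

@dataclass
class FileRecord:
    path: str
    suffix: str
    size_bytes: int

def build_inventory(root: str) -> list[FileRecord]:
    """One walk over the delivery folder; later checks reuse the records."""
    return [
        FileRecord(str(p), p.suffix.lower(), p.stat().st_size)
        for p in Path(root).rglob("*") if p.is_file()
    ]

inventory = build_inventory(".")
# Example downstream check reusing the inventory (no second folder scan):
unexpected = [r for r in inventory
              if r.suffix not in {".csv", ".gpkg", ".shp", ".dbf"}]
print(len(inventory), "files scanned;", len(unexpected), "with unexpected formats")
```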

The resulting output, an interactive Quarto dashboard, provides a comprehensive first assessment of the delivered data, where all essential metadata and errors of each file can be derived and inspected in one place. This workflow not only minimises the manual work of checking each file separately and reduces error propagation, but also ensures traceable, documented logs.

Our results show how implementing such automated data checks considerably accelerates harmonisation processes and improves the data management lifecycle.

How to cite: Pao, Y.-C. and Moyo, B.: Automating Data Quality Checks for Heterogeneous Datasets: A scalable approach for IACS data, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-21181, https://doi.org/10.5194/egusphere-egu26-21181, 2026.

EGU26-21322 | Posters on site | ESSI3.4

Sirocco: a new workflow tool for Climate and Weather including explicit data representation and ICON support 

Matthieu Leclair, Julian Geiger, Alexander Goscinski, and Rico Häuselmann

With the increase in simulation resolution, climate and weather models now potentially output petabytes of data. The largest projects can thus require complex workflows that tightly integrate pre-processing, computing, post-processing, monitoring, potential downstream applications and archiving. We introduce here Sirocco, a new climate and weather workflow tool written in Python as a collaboration between ETHZ, PSI and CSCS, with special care for the ICON model.

Sirocco is written with separation of concerns in mind: users should only care about expressing their desired workflow and bringing the scripts and sources for each task independently. That is why "Sirocco" first designates a user-friendly, YAML-based configuration format. Inspired by cylc and AiiDA, it describes the workflow graph by integrating data nodes (input and output) on an equal footing with task nodes. Workflows thus become truly composable, in the sense that no task makes any assumption about the behavior of others.
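
To make this concrete, here is a hypothetical configuration in that spirit, parsed into a bipartite data/task graph with Python. The field names are our own illustration, not the actual Sirocco schema.

```python
# Sketch of a workflow description with explicit data nodes (hypothetical schema).
import yaml  # PyYAML

CONFIG = """
data:
  - name: raw_forcing
  - name: forcing
  - name: restart
tasks:
  - name: preprocess
    plugin: shell
    inputs: [raw_forcing]
    outputs: [forcing]
  - name: icon_run
    plugin: icon
    inputs: [forcing]
    outputs: [restart]
"""

config = yaml.safe_load(CONFIG)

# Build the bipartite graph: data -> task edges for inputs, task -> data for outputs.
edges = []
for task in config["tasks"]:
    edges += [(d, task["name"]) for d in task.get("inputs", [])]
    edges += [(task["name"], d) for d in task.get("outputs", [])]
print(edges)
```

Because the data nodes are explicit, no task needs to know how its inputs were produced, which is what makes such workflows composable.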

Sirocco currently defines two types of tasks, called "plugins". The "shell" plugin is dedicated to tasks for which users provide their own main executable, including any auxiliary set of files. The only requirement is the ability to interface with Sirocco, either through executables accepting command-line arguments and environment variables and/or by parsing a YAML file providing the necessary context for task execution. The "icon" plugin is a dedicated, user-friendly interface to the ICON model. On top of the integration into Sirocco workflows, it provides easy ways of handling matters like date changes, namelist modifications, restart files or predefined setups for target machine and architecture. By design, other plugins can be written to facilitate the integration of any other application or model.

Once an internal representation is generated from the configuration file, one of two back-ends can orchestrate the workflow. The first, called "stand-alone", is implemented entirely inside Sirocco and runs autonomously on the target machine, relying only on the HPC scheduler daemon to keep the workflow running. The second interfaces with the low-level workflow library AiiDA and its satellite packages, running on a dedicated server with its own daemon and recording workflow metadata in a queryable database. Both orchestrators implement the novel concept of a deep dynamical task front that propagates through the graph, enabling the ahead-of-time submission of an arbitrary number of task generations.

In the end, Sirocco not only provides the ability to run complex workflows and a convenient interface to ICON but also, through its workflow-manager nature, facilitates shareability and reproducibility in the community.

How to cite: Leclair, M., Geiger, J., Goscinski, A., and Häuselmann, R.: Sirocco: a new workflow tool for Climate and Weather including explicit data representation and ICON support, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-21322, https://doi.org/10.5194/egusphere-egu26-21322, 2026.

EGU26-21348 | Posters on site | ESSI3.4

CAES3AR: Collaborative and Efficient Scientific Software Support Architecture 

Florian Wagner, Camilla Lüttgens, Andrea Balza Morales, Marc S. Boxberg, Marcel Nellesen, and Marius Politze

Scientific software is essential for accelerating research and enabling transparent, reproducible results, but increasing adoption also increases support demands that can overwhelm small academic development teams. Since most scientists are not trained as software engineers, early-stage research software often lacks the resources and structure needed for broader use. Streamlined support workflows are therefore crucial: they allow researchers to focus on their core activities while benefiting both users and developers.

Our project CAES3AR (Collaborative and Efficient Scientific Software Support Architecture) aims to provide researchers with a more open and efficient infrastructure for software support by developing a collaborative architecture. The framework is currently being developed and evaluated using pyGIMLi, an open-source library for modeling and inversion in geophysics (www.pygimli.org), while being designed to remain transferable to a broad range of open-source projects. Thanks to its practicality and gallery of existing examples, pyGIMLi has become widely adopted in the near-surface geophysical community. At the same time, its use across diverse user environments introduces recurring support challenges, since variations in operating systems and installed dependencies can make issue reproduction and debugging time-intensive, which often reduces the capacity for methodological and software innovation.

To address these challenges efficiently, the CAES3AR framework aims to automate key aspects of user support through a generic toolchain that integrates seamlessly with existing infrastructures such as GitHub and Jupyter. It facilitates user engagement by allowing users to create GitHub or GitLab issues that include links to temporary code execution environments (e.g., JupyterLab) equipped with collaborative editing features, potentially integrated with existing JupyterHub and cloud-based infrastructures. Additionally, automated bots powered by GitHub Actions or GitLab jobs will provide real-time feedback on whether issues can be reproduced across all platforms and with the latest software versions. If a problem persists, supporters can directly modify the user's code within Jupyter without requiring any downloads or installations. Proposed changes will be presented as formatted code alterations (“diffs”) attributed to their authors in the Git issue for future reference, ensuring clarity and continuity even after the temporary JupyterHub instance is no longer available.
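
The following Python sketch illustrates one such bot step in spirit: it extracts the first fenced code block from an issue body and re-runs it in a fresh interpreter to report whether the failure reproduces. This is our illustration, not the actual CAES3AR toolchain, and fetching the issue body (e.g., via the GitHub REST API) is assumed to happen beforehand.

```python
# Minimal sketch of an issue-reproduction check (hypothetical helper).
import re
import subprocess
import sys
import tempfile

def extract_snippet(issue_body: str):
    """Return the first fenced Python code block from an issue body, if any."""
    match = re.search(r"```(?:python)?\n(.*?)```", issue_body, re.DOTALL)
    return match.group(1) if match else None

def reproduces(snippet: str, timeout: int = 120) -> bool:
    """Run the snippet in a fresh interpreter; non-zero exit means it reproduced."""
    with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
        f.write(snippet)
        path = f.name
    result = subprocess.run([sys.executable, path], capture_output=True, timeout=timeout)
    return result.returncode != 0

body = "```python\nimport pygimli\nprint(pygimli.__version__)\n```"  # stand-in issue text
snippet = extract_snippet(body)
if snippet is not None:
    print("reproduced" if reproduces(snippet) else "not reproduced")
```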

We recently hosted a community workshop to assess developer and user needs, identify challenges in current support practices, and gather requirements for practical adoption. This presentation summarizes key findings from those discussions and introduces early CAES3AR prototypes developed for the pyGIMLi ecosystem. As CAES3AR remains in active development, we conclude by inviting community feedback on additional features and design priorities, with the broader aim of ensuring transferability and long-term utility across multiple open-source scientific software projects.

Project website: https://caesar.pages.rwth-aachen.de/

 

How to cite: Wagner, F., Lüttgens, C., Balza Morales, A., Boxberg, M. S., Nellesen, M., and Politze, M.: CAES3AR: Collaborative and Efficient Scientific Software Support Architecture, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-21348, https://doi.org/10.5194/egusphere-egu26-21348, 2026.

EGU26-23282 | Posters on site | ESSI3.4

Evolving Scientific Software in Long-Running Observatories: Lessons from the TERENO Sensor Management Migration 

Ulrich Loup, Werner Küpper, Christof Lorenz, Rainer Gasche, Ralf Kunkel, Ralf Gründling, Jannis Groh, Nils Brinckmann, Jan Bumberger, Marc Hanisch, Tobias Kuhnert, Rubankumar Moorthy, Florian Obersteiner, David Schäfer, and Thomas Schnicke

Scientific software in geosciences often grows organically: initial solutions are developed within small teams to meet immediate research needs, and over time they evolve into critical infrastructure. While this organic growth can be highly effective, it frequently leads to challenges in maintainability, documentation, and reuse when systems are expected to support larger communities or integrate with new platforms. In this contribution, we share lessons learned from evolving the software infrastructure of the TERENO environmental observatories.

For more than a decade, TERENO relied on tightly coupled systems in which observational data and sensor metadata were managed together. This data infrastructure proved robust in daily operations but gradually accumulated inconsistencies, implicit conventions, and project-specific extensions that were insufficiently documented. As TERENO is now being integrated into the Earth & Environment DataHub, these limitations have become visible and have required a systematic rethinking of how sensor and measurement metadata are managed.

As part of the infrastructure redesign within the Earth & Environment DataHub initiative, we adopted the Helmholtz Sensor Management System (SMS), an open, community-driven software platform. To support the transition, we developed and extended the Python tool ODM2SMS, which enables reproducible and configurable migration of metadata from the legacy system into SMS. This process exposed several common pitfalls in scientific software development: hidden assumptions in data structures, incomplete documentation, and software that worked well for its original developers but was hard to adapt for new use cases.

We addressed these challenges by applying a set of pragmatic good practices. These included increasing modularity and configurability in ODM2SMS, explicitly documenting previously implicit rules, and combining automated migration steps with manual review where scientific context was required. A particularly instructive example is the migration of complex lysimeter installations, involving hundreds of interconnected devices. This case highlighted the importance of clear abstractions, shared terminology, and close interaction between users and developers.

Our contribution reflects on how community engagement, open development, and incremental refactoring can improve long-lived scientific software without disrupting ongoing research. We conclude by discussing transferable lessons for researchers facing similar challenges: balancing rapid development with sustainability, making software usable beyond its original context, and turning legacy systems into maintainable, future-ready tools.
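
As a flavour of what "configurable and reproducible" means here, the sketch below maps one legacy equipment record to an SMS-style device dictionary via an explicit mapping table and flags gaps for manual review. The field names on both sides are hypothetical placeholders, not the actual ODM2 or SMS schemas.

```python
# Minimal sketch of a configurable metadata-mapping step (hypothetical schemas).
from dataclasses import dataclass

# An explicit mapping table turns previously implicit conventions into
# reviewable configuration.
FIELD_MAP = {
    "EquipmentName": "short_name",
    "EquipmentSerialNumber": "serial_number",
    "EquipmentVendorName": "manufacturer_name",
}

@dataclass
class MigrationIssue:
    record_id: str
    message: str

def map_device(legacy: dict, issues: list) -> dict:
    """Map one legacy record to an SMS-style device dict, flagging gaps."""
    device = {}
    for src, dst in FIELD_MAP.items():
        value = legacy.get(src)
        if value is None:
            # Flag for manual review instead of silently dropping the field.
            issues.append(MigrationIssue(legacy.get("EquipmentID", "?"), f"missing {src}"))
        else:
            device[dst] = value
    return device

issues = []
print(map_device({"EquipmentID": "42", "EquipmentName": "Lysimeter L1"}, issues))
print(issues)
```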

How to cite: Loup, U., Küpper, W., Lorenz, C., Gasche, R., Kunkel, R., Gründling, R., Groh, J., Brinckmann, N., Bumberger, J., Hanisch, M., Kuhnert, T., Moorthy, R., Obersteiner, F., Schäfer, D., and Schnicke, T.: Evolving Scientific Software in Long-Running Observatories: Lessons from the TERENO Sensor Management Migration, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-23282, https://doi.org/10.5194/egusphere-egu26-23282, 2026.

EGU26-1897 | ECS | Posters on site | EOS4.4

The Unreliable Narrator: LSTM Internal States Fluctuate with Software Environments Despite Robust Predictions 

Ryosuke Nagumo, Ross Woods, and Miguel Rico-Ramirez

Since the robust performance of Long Short-Term Memory (LSTM) networks was established, their physics-awareness and interpretability have become central topics in hydrology. Seminal works (e.g., Lees et al. (2022)) have argued that LSTM internal states spontaneously capture hydrological concepts, and suggested that cell states can represent soil moisture dynamics despite not being explicitly trained on such data. Conversely, more recent studies (e.g., Fuente et al. (2024)) demonstrated that mathematical equifinality causes non-unique LSTM representations with different initialisations.

In this work, we report an arguably more systematic "bug": the software environment itself causes instability in internal states. We initially aimed to investigate how internal states behave differently when trained with or without historical observation data. We encountered this issue while reassembling a computational stack and attempting to replicate the initial results, as the original Docker environment was not preserved. While random seeds have been shown to lead to different internal state trajectories, we found that the computational backend (e.g., changing CUDA versions, PyTorch releases, or dependent libraries) also produces them. Our findings are as follows:

  • In gauged catchments: Discharge predictions remained stable (in one catchment, NSE was 0.88 ± 0.01) across computational environments, yet the internal temporal variations (e.g., silhouette, mean, and std of cell states) fluctuated noticeably.
  • In pseudo-ungauged scenarios: The prediction performance itself became more reliant on the computational environment (in the same catchment, NSE dropped to 0.31 ± 0.15), yet the internal temporal variations of the cell states fluctuated only as much as they did during the gauged scenario.

These findings suggest that instability in the computational environment not only risks altering interpretability in training (by altering internal states) but also casts doubt on reliability in extrapolation (by altering outputs).

It is worth mentioning that we confirmed this is not a replicability issue; completely identical cell states and predictions are produced when the computational environment, seeds, and training data are held constant. We argue that such stability must be established as a standard benchmark before assigning physical meaning to deep learning internals.
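
A minimal PyTorch sketch of this kind of determinism check (our illustration, not the study's actual code): with fixed seeds, repeated runs within one environment must produce bit-identical final cell states, so any remaining difference can be attributed to the software stack itself.

```python
# Determinism check: identical seeds within one environment => identical cell states.
import torch

def final_cell_state(seed: int = 0) -> torch.Tensor:
    torch.manual_seed(seed)                      # fixes weight initialisation
    lstm = torch.nn.LSTM(input_size=5, hidden_size=64, batch_first=True)
    gen = torch.Generator().manual_seed(seed)    # fixes the synthetic forcing
    x = torch.randn(1, 365, 5, generator=gen)    # one year of daily inputs (toy)
    _, (_, c) = lstm(x)
    return c

c1, c2 = final_cell_state(), final_cell_state()
print(torch.equal(c1, c2))   # True here; compare the saved tensors across environments
```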

How to cite: Nagumo, R., Woods, R., and Rico-Ramirez, M.: The Unreliable Narrator: LSTM Internal States Fluctuate with Software Environments Despite Robust Predictions, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-1897, https://doi.org/10.5194/egusphere-egu26-1897, 2026.

EGU26-2771 | Posters on site | EOS4.4

New EGU Manuscript Types: Limitations, Errors, Surprises, and Shortcomings as Opportunities for New Science (LESSONS) 

John Hillier, Ulrike Proske, Stefan Gaillard, Theresa Blume, and Eduardo Queiroz Alves

Moments or periods of struggle not only propel scientists forward, but sharing these experiences can also provide valuable lessons for others. Indeed, the current bias towards only publishing ‘positive’ results arguably impedes scientific progress as mistakes that are not learnt from are simply repeated. Here we present a new article type in EGU journals covering LESSONS learnt to help overcome this publishing bias. LESSONS articles describe the Limitations, Errors, Surprises, Shortcomings, and Opportunities for New Science emerging from the scientific process, including non-confirmatory and null results. Unforeseen complications in investigations, plausible methods that failed, and technical issues are also in scope. LESSONS thus fit the content of the BUGS session and can provide an outlet for articles based on session contributions. Importantly, a LESSONS Report will offer a substantial, valuable insight. LESSONS Reports are typically short (1,000-2,000 words) to help lower the barrier to journal publication, whilst LESSONS Posts (not peer-reviewed, but with a DOI on EGUsphere) can be as short as 500 words to allow early-stage reporting. LESSONS aim to destigmatise limitations, errors, surprises and shortcomings and to add these to the published literature as opportunities for new science – we invite you to share your LESSONS learnt.

 

Finally, a big thank you from this paper’s ‘core’ writing team to the wider group who have helped shape the LESSONS idea since EGU GA in 2025, including PubCom and in particular its Chair Barbara Ervens.

How to cite: Hillier, J., Proske, U., Gaillard, S., Blume, T., and Queiroz Alves, E.: New EGU Manuscript Types: Limitations, Errors, Surprises, and Shortcomings as Opportunities for New Science (LESSONS), EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-2771, https://doi.org/10.5194/egusphere-egu26-2771, 2026.

EGU26-3077 | ECS | Posters on site | EOS4.4

False Starts and Silver Linings: A Photocatalytic Journey with Layered Double Hydroxides 

Anna Jędras and Jakub Matusik

Photocatalysis is frequently presented in the literature as a straightforward route toward efficient degradation of pollutants, provided that the “right” material is selected. Layered double hydroxides (LDH) are often highlighted as promising photocatalysts due to their tunable composition and reported activity in dye degradation. Motivated by these claims, this study evaluated LDH as mineral analogs for photocatalytic water treatment, ultimately uncovering a series of unexpected limitations, methodological pitfalls, and productive surprises.

In the first stage, Zn/Cr, Co/Cr, Cu/Cr, and Ni/Cr LDHs were synthesized and tested for photocatalytic degradation of methylene blue (0.02 mM) and Acid Blue Dye 129 (0.3 mM). Contrary to expectations [1], photocatalytic performance was consistently low. After one hour of irradiation, concentration losses attributable to photocatalysis did not exceed 15%, while most dye removal resulted from adsorption. Despite extensive efforts to optimize synthesis protocols, catalyst composition, and experimental conditions, this discrepancy with previously published studies could not be resolved.

To overcome limitations related to particle dispersion, surface accessibility, and charge-carrier separation, a second strategy was pursued by incorporating clay minerals as supports [2]. Zn/Cr LDH, identified as the most active composition in preliminary tests, was coprecipitated with kaolinite, halloysite, and montmorillonite. Experiments with methylene blue (0.1 mM) and Acid Blue 129 (0.3 mM) demonstrated enhanced adsorption capacities. However, photocatalytic degradation efficiencies remained poor, typically below 10% after one hour, indicating that apparent performance gains were largely adsorption-driven rather than photochemical.

This failure proved to be a turning point. Instead of abandoning LDH entirely, they were combined with graphitic carbon nitride (GCN) to form a heterostructure [3]. This approach resulted in a dramatic improvement: after optimization of the synthesis protocol, 99.5% of 1 ppm estrone was degraded within one hour [4]. Further modifications were explored by introducing Cu, Fe, and Ag into the LDH/GCN system. While Cu and Fe suppressed photocatalytic activity, silver, at an optimized loading, reduced estrone concentrations below the detection limit within 40 minutes [5].

This contribution presents a full experimental arc - from promising hypotheses that failed, through misleading adsorption-driven “successes,” to an ultimately effective but non-intuitive solution - highlighting the value of negative results and surprises as drivers of scientific progress.

This research was funded by the AGH University of Krakow, grant number 16.16.140.315.

Literature:

[1] N. Baliarsingh, K. M. Parida and G. C. Pradhan, Ind. Eng. Chem. Res., 2014, 53, 3834–3841.

[2] A. Í. S. Morais, W. V. Oliveira, V. V. De Oliveira, L. M. C. Honorio, F. P. Araujo, R. D. S. Bezerra, P. B. A. Fechine, B. C. Viana, M. B. Furtini, E. C. Silva-Filho and J. A. Osajima, Journal of Environmental Chemical Engineering, 2019, 7, 103431.

[3] B. Song, Z. Zeng, G. Zeng, J. Gong, R. Xiao, S. Ye, M. Chen, C. Lai, P. Xu and X. Tang, Advances in Colloid and Interface Science, 2019, 272, 101999.

[4] A. Jędras, J. Matusik, E. Dhanaraman, Y.-P. Fu and G. Cempura, Langmuir, 2024, 40, 18163–18175.

[5] A. Jędras, J. Matusik, J. Kuncewicz and K. Sobańska, Catal. Sci. Technol., 2025, 15, 6792–6804.

How to cite: Jędras, A. and Matusik, J.: False Starts and Silver Linings: A Photocatalytic Journey with Layered Double Hydroxides, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-3077, https://doi.org/10.5194/egusphere-egu26-3077, 2026.

EGU26-4074 | Orals | EOS4.4

Instructive surprises in the hydrological functioning of landscapes 

James Kirchner, Paolo Benettin, and Ilja van Meerveld

BUGS can arise in individual research projects, but also at the level of communities of researchers, leading to shifts in the scientific consensus.  These community-level BUGS typically arise from observations that are surprising to (or previously overlooked by) substantial fractions of the research community.  In this presentation, we summarize several community-level BUGS in our field: specifically, key surprises that have transformed the hydrological community's understanding of hillslope and catchment processes in recent decades.  

Here are some examples.  (1) Students used to learn (and some still do today) that storm runoff is dominated by overland flow.  But stable isotope tracers have convincingly shown instead that even during storm peaks, streamflow is composed mostly of water that has been stored in the landscape for weeks, months, or years.  (2) Maps, and most hydrological theories, have typically depicted streams as fixed features of the landscape.  But field mapping studies have shown that stream networks are surprisingly dynamic, with up to 80% of stream channels going dry sometime during the year.  (3) Textbooks have traditionally represented catchment storage as a well-mixed box.  But tracer time series show fractal scaling that cannot be generated by well-mixed boxes, forcing a re-think of our conceptualization of subsurface storage and mixing.  (4) Waters stored in aquifers, and the waters that drain from them, have traditionally been assumed to share the same age.  But tracers show that waters draining from aquifers are often much younger than the groundwaters that are left behind, and this was subsequently shown to be an inevitable result of aquifer heterogeneity. 

Several examples like these, and their implications, will be briefly discussed, with an eye to the question: how can we maximize the chances for future instructive surprises?

How to cite: Kirchner, J., Benettin, P., and van Meerveld, I.: Instructive surprises in the hydrological functioning of landscapes, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-4074, https://doi.org/10.5194/egusphere-egu26-4074, 2026.

Coming from geosciences, we hopefully know what we want to do. Coming from numerics, however, we often know quite well what we are able to do and look for a way to sell it to the community. A few years ago, deep-learning techniques brought new life into the glaciology community. These approaches allowed for simulations of glacier dynamics at unprecedented computational performance and motivated several researchers to tackle the numerous open questions about past and present glacier dynamics, particularly in alpine regions. From another point of view, however, it was also tempting to demonstrate that the human brain is still more powerful than artificial intelligence by developing a new classical numerical scheme that can compete with deep-learning techniques in efficiency.

The starting point was, of course, the simplest approximation to the full 3-D Stokes equations, the so-called shallow ice approximation (SIA). Progress was fast, and the numerical performance was even better than expected. The new numerical scheme enabled simulations with spatial resolutions of 25 m on a desktop PC, while previous schemes did not reach resolutions below a few hundred meters.
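
For context, the standard isothermal, no-sliding form of the SIA (a textbook statement, not the author's specific scheme) evolves the ice thickness H by a local, nonlinear diffusion of the surface elevation s = b + H:

```latex
\frac{\partial H}{\partial t} = \nabla \cdot \left( D \, \nabla s \right) + a,
\qquad
D = \frac{2A}{n+2}\,(\rho g)^{n}\, H^{n+2}\, \lvert \nabla s \rvert^{\,n-1},
```

with accumulation rate a, rate factor A, and Glen exponent n of about 3. This local, diffusion-type character is what makes very high resolutions affordable, and its breakdown on rugged terrain is precisely the physical limitation discussed below.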

However, the enthusiasm pushed the known limitations of the SIA a bit out of sight. Physically, the approximation is quite bad on rugged terrain, particularly in narrow valleys. So the previous computational limitations have been replaced by physical limitations since high resolutions are particularly useful for rugged topographies. In other words, a shabby house has a really good roof now.

What are the options in such a situation?

  • Accept that there is no free lunch and avoid contact with the glaciology community in the future.
  • Continue the endless discussion about the reviewers' opinion that a spatial resolution of 1 km is better than 25 m.
  • Find a real-world data set that matches the results of the model and helps to talk the problems away.
  • Keep the roof and build a new house beneath. Practically, this would mean developing a new approximation to the full 3-D Stokes equations that is compatible with the numerical scheme and reaches an accuracy similar to that of the existing approximations.
  • Take the roof and put it on one of the existing solid houses. Practically, this would be an extension of the numerical scheme towards more complicated systems of differential equations. Unfortunately, efficient numerical schemes are typically very specific. So the roof will not fit easily and it might leak.

The story is open-ended, but there will be at least a preliminary answer in the presentation.

 

How to cite: Hergarten, S.: How useful is a new roof on a shabby house? An example from glacier modeling, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-4196, https://doi.org/10.5194/egusphere-egu26-4196, 2026.

EGU26-4587 | Posters on site | EOS4.4

The importance of describing simple methods in climate sensitivity literature 

Anna Zehrung, Andrew King, Zebedee Nicholls, Mark Zelinka, and Malte Meinshausen

“Show your working!” is the universal phrase drilled into science and maths students to elicit a clear demonstration of the steps and thought processes used to reach a solution (and to be awarded full marks on the exam). 

Beyond the classroom, “show your working” becomes the methods section of every scientific paper and is critical for the transparency and replicability of the study. However, what happens if parts of the method are considered assumed knowledge, or are cut in the interest of a word count? 

An inability to fully replicate the results of a study became the unexpected glitch at the start of my PhD. Eager to familiarise myself with global climate model datasets, I set out to replicate the results of a widely cited paper which calculates the equilibrium climate sensitivity (ECS) across 27 climate models. The ECS is the theoretical global mean temperature response to a doubling of atmospheric CO2 relative to preindustrial levels. A commonly used method to calculate the ECS is to apply an ordinary least squares regression to global annual mean temperature and radiative flux anomalies. 

Despite the simplicity of a linear regression between two variables, we obtained ECS estimates for some climate models that differed from those reported in the original study, even though we followed the described methodology. However, the methodology provided only limited detail on how the raw climate model output – available at regional and monthly scales – was processed to obtain global annual mean anomalies. Differences in these intermediate processing steps can, in turn, lead to differences in ECS estimates.
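
The regression step itself is indeed simple. The sketch below shows a Gregory-style estimate for an abrupt-4xCO2 run with synthetic numbers in place of model output; everything hinges on how the global annual mean anomalies dT and dN are produced upstream, which is precisely the under-documented part.

```python
# Gregory-style ECS estimate from (synthetic) global annual mean anomalies.
import numpy as np

rng = np.random.default_rng(0)
dT = np.linspace(0.5, 6.0, 150)                  # temperature anomaly [K]
dN = 8.0 - 1.2 * dT + rng.normal(0, 0.4, 150)    # TOA imbalance [W m-2], synthetic

slope, intercept = np.polyfit(dT, dN, 1)         # dN = intercept + slope * dT
ecs = -intercept / slope / 2.0                   # x-intercept, halved (4x -> 2x CO2)
print(f"ECS ~ {ecs:.2f} K")                      # ~3.3 K for these synthetic numbers
```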

Limited reporting of data-processing steps is common in the ECS literature. Whether these steps are considered assumed knowledge or deemed too simple to warrant explicit description, we demonstrate that, for some models, they can materially affect the resulting ECS estimate. While the primary aim of our study is to recommend a standardised data-processing pathway for ECS calculations, a secondary aim is to highlight the lack of transparency in key methodological details across the literature. A central takeaway is the importance of clearly documenting all processing steps – effectively, to “show your working” – and the critical role of a detailed methods section.

How to cite: Zehrung, A., King, A., Nicholls, Z., Zelinka, M., and Meinshausen, M.: The importance of describing simple methods in climate sensitivity literature, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-4587, https://doi.org/10.5194/egusphere-egu26-4587, 2026.

Observation of atmospheric constituents and processes is not easy. As atmospheric chemists, we use sensitive equipment, for example mass spectrometers, that we often set up in a (remote) location or on a moving platform for a few-week campaign to make in-situ observations. All this with the goal of explaining more and more atmospheric processes and of verifying and improving atmospheric models. However, glitches can happen anywhere in an experiment, be it in the experimental design, setup, or instrumental performance. Thus, complete data coverage during such a campaign is not always a given, resulting in gaps in (published) datasets. And the issue with air is that you can never go back and measure the exact same air again. Here, I would like to share some stories behind such gaps, and what we learned from them. This presentation aims to encourage early career researchers who might be struggling with feelings of failure when bugs, blunders and glitches happen in their experiments - you are not alone! I will share what we learned from these setbacks and how each of them improved our experimental approaches.

How to cite: Pfannerstill, E. Y.: Why are there gaps in your measurements? Sharing the stories behind the missing datapoints, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-5494, https://doi.org/10.5194/egusphere-egu26-5494, 2026.

Over a 24-year research period, three successive experimental investigations led to three publications, each of which falsified the author’s preceding hypothesis and proposed a revised conceptual framework. Despite initial confidence in having identified definitive solutions, subsequent experimental evidence consistently demonstrated the limitations and inaccuracies of earlier interpretations. This iterative process ultimately revealed that samples, in particular geological reference materials, sharing identical petrographic or mineralogical descriptions are not necessarily chemically equivalent and can exhibit markedly different behaviors during chemical digestion procedures. These findings underscore the critical importance of continuous hypothesis testing, self-falsification, and experimental verification in scientific research, particularly when working with reference materials assumed to be identical. I will present data on the analysis of platinum group elements (PGE) and osmium isotopes in geological reference materials (chromitites, ultramafic rocks and basalts), which demonstrate the need for challenging matrices in method validation.

How to cite: Meisel, T. C.: Self-falsification as a driver of scientific progress: Insights from long-term experimental research, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-5771, https://doi.org/10.5194/egusphere-egu26-5771, 2026.

EGU26-6794 | ECS | Orals | EOS4.4

Back to square one (again and again): Finding a bug in a complex global atmospheric model   

Nadja Omanovic, Sylvaine Ferrachat, and Ulrike Lohmann

In atmospheric sciences, a central tool for testing hypotheses is the numerical model, which aims to represent (part of) our environment. One such model is the weather and climate model ICON [1], which solves the Navier-Stokes equations to capture the dynamics and parameterizes subgrid-scale processes, such as radiation, cloud microphysics, and aerosol processes. For the latter, the Hamburg Aerosol Module (HAM [2]) is available; coupled to ICON [3], it predicts the evolution of aerosol populations using two moments (mass mixing ratio and number concentration). The high complexity of aerosols is reflected in the number of aerosol species (five in total), the number of modes (four in total), and their mixing state and solubility. The module calculates aerosol composition and number concentration, their optical properties, their sources and sinks, and their interactions with clouds via microphysical processes. Aerosol emissions are sector-specific and either based on global emission inventories or computed dynamically.

Within our work, we stumbled upon an interesting pattern in our simulations when changing or turning off individual emission sectors. If we removed, for example, black carbon from aircraft emissions, the strongest changes emerged over the African continent, which is not the region where we expected the strongest response. Further investigation revealed that this pattern emerges independently of the emission sector and species, confirming our suspicion that we were facing a bug within HAM. Here, we present how we approached the challenge of identifying and tackling a bug within a complex module comprising several thousand lines of code.

 

[1] G. Zängl, D. Reinert, P. Ripodas, and M. Baldauf, “The ICON (ICOsahedral Non-hydrostatic) modelling framework of DWD and MPI-M: Description of the non-hydrostatic dynamical core,” Quarterly Journal of the Royal Meteorological Society, vol. 141, no. 687, pp. 563–579, 2015, ISSN: 1477-870X. DOI: 10.1002/qj.2378

[2] P. Stier, J. Feichter, S. Kinne, S. Kloster, E. Vignati, J. Wilson, L. Ganzeveld, I. Tegen, M. Werner, Y. Balkanski, M. Schulz, O. Boucher, A. Minikin, and A. Petzold, “The aerosol-climate model ECHAM5-HAM,” Atmospheric Chemistry and Physics, 2005. DOI: 10.5194/acp-5-1125-2005

[3] M. Salzmann, S. Ferrachat, C. Tully, S. Münch, D. Watson-Parris, D. Neubauer, C. Siegenthaler-Le Drian, S. Rast, B. Heinold, T. Crueger, R. Brokopf, J. Mülmenstädt, J. Quaas, H. Wan, K. Zhang, U. Lohmann, P. Stier, and I. Tegen, “The Global Atmosphere-aerosol Model ICON-A-HAM2.3–Initial Model Evaluation and Effects of Radiation Balance Tuning on Aerosol Optical Thickness,” Journal of Advances in Modeling Earth Systems, vol. 14, no. 4, e2021MS002699, 2022, ISSN: 1942-2466. DOI: 10.1029/2021MS002699

How to cite: Omanovic, N., Ferrachat, S., and Lohmann, U.: Back to square one (again and again): Finding a bug in a complex global atmospheric model  , EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-6794, https://doi.org/10.5194/egusphere-egu26-6794, 2026.

In situ cloud measurements are essential for understanding atmospheric processes and establishing a reliable ground truth. Obtaining these data is rarely straightforward. Challenges range from accessing clouds in the first place to ensuring that the instrument or environment does not bias the sample. This contribution explores several blunders and unexpected glitches encountered over fifteen years of field campaigns.

I will share stories of mountain top observations where blowing snow was measured instead of cloud ice crystals and the ambitious but failed attempt to use motorized paragliders for sampling. I also reflect on winter campaigns where the primary obstacles were flooding and mud rather than cold and snow. While these experiences were often frustrating, they frequently yielded useful data or led to new insights. One such example is the realization that drone icing is not just a crash risk but can also serve as a method for measuring liquid water content. By highlighting these setbacks and the successful data that emerged despite them, I aim to foster a discussion on the value of trial and error and persistence in atmospheric physics.

How to cite: Henneberger, J.: How Not to Measure a Cloud: Lessons from Fifteen Years of Fieldwork Failures, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-8228, https://doi.org/10.5194/egusphere-egu26-8228, 2026.

EGU26-8359 | ECS | Posters on site | EOS4.4

Do trees save lives under climate change? It’s complicated  

Nils Hohmuth, Nora L. S. Fahrenbach (presenting), Yibiao Zou (presenting), Josephine Reek, Felix Specker, Tom Crowther, and Constantin M. Zohner

Forests are powerful climate regulators: Their CO2 uptake provides a global biogeochemical cooling effect, and in the tropics, this cooling is further strengthened by evapotranspiration. Given that temperature-related mortality is a relevant global health burden, which is expected to increase under climate change, we set out to test what we thought was a promising hypothesis: Can forests reduce human temperature-related mortality from climate change? 

To test this, we used simulated temperature responses to reforestation from six different Earth System Models (ESMs) under a future high-emission scenario and paired them with age-specific population data and three methodologically different temperature-mortality frameworks (Cromar et al. 2022, Lee et al. 2019, and Carleton et al. 2022). We expected to find a plausible range of temperature-related mortality outcomes attributable to future global forest conservation efforts.
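
As a sketch of the mortality step, the snippet below applies a simple log-linear exposure-response assumption; beta, the minimum-mortality temperature, and all numbers are illustrative placeholders, not values from any of the three frameworks cited above.

```python
# Attributable-mortality sketch under a log-linear exposure-response assumption.
import numpy as np

def attributable_deaths(T: np.ndarray, deaths: float,
                        beta: float = 0.03, T_mmt: float = 18.0) -> float:
    """Deaths attributable to temperatures away from the mortality optimum T_mmt."""
    rr = np.exp(beta * np.abs(T - T_mmt))   # relative risk at each temperature
    af = (rr - 1.0) / rr                    # attributable fraction
    return float(deaths * af.mean())

T_baseline = np.array([20.0, 25.0, 30.0, 35.0])   # daily temperatures (toy)
T_reforested = T_baseline - 0.5                   # illustrative ESM cooling signal
avoided = attributable_deaths(T_baseline, 1000.0) - attributable_deaths(T_reforested, 1000.0)
print(f"{avoided:.1f} deaths avoided")            # flips sign if the ESM warms instead
```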

Instead, our idea ran head-first into a messy reality. Firstly, rather than showing a clear consensus, the ESMs produced a wide range of temperature responses to reforestation, varying in both magnitude and sign. This likely reflects the albedo effect, the varying climatological tree cover and land-use processes implemented by the models, and internal variability, which we could not reduce because only one ensemble member per model was available. Consequently, the models disagreed in many regions on whether global forest conservation and reforestation would increase or decrease temperature by the end of the century.

The uncertainties deepened when we incorporated the mortality data. Mortality estimates varied by up to a factor of 10 depending on the ESM and mortality framework used. Therefore, in the end, the models could not even agree on whether forests increased or decreased temperature-related mortality. We found ourselves with a pipeline that amplified uncertainties of both the ESM and mortality datasets.

For now, the question remains wide open: Do trees save us from temperature-related deaths in a warming world, and if so, by how much?

 

* The first two authors contributed equally to this work.

How to cite: Hohmuth, N., Fahrenbach (presenting), N. L. S., Zou (presenting), Y., Reek, J., Specker, F., Crowther, T., and Zohner, C. M.: Do trees save lives under climate change? It’s complicated , EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-8359, https://doi.org/10.5194/egusphere-egu26-8359, 2026.

EGU26-10401 | ECS | Orals | EOS4.4

The empty mine: Why better tools do not help you find new diamonds 

Ralf Loritz, Alexander Dolich, and Benedikt Heudorfer

Hydrological modelling has long been shaped by a steady drive toward ever more sophisticated models. In the era of machine learning, this race has turned into a relentless pursuit of complexity: deeper networks and ever more elaborate architectures that often feel outdated by the time the ink on the paper is dry. Motivated by a genuine belief in methodological progress, I, like many others, spent considerable effort exploring this direction, driven by the assumption that finding the “right” architecture or model would inevitably lead to better performance. This talk is a reflection on that journey; you could say my own Leidensweg (path of suffering).

Over several years, together with excellent collaborators, I explored a wide range of state-of-the-art deep-learning approaches for rainfall–runoff modelling and other hydrological modelling challenges. Yet, regardless of the architecture or training strategy, I repeatedly encountered the same performance ceiling. In parallel, the literature appeared to tell a different story, with “new” models regularly claiming improvements over established baselines. A closer inspection, however, revealed that rigorous and standardized benchmarking is far from common practice in hydrology, making it difficult to disentangle genuine progress from artefacts of experimental design.

What initially felt like a failure to improve my models turned out to be a confrontation with reality. The limiting factor was not the architecture, but the problem itself. We have reached a point where predictive skill is increasingly bounded by the information content of our benchmark datasets and, maybe more importantly, by the way we frame our modelling challenges, rather than by model design. Like many others, I have come to believe that if we want to move beyond the current performance plateau, the next breakthroughs are unlikely to come from ever more complex models alone. Instead, as a community, we need well-designed model challenges, better benchmarks, and datasets that meaningfully expand the information available to our models, making model comparisons more informative.

How to cite: Loritz, R., Dolich, A., and Heudorfer, B.: The empty mine: Why better tools do not help you find new diamonds, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-10401, https://doi.org/10.5194/egusphere-egu26-10401, 2026.

EGU26-13630 | ECS | Orals | EOS4.4

How NOT to identify streamflow events? 

Larisa Tarasova and Paul Astagneau

Examining catchment response to precipitation at the event scale is useful for understanding how various hydrological systems store and release water. Many such event-scale characteristics, for example the event runoff coefficient and the event time scale, are also important engineering metrics used for design. However, deriving these characteristics requires identifying discrete precipitation-streamflow events from continuous hydrometeorological time series.

Event identification is not at all a trivial task. It becomes even more challenging when working with very large datasets that encompass a wide range of spatial and temporal dynamics. Approaches range from visual expert judgement to baseflow-separation-based methods and objective methods based on the coupled dynamics of precipitation and streamflow. Here, we would like to present our experience in the quest to devise the “ideal” method for large datasets – and trust us, we tried, a lot. We demonstrate that expert-based methods can be seriously flawed simply by changing a few metaparameters, such as the length of the displayed periods, that baseflow-separation-based methods deliver completely opposite results when different underlying separation methods are selected, and that objective methods suddenly fail when dynamics with different temporal scales are simultaneously present.
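
To illustrate how much hinges on such choices, here is one widely used building block, the one-parameter Lyne-Hollick recursive digital filter, in a minimal single-pass Python form. The filter parameter alpha (and the number of filter passes, omitted here) are exactly the kind of metaparameters that can flip the resulting event set.

```python
# Single forward pass of the Lyne-Hollick baseflow filter (minimal sketch).
import numpy as np

def lyne_hollick_baseflow(q: np.ndarray, alpha: float = 0.925) -> np.ndarray:
    qf = np.zeros_like(q)                        # quickflow component
    for t in range(1, len(q)):
        qf[t] = alpha * qf[t - 1] + 0.5 * (1 + alpha) * (q[t] - q[t - 1])
        qf[t] = min(max(qf[t], 0.0), q[t])       # keep 0 <= baseflow <= q
    return q - qf

q = np.array([1.0, 1.2, 5.0, 8.0, 4.0, 2.5, 1.8, 1.4, 1.2, 1.1])  # toy hydrograph
print(lyne_hollick_baseflow(q, alpha=0.90))
print(lyne_hollick_baseflow(q, alpha=0.98))      # different alpha, different "events"
```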

Ultimately, we realized that finding a one-size-fits-all method was not possible and that compromises had to be made to select sufficiently representative events across large datasets. Therefore, we advocate for pragmatic case-specific evaluation criteria and for transparency in event identification to make study results reproducible and fit for purpose, if not perfect.

How to cite: Tarasova, L. and Astagneau, P.: How NOT to identify streamflow events?, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-13630, https://doi.org/10.5194/egusphere-egu26-13630, 2026.

EGU26-14148 | Orals | EOS4.4 | Highlight

Buggy benefits of more fundamental climate models 

Bjorn Stevens, Marco Giorgetta, and Hans Segura

A defining attribute of global storm-resolving models is that modelling is replaced by simulation. In addition to overloading the word “model”, this avails the developer of a much larger variety of tests and brings about a richer interplay with their intuition. This has proven helpful in identifying and correcting many mistakes in global storm-resolving models that traditional climate models find difficult to identify and usually compensate for by “tuning.” It also means that storm-resolving models are built and tested in a fundamentally different way than traditional climate models are. In this talk I will review the development of ICON as a global storm-resolving model to illustrate how this feature, of trying to simulate rather than model the climate system, has helped identify a large number of long-standing bugs in code bases inherited from traditional models; how this can support open development; and how sometimes these advantages also prove to be buggy.

How to cite: Stevens, B., Giorgetta, M., and Segura, H.: Buggy benefits of more fundamental climate models, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-14148, https://doi.org/10.5194/egusphere-egu26-14148, 2026.

EGU26-14374 | Orals | EOS4.4

The dangerous temptation of optimality in hydrological and water resources modelling 

Thorsten Wagener and Francesca Pianosi

Hydrological and water systems modelling has long been driven by the search for better models. We do so by searching for models or at least parameter combinations that provide the best fit to given observations. We ourselves have contributed to this effort by developing new methods and by publishing diverse case studies. However, we repeatedly find that searching for and finding an optimal model is highly fraught in the presence of unclear signal-to-noise ratios in our observations, of incomplete models and of highly imbalanced databases. We present examples of our own work through which we have realized that achieving optimality was possible but futile unless we give equal consideration to issues of consistency, robustness and problem framing. We argue here that the strong focus on optimality continues to be a hindrance for advancing hydrologic science and for transferring research achievements into practice – probably more so than in other areas of the geosciences.

How to cite: Wagener, T. and Pianosi, F.: The dangerous temptation of optimality in hydrological and water resources modelling, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-14374, https://doi.org/10.5194/egusphere-egu26-14374, 2026.

Among soil physical analyses, determination of the soil particle-size distribution (PSD) is arguably the most fundamental. The standard methodology combines sieve analysis for sand fractions with sedimentation-based techniques for silt and clay. Established sedimentation methods include the pipette and hydrometer techniques. More recently, the Integral Suspension Pressure (ISP) method has become available, which derives PSD by inverse modeling of the temporal evolution of suspension pressure measured at a fixed depth in a sedimentation cylinder. Since ISP is based on the same physical principles as the pipette and hydrometer methods, their results should, in principle, agree.
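
For context, the pipette, hydrometer, and ISP methods all rest on Stokes' law for the settling velocity v of a sphere of diameter d:

```latex
v = \frac{(\rho_s - \rho_f)\, g\, d^{2}}{18\, \eta},
```

with particle and fluid densities rho_s and rho_f, gravitational acceleration g, and dynamic viscosity eta. ISP inverts the resulting temporal evolution of suspension pressure at the sensor depth to obtain the PSD, so any mechanical disturbance perturbs exactly the signal being inverted.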

The ISP methodology has been implemented in the commercial instrument PARIO (METER Group, Munich). While elegant, the method relies on pressure change measurements with a resolution of 0.1 Pa (equivalent to 0.01 mm of water column). Consequently, the PARIO manual strongly advises avoiding any mechanical disturbance such as thumping, bumping, clapping, vibration, or other shock events. This warning is essentially precautionary, because to date no systematic experimental investigation of such disturbances has been reported.

To explore this issue, we prepared a single 30 g soil sample following standard PSD procedures and subjected it to 26 repeated PARIO measurement runs over a period of five months, each run lasting 12 h. Between runs, the suspension was remixed but otherwise not altered. The first ten runs (over ten days) were conducted without intentional disturbance to establish baseline repeatability. This was followed by eight runs with deliberately imposed and timed disturbances that generated single or repeated vibrations (“rocking and shocking”). After approximately two and five months, we conducted additional sets of five and three undisturbed runs, respectively.

We report how these mechanical disturbances, along with temperature variations during measurement and the time elapsed since sample pre-treatment, affected the derived PSD. The results provide a first quantitative assessment of how fragile—or robust—the ISP method and PARIO system really are when reality refuses to sit perfectly still.

 

How to cite: Nemes, A. and Durner, W.: Rocking and Shocking the PARIO™: How Sensitive Is ISP-Based Particle-Size Analysis to Mechanical Disturbance?, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-14763, https://doi.org/10.5194/egusphere-egu26-14763, 2026.

EGU26-14852 | Posters on site | EOS4.4

Some Norwegian soils behave differently: is it an inheritance from marine sedimentation? 

Attila Nemes, Pietro Bazzocchi, Sinja Weiland, and Martine van der Ploeg

Predicting soil hydraulic behavior is necessary for catchment modeling and agricultural planning, particularly for a country like Norway where only 3% of the land is suitable for farming. Soil texture is an important and easily accessible parameter for the prediction of soil hydraulic behavior. However, some Norwegian farmland soils, which formed as glacio-marine sediments and are characterized by a medium texture, have shown the hydraulic behavior of heavy-textured soils. Guided by the theory behind well-established sedimentation-enhancing technology used in wastewater treatment, we hypothesized that sedimentation under marine conditions may result in specific particle sorting and, as a result, specific pore-system characteristics. To test this, we designed four custom-built devices to produce artificially re-sedimented columns of soil material to help characterize the influence of sedimentation conditions. We successfully produced column samples of the same homogeneous mixture of fine-sand, silt, and clay particles, obtained by physically crushing and sieving (< 200 µm) subsoil material collected at the Skuterud catchment in South-East Norway, differing only in sedimentation conditions (deionized water vs a 35 g per liter NaCl solution). The inability of standard laboratory methods to measure the saturated hydraulic conductivity of such fine material then led us to “MacGyver” (design and custom-build) two alternative methodologies to measure that property: i) adapting a pressure plate extractor for a constant-head measurement, and ii) building a 10 m tall pipe system in a common open area of the office in order to increase the hydraulic head on the samples. There was a learning curve with both methods, but we found that the salt-water re-sedimented columns were about five times more permeable than the freshwater ones, the complete opposite of our expectations. However, an unexpected blunder in the conservation of our samples suggests that our hypothesis should be further explored rather than dismissed. These contributions hint at the mechanisms that may underlie the anomalous hydraulic behaviour of certain Norwegian soils and raise new questions on the formation of marine clays, improving the knowledge available for land managers and modellers.

 

How to cite: Nemes, A., Bazzocchi, P., Weiland, S., and van der Ploeg, M.: Some Norwegian soils behave differently: is it an inheritance from marine sedimentation?, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-14852, https://doi.org/10.5194/egusphere-egu26-14852, 2026.

EGU26-16619 | Orals | EOS4.4

The unknown knowns – the inconvenient knowledge in hydrogeology we do not like to use 

Okke Batelaan, Joost Herweijer, Steven Young, and Phil Hayes

“It is in the tentative stage that the affections enter with their blinding influence. Love was long since represented as blind…The moment one has offered an original explanation for a phenomenon which seems satisfactory, that moment affection for his intellectual child springs into existence…To guard against this, the method of multiple working hypotheses is urged. … The effort is to bring up into view every rational explanation of new phenomena, and to develop every tenable hypothesis respecting their cause and history. The investigator thus becomes the parent of a family of hypotheses: and, by his parental relation to all, he is forbidden to fasten his affections unduly upon any one” (Chamberlin, 1890).

The MADE (macro-dispersion experiment) natural-gradient tracer field experiments were conducted more than 35 years ago. They aimed to determine field-scale dispersion parameters, based on detailed hydraulic conductivity measurements, to support transport simulation. A decade of field experiments produced a 30-year paper trail of modelling studies with no clear resolution of a successful simulation approach for practical use in transport problems. As a result, accurately simulating contaminant transport in the subsurface remains a formidable challenge in hydrogeology.

What went awry, and why do we often miss the mark?

Herweijer et al. (2026) conducted a ‘back to basics’ review of the original MADE reports and concluded that there are significant inconvenient and unexplored issues that influenced the migration of the tracer plume and/or biased observations. These issues include unreliable measurement of hydraulic conductivity, biased tracer concentrations, and underestimation of sedimentological heterogeneity and non-stationarity of the flow field. Many studies simulating the tracer plumes appeared to have ignored, sidestepped, or been unaware of these issues, raising doubts about the validity of the results.

Our analysis shows that there is a persistent drive among researchers to conceptually oversimplify natural complexity to enable testing of single-method modelling, mostly driven by parametric stochastic approaches. Researchers tend to be anchored to a specialised, numerically driven methodology and have difficulty in unearthing highly relevant information from ‘unknown known’ data or applying approaches outside their own specialised scientific sub-discipline. Another important aspect of these ‘unknown knowns’ is the tendency to accept published data verbatim. Too often, there is no rigorous investigation of the original measurement methods and reporting, and, if need be, no additional testing to examine the root cause of data issues.

Following the good old advice of Chamberlin (1890), we used a knowledge framework to systematically assess knowns, unknowns, and associated confidence levels, yielding a set of multi-conceptual models. Based on identified 'unknowns', these multi-models can be tested against reliable 'knowns' such as piezometric data and mass balance calculations.  

Chamberlin, T.C., 1890, The method of multiple working hypotheses. Science 15(366): 92-96. doi:10.1126/science.ns-15.366.92.

Herweijer J.C., S. C Young, P. Hayes, and O. Batelaan, 2026, A multi-conceptual model approach to untangling the MADE experiment, Accepted for Publication in Groundwater.

How to cite: Batelaan, O., Herweijer, J., Young, S., and Hayes, P.: The unknown knowns – the inconvenient knowledge in hydrogeology we do not like to use, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-16619, https://doi.org/10.5194/egusphere-egu26-16619, 2026.

EGU26-17373 | Posters on site | EOS4.4

The Hidden Propagator: How Free-Slip Boundaries Corrupt 3D Simulations 

Laetitia Le Pourhiet

Free-slip boundary conditions are routinely used in 3D geodynamic modelling because they reduce computational cost, avoid artificial shear zones at domain edges, and simplify the implementation of large-scale kinematic forcing. However, despite their apparent neutrality, our experiments show that free-slip boundaries systematically generate first-order artefacts that propagate deep into the model interior and can severely distort the interpretation of continental rifting simulations.

Here we present a set of 3D visco-plastic models inspired by the South China Sea (SCS) that were originally designed to study the effect of steady-state thermal inheritance and pluton-controlled crustal weakening. Unexpectedly, in all simulations except those with a very particular inverted rheological profile (POLC), the free-slip boundary on the “Vietnam side” of the domain generated a persistent secondary propagator, producing unrealistic amounts of lithospheric thinning in the southwest corner. This artefact appeared irrespective of crustal rheology, seeding strategy, or the presence of thermal heterogeneities.

We identify three systematic behaviours induced by free-slip boundaries in 3D:
(1) forced rift nucleation at boundary-adjacent thermal gradients,
(2) artificial propagator formation that competes with the intended first-order rifting, and
(3) rotation or shearing of micro-blocks not predicted by tectonic reconstructions.

These artefacts originate from the inability of free-slip boundaries to transmit shear traction, which artificially channels deformation parallel to the boundary when lateral thermal or mechanical contrasts exist. In 3D, unlike in 2D, the combination of oblique extension and boundary-parallel velocity freedom leads to emergent pseudo-transform behaviour that is entirely numerical.
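
For reference, the standard free-slip conditions on a boundary with outward normal n are (a textbook statement, independent of the specific models shown here):

```latex
\mathbf{v}\cdot\mathbf{n} = 0,
\qquad
\boldsymbol{\sigma}\mathbf{n} - \left[(\boldsymbol{\sigma}\mathbf{n})\cdot\mathbf{n}\right]\mathbf{n} = \mathbf{0},
```

i.e., no flow through the boundary and zero tangential traction. The second condition is precisely what prevents the boundary from transmitting shear, so any boundary-parallel contrast in buoyancy or strength can be accommodated by unresisted boundary-parallel flow.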

Our results highlight a key negative outcome: free-slip boundaries cannot be assumed neutral in 3D rift models, especially when studying localisation, obliquity, multi-propagator dynamics, or the competition between structural and thermal inheritance. We argue that many published 3D rift models may unknowingly include such artefacts.

How to cite: Le Pourhiet, L.: The Hidden Propagator: How Free-Slip Boundaries Corrupt 3D Simulations, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-17373, https://doi.org/10.5194/egusphere-egu26-17373, 2026.

EGU26-18600 | Posters on site | EOS4.4

Data Disaster to Data Resilience: Lessons from CEDA’s Data Recovery  

Edward Williamson, Matt Pritchard, Alan Iwi, Sam Pepler, and Graham Parton

On 18 November 2025, a small error during an internal data migration between storage systems of the JASMIN data analysis platform in the UK made a substantial part of the CEDA Archive temporarily unavailable online (but not lost!). While the incident caused serious disruption to a large community of users (and additional workload and stress for the team), it provided important learning points in terms of:

  • enhancing data security,
  • the importance of mutual support among professional colleagues,
  • the value of clear and transparent communication with users, and
  • a unique opportunity to showcase the capabilities of a cutting-edge digital research infrastructure in the recovery and return to service during this “unscheduled disaster recovery exercise”.

We report on the circumstances leading to the incident, the lessons learned, and the technical capabilities employed in the recovery. In one example, nearly 800 terabytes of data were transferred from a partner institution in the USA in just over 27 hours, at a rate of over 8 gigabytes per second, using Globus. The ability to orchestrate such a transfer is the result of many years of international collaboration to support large-scale environmental science, and highlights the benefits of a federated, replicated data infrastructure built on well-engineered technologies.
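
As a quick plausibility check of the quoted figures:

    \[ \frac{800\ \mathrm{TB}}{27\ \mathrm{h}} \approx \frac{800{,}000\ \mathrm{GB}}{97{,}200\ \mathrm{s}} \approx 8.2\ \mathrm{GB\,s^{-1}} \]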

How to cite: Williamson, E., Pritchard, M., Iwi, A., Pepler, S., and Parton, G.: Data Disaster to Data Resilience: Lessons from CEDA’s Data Recovery , EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-18600, https://doi.org/10.5194/egusphere-egu26-18600, 2026.

EGU26-19755 | ECS | Posters on site | EOS4.4

Opposite cloud responses to extreme Arctic pollution: sensitivity to cloud microphysics, or a bug? 

Rémy Lapere, Ruth Price, Louis Marelle, Lucas Bastien, and Jennie Thomas

Aerosol-cloud interactions remain one of the largest uncertainties in global climate modelling. This uncertainty arises because of the dependence of aerosol-cloud interactions on many tightly coupled atmospheric processes; the non-linear response of clouds to aerosol perturbations across different regimes; and the challenge of extracting robust signals from noisy meteorological observations. The problem is particularly acute in the Arctic, where sparse observational coverage limits model constraints, pristine conditions can lead to unexpected behaviour, and key processes remain poorly understood.

A common way to tackle the challenge of uncertainties arising from aerosol-cloud interactions in climate simulations is to conduct sensitivity experiments using cloud and aerosol microphysics schemes based on different assumptions and parameterisations. By comparing these experiments, key results can be constrained by sampling the range of unavoidable structural uncertainties in the models. Here, we apply this approach to a case study of an extreme, polluted warm air mass in the Arctic that was measured during the MOSAiC Arctic expedition in 2020. We simulated the event in the WRF-Chem-Polar regional climate model both with and without the anthropogenic aerosols from the strong pollution event to study the response of clouds and the surface radiative balance. To understand the sensitivity of our results to the choice of model configuration, we tested two distinct, widely used cloud microphysics schemes.

Initial results showed that the two schemes simulated opposite cloud responses: one predicted a surface cooling from the pollution that was reasonably in line with our expectations of the event, while the other predicted the opposite cloud response and an associated surface warming. These opposing effects seemed to suggest that the structural uncertainties in the two schemes relating to clean Arctic conditions were so strong that they even obscured our ability to determine the overall sign of the surface radiative response to the pollution.

However, since significant model development was required to couple these two cloud microphysics schemes to the aerosol fields in our model, there was another explanation that we couldn’t rule out: a bug in the scheme that was producing the more unexpected results. In this talk, we will explore the challenges of simulating the Arctic climate with a state-of-the-art chemistry-climate model and highlight how examples like this underscore the value of our recent efforts to align our collaborative model development with software engineering principles and Open Science best practices.

How to cite: Lapere, R., Price, R., Marelle, L., Bastien, L., and Thomas, J.: Opposite cloud responses to extreme Arctic pollution: sensitivity to cloud microphysics, or a bug?, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-19755, https://doi.org/10.5194/egusphere-egu26-19755, 2026.

All statistical tools come with assumptions. Yet many scientists treat statistics like a collection of black-box methods without learning the assumptions. Here I illustrate this problem using dozens of studies that claim to show that solar variability is a dominant driver of climate. I find that linear regression approaches are widely misused among these studies. In particular, they often violate the assumption of ‘no autocorrelation’ of the time series used, though it is common for studies to violate several or all of the assumptions of linear regression. The misuse of statistical tools has been a common problem across all fields of science for decades. This presentation serves as an important cautionary tale for the Earth Sciences and highlights the need for better statistical education and for statistical software that automatically checks input data for assumptions.
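
By way of illustration (not code from the study), a minimal Python sketch of the kind of assumption check advocated here, using the statsmodels library on simulated data:

    import numpy as np
    import statsmodels.api as sm
    from statsmodels.stats.stattools import durbin_watson

    # Simulated example: regress a synthetic 'climate' series on a synthetic
    # 'solar' series whose noise is strongly autocorrelated (AR(1) residuals).
    rng = np.random.default_rng(0)
    n = 200
    solar = np.sin(np.linspace(0.0, 20.0, n)) + rng.normal(0.0, 0.1, n)
    noise = np.zeros(n)
    for t in range(1, n):
        noise[t] = 0.8 * noise[t - 1] + rng.normal(0.0, 0.3)
    climate = 0.2 * solar + noise

    fit = sm.OLS(climate, sm.add_constant(solar)).fit()
    dw = durbin_watson(fit.resid)  # ~2 if residuals are uncorrelated
    print(f"Durbin-Watson statistic: {dw:.2f}")  # values well below 2 flag a violation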

How to cite: Steiger, N.: Pervasive violation of statistical assumptions in studies linking solar variability to climate, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-19776, https://doi.org/10.5194/egusphere-egu26-19776, 2026.

EGU26-20122 | ECS | Posters on site | EOS4.4

Developing Matrix-Matched Empirical Calibrations for EDXRF Analysis of Peat-Alternative Growth Media 

Thulani De Silva, Carmela Tupaz, Maame Croffie, Karen Daly, Michael Gaffney, Michael Stock, and Eoghan Corbett

A key reason for the widespread use of peat-based growth media in horticulture is their reliable nutrient availability when supplemented with fertilisers. However, due to environmental concerns over continued peat extraction and use, peat alternatives (e.g., coir, wood fibre, composted bark, biochar) are increasingly being used commercially. These alternative media often blend multiple materials, making it crucial to understand the elemental composition of, and nutrient interactions between, components. This study evaluates whether benchtop Energy Dispersive X-ray Fluorescence (EDXRF) can provide a rapid method for determining the elemental composition of peat-alternative components.

Representative growing media components (peat, coir, wood fibre, composted bark, biochar, horticultural lime, perlite, slow-release fertilisers, and trace-element fertiliser) were blended in different ratios to generate industry-representative mixes. Individual components and prepared mixes were dried and milled to ≤80 μm. An industry-representative mix (QC-50: 50% peat, 30% wood fibre, 10% composted bark, 10% coir, with fertiliser and lime additions) and 100% peat were analysed by EDXRF (Rigaku NEX-CG) for P, K, Mg, Ca, S, Fe, Mn, Zn, Cu and Mo, and compared against ICP-OES reference measurements. The instrument’s fundamental parameters (FP) method using a plant-based organic materials library showed large discrepancies relative to ICP-OES (relative differences: 268–390 084%) for most elements in both QC-50 and peat, with the exception of Ca in QC-50 (11%). These results confirm that the FP approach combined with loose-powder preparation is unsuitable for accurate elemental analysis of organic growing media.
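
The relative differences quoted above are assumed here to follow the conventional definition, with ICP-OES as the reference:

    \[ \Delta_{\mathrm{rel}}\ (\%) = \frac{\lvert C_{\mathrm{EDXRF}} - C_{\mathrm{ICP\text{-}OES}} \rvert}{C_{\mathrm{ICP\text{-}OES}}} \times 100 \]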

An empirical calibration was subsequently developed using 18 matrix-matched standards (CRMs, in-house growing media and individual component standards). Matrix matching is challenging because mixes are mostly organic by volume, yet variable inorganic amendments (e.g., lime, fertilisers, and sometimes perlite) can strongly influence XRF absorption/enhancement effects. Calibration performance was optimised iteratively using QC-50 as the validation sample, until relative differences were <15% for all elements. When applied to 100% peat, agreement with ICP-OES results improved substantially for some macro-elements (e.g. Mg 10%, Ca 1%, S 19%) but remained poor for most trace elements (28–96%), demonstrating limited transferability of this calibration method across different elements and matrices tested.

Overall, these results demonstrate that loose powder preparation does not provide sufficiently robust accuracy for EDXRF analysis of organic growing media even with meticulous empirical matrix-matched calibration. We are therefore developing a pressed pellet method using a low-cost wax binder to improve sample homogeneity (packing density) and calibration transferability. Twenty unknown mixes will be analysed using both loose powder and pressed-pellet calibrations, and agreement with reference data (ICP-OES) will confirm method validation, supporting the development of EDXRF as a novel approach for growing media analysis.

How to cite: De Silva, T., Tupaz, C., Croffie, M., Daly, K., Gaffney, M., Stock, M., and Corbett, E.: Developing Matrix-Matched Empirical Calibrations for EDXRF Analysis of Peat-Alternative Growth Media, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-20122, https://doi.org/10.5194/egusphere-egu26-20122, 2026.

EGU26-20375 | ECS | Posters on site | EOS4.4

From Field to File: challenges and recommendations for handling hydrological data 

Karin Bremer, Maria Staudinger, Jan Seibert, and Ilja van Meerveld

In catchment hydrology, long-term data collection often starts as part of a (doctoral) research project. In some cases, the data collection continues on a limited budget, often using the field protocol and data management plan designed for the initial short-term project. Challenges and issues with the continued data collection are likely to arise, especially when there are multiple changes in the people involved. Researchers who were not directly involved in the fieldwork find it especially difficult to understand the data and must therefore rely on field notes and archived data. They then often encounter issues related to inconsistent metadata, such as inconsistent date-time formats and inconsistent or missing units, missing calibration files, and unclear organization of files and processing scripts.

While the specific issues may sound very case-dependent, based on our own and others’ experiences from various research projects, it appears that many issues recur more frequently than one might expect (or be willing to admit). In this presentation, we will share our experiences with bringing spatially distributed groundwater level data collected in Sweden and Switzerland from the field to ready-to-use files. Additionally, we provide recommendations for overcoming the challenges during field data collection, data organization, documentation, and data processing using scripts. These include having a clear, detailed protocol for the fieldwork and the data-processing steps, and ensuring it is followed. Although protocols are often used, they are frequently not detailed enough or are not used as designed. The protocols might also not take into account the further use of the data, such as for hydrological modelling, beyond field collection.

How to cite: Bremer, K., Staudinger, M., Seibert, J., and van Meerveld, I.: From Field to File: challenges and recommendations for handling hydrological data, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-20375, https://doi.org/10.5194/egusphere-egu26-20375, 2026.

In 2014 we developed the Wageningen Lowland Runoff Simulator (WALRUS), a conceptual rainfall-runoff model for catchments with shallow groundwater. Water managers and consultants were involved in model development. In addition, they sponsored the steps necessary for application: making an R package, user manual and tutorial, publishing these on GitHub and organising user days. WALRUS is now used operationally by several Dutch water authorities and for scientific studies in the Netherlands and abroad. When developing the model, we made certain design choices. Now, after twelve years of application in water management, science and education, we re-evaluate the consequences of those choices.

The lessons can be divided into things we learned about the model’s functioning and things we learned from how people use the model. Concerning the model’s functioning, we found that keeping the model representation close to reality has advantages and disadvantages. It makes it easy to understand what happens and why, but it also causes unrealistic expectations. Certain physically based relations hampered model performance because they contained thresholds, and deriving parameter values from field observations resulted in uncertainty and discussions about spatial representativeness.

Concerning the practical use, we found that the easy-to-use, open-source R package with manual was indispensable for new users. Nearly all users preferred the default options over the user-defined functions implemented to allow tailor-made solutions. Parameter calibration was more difficult than expected because the feedbacks necessary to simulate the hydrological processes in lowlands increase the risk of equifinality. In addition, the lack of suitable discharge data for calibration prompted requests for default parameter values. Finally, the model was subject to unintended use, sometimes violating basic assumptions and sometimes revealing unique opportunities we had not thought of ourselves.

C.C. Brauer, A.J. Teuling, P.J.J.F. Torfs, R. Uijlenhoet (2014): The Wageningen Lowland Runoff Simulator (WALRUS): a lumped rainfall-runoff model for catchments with shallow groundwater, Geosci. Model Dev., 7, 2313-2332, doi:10.5194/gmd-7-2313-2014

How to cite: Brauer, C.: Re-evaluating the WALRUS rainfall-runoff model design after twelve years of application, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-21915, https://doi.org/10.5194/egusphere-egu26-21915, 2026.

EGU26-2847 | Posters on site | EOS2.4

Validating Research Data Management Core Competencies: A survey of US data librarianship current practices to inform the curricula 

Wade Bishop, Angela Murillo, Ayoung Yoon, and Alex Chassanoff

In the context of massive datasets across disciplines, US higher education institutions provide research data services in their academic libraries and elsewhere on campuses. The core competencies required to perform these emerging occupations have been developed through an extensive literature review and focus groups. This presentation will provide results from a survey validation study of current professionals to validate core competencies for research data management (RDM). The sampling frame comprises data managers, stewards, curators, and related professionals from a variety of communities, including Association of Research Libraries (ARL) institutions, the International Association for Social Science Information Services and Technology (IASSIST), the Research Data Alliance (RDA), the Committee on Data (CODATA), the Research Data Access and Preservation Association (RDAP), Earth Science Information Partners (ESIP), and others. Although US-focused, the survey findings can help determine the most important core competencies to include in any RDM curriculum. The curricula resulting from the survey validation are delivered in US information schools (iSchools), but the lessons learned could inform curricula in any domain and address the gap in earth and environmental science education.

How to cite: Bishop, W., Murillo, A., Yoon, A., and Chassanoff, A.: Validating Research Data Management Core Competencies: A survey of US data librarianship current practices to inform the curricula, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-2847, https://doi.org/10.5194/egusphere-egu26-2847, 2026.

Each summer, the Institute for Geospatial Understanding through an Integrative Discovery Environment (I-GUIDE) project, funded by NSF’s Harnessing the Data Revolution initiative, organizes a Summer School. I-GUIDE’s vision is to “Drive digital discovery and innovation by harnessing the geospatial data revolution.”

The I-GUIDE Summer School is a gathering of graduate students, post-doctoral researchers, and early-career scholars who go on a week-long intellectual journey. It is not just an event; it is a convergence of minds, ideas, and cutting-edge methodologies to shape the future of geospatial understanding. The Summer School champions the spirit of Geospatial Convergence Science, leveraging AI, and is rooted in the belief that some of the most pressing societal challenges demand a collaborative, multidisciplinary approach.

I-GUIDE has thus far conducted three highly successful Summer Schools with themes Convergence Science in Action, Leveraging AI for Environmental Sustainability, and Spatial AI for Extreme Events and Disaster Resilience. The three Summer Schools were held at the University Corporation for Atmospheric Research facilities in Boulder, CO, and they share a few common key features:

  • Convergence Science in Action: Participants navigate the intersection of various disciplines, strategically integrating knowledge, tools, and modes of thinking. The program emphasizes collaborative and professional interactions, fostering an environment where participants learn to work comprehensively on convergence science problems.
  • Interactive Learning: Participants engage in a week-long immersive experience, collaborating with I-GUIDE members to develop novel solutions to computation- or data-intensive geospatial data science challenges. They delve into geoethics, geo-enabling reproducible and open science, geovisualization, and the latest in geoAI via cloud and high-performance computing.
  • Diverse Application Areas: Each year, the participants address critical topics such as climate change, biodiversity, water security, sustainable development, changes in wildland-urban interface, social science data and ethical implications.
  • Integration of Ethics: Ethical considerations are woven throughout the program, including collection bias and limitations, missing perspectives, assumptions of homogeneity, and unintended uses.
  • Independent External Evaluation: Surveys, focus-group interviews, and other evaluation tools capture participant feedback, improving learning outcomes through continuous evaluation and refinement.
  • Ongoing Engagement: Participants continue to stay engaged with the I-GUIDE project by participating in various events and activities, including attending and presenting at the I-GUIDE forum and giving talks to the broader community via the Virtual Consulting Office.

In this presentation, we will provide an overview of the Summer Schools, along with relevant highlights, key outcomes, and the lessons learned. We will discuss the geospatial, computational and AI/machine learning, and collaborative working skills the participants learn and apply to work on the projects, along with the incentives I-GUIDE provides for the participants’ success.

How to cite: Ramamurthy, M.: The I-GUIDE Summer School: An annual learning experience that promotes geospatial convergence science and AI to tackle complex scientific and societal challenges, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-8543, https://doi.org/10.5194/egusphere-egu26-8543, 2026.

EGU26-12247 | Posters on site | EOS2.4 | Highlight

Building Foundational Climate Data Skills Through Hands-On Training with ESA-CCI's Essential Climate Variables 

Amina Maroini, Lisa Beck, Sarah Connors, Tonio Fincke, and Eduardo Pechorro

Understanding climate change relies on sustained observations of Essential Climate Variables (ECVs), as defined by the Global Climate Observing System (GCOS). As access to ECVs has expanded in scope and duration, users are increasingly confronted with the complexity of these datasets, including longer time series, different data structures, multiple product versions, and uncertainty estimates. 

To remove common technical barriers, such as installing software and coding libraries or locating and downloading large datasets, the European Space Agency’s Climate Change Initiative (ESA-CCI) developed a cloud-based, pre-configured JupyterLab environment designed to allow learners to begin working with satellite-derived ESA-CCI climate data within minutes.

This pre-configured JupyterLab environment supports users by integrating simplified access to decades-long global records of the 27 satellite-derived ESA-CCI ECVs into the ESA CCI Toolbox, a dedicated Python package specifically designed for ESA-CCI data that provides ready-to-use functions, allowing users to focus on visualising and analysing climate signals rather than writing custom code from scratch. 
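
As an illustration only, here is a Python sketch of the kind of analysis such ready-to-use access enables; the helper function, file name, and variable name below are hypothetical stand-ins, not the documented ESA CCI Toolbox API:

    import xarray as xr  # plotting additionally requires matplotlib

    def open_ecv_subset(path, start, end):
        # Hypothetical helper standing in for the Toolbox's own data-access
        # functions, which locate and subset ESA-CCI ECV records for the user.
        return xr.open_dataset(path).sel(time=slice(start, end))

    sst = open_ecv_subset("sst_cci_subset.nc", "2000-01-01", "2010-12-31")
    monthly = sst["analysed_sst"].resample(time="1MS").mean()  # monthly means
    monthly.mean(dim=("lat", "lon")).plot()  # global-mean SST time series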

We present this environment as the foundation for a series of training events that have successfully engaged diverse audiences, including students, early-career researchers, and non-specialist stakeholders [1]. Through guided notebooks that walk learners through accessing ESA-CCI data, filtering and aggregating variables, visualising spatial and temporal patterns, and exploring uncertainties and data quality flags, learners gain hands-on, reproducible climate data analysis experience while deepening their understanding of the significance of satellite-derived ECVs and their role in monitoring and interpreting climate change. Our presentation will give conference participants the opportunity to explore the JupyterLab environment during the PICO session.

[1] https://climate.esa.int/en/climate-change-initiative-training/training-sessions/

How to cite: Maroini, A., Beck, L., Connors, S., Fincke, T., and Pechorro, E.: Building Foundational Climate Data Skills Through Hands-On Training with ESA-CCI's Essential Climate Variables, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-12247, https://doi.org/10.5194/egusphere-egu26-12247, 2026.

EGU26-12625 | ECS | Posters on site | EOS2.4

Improving the Usability and Adoption of Digital Data Solutions: An Example for Researchers in the Digital Waters Flagship and PhD Pilot  

Mohammad Imangholiloo, Elizabeth Carter, and Ville Mäkinen

Geospatial data are increasingly available openly online, often accessible in multiple ways, including via web application programming interfaces (APIs) standardised by the Open Geospatial Consortium (OGC). However, researchers often continue to rely primarily on manually downloading datasets to their laptops for their daily research activities. This workflow has several disadvantages. For example, if the input data are updated often, making sure that all the researchers working on a topic have exactly the same dataset available is a manual and error-prone process. Web APIs could help in such cases but require IT knowledge that many subject-matter experts may lack.

To address this challenge, we developed a set of Jupyter Notebook examples designed to support researchers in accessing, exploring, and analyzing geospatial data from APIs in both virtual and local computing environments. The notebooks demonstrate and compare multiple approaches for directly accessing vector, raster, and point cloud data, as well as associated metadata records. We test the notebooks in a course for PhD students related to the Digital Waters Flagship, funded by the Research Council of Finland, and evaluate their effectiveness using a questionnaire for the course participants.
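
For instance, a notebook cell retrieving vector features via the OGC API - Features standard might look like the following sketch (the service URL and collection name are placeholders, not an actual endpoint):

    import geopandas as gpd
    import requests

    # Placeholder endpoint; substitute a real OGC API - Features service.
    ITEMS_URL = "https://example.org/ogcapi/collections/rivers/items"

    resp = requests.get(
        ITEMS_URL,
        params={"bbox": "24.5,60.1,25.5,60.5", "limit": 1000, "f": "json"},
    )
    resp.raise_for_status()
    gdf = gpd.GeoDataFrame.from_features(resp.json()["features"], crs="EPSG:4326")
    print(gdf.head())  # features are ready for analysis, with no manual download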

With the proposed approach, we aim to lower technical barriers and facilitate the integration of distributed data into existing research workflows. Ultimately, these practices can support the creation of digital twins of water resources and contribute to intelligent and sustainable water management. 

Keywords: geospatial data, data infrastructures, Jupyter notebook, data space, technical barriers 

How to cite: Imangholiloo, M., Carter, E., and Mäkinen, V.: Improving the Usability and Adoption of Digital Data Solutions: An Example for Researchers in the Digital Waters Flagship and PhD Pilot , EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-12625, https://doi.org/10.5194/egusphere-egu26-12625, 2026.

EGU26-14682 | Posters on site | EOS2.4

Motivations, goals and design of new interdisciplinary Computer Science and Geology degrees at the bachelor’s and master’s levels 

Elizabeth H. Madden, Kimberly Blisniuk, Emmanuel Gabet, and Genya Ishigaki

Today’s geoscience challenges and opportunities, such as those associated with environmental health, energy production, mineral extraction, fresh water and natural hazards, demand that public employees, private sector workers and researchers have skills across the fields of geology, geophysics and computer science. In addition, the integration of computing methods into global culture underscores the need to train professionals who ask key questions and make informed decisions about their best uses. In the context of the geosciences, it is critical that people with an understanding of the science manage how computing methods are used to select, store, analyze and organize data, create digital public interfaces, and run models. While challenging, this provides opportunities to expand and renew geoscience education in order to promote its relevance into the future. In light of this, San José State University (SJSU) in San José, California, USA, is launching a new bachelor’s degree titled ‘Computer Science and Geology’ and a new master’s degree titled ‘Computational Geoscience’ aimed at training students in both geoscience topics and computer science skills.

We have designed these programs to provide an integrated educational experience in quantitative methods, computer programming and the gathering, analysis, storage and sustainable management of large environmental, geological, and geophysical data sets. The degrees at both educational levels draw on an array of courses and broad faculty expertise in the separate departments of Computer Science and Geology at SJSU, spanning data analysis, machine learning, artificial intelligence, geological and geophysical modeling across a range of geoscience topics, and natural hazards assessment. These degrees aim to equip students with applied skills to meet a growing workforce demand, and also to ensure that this workforce recognizes the possibilities, limitations and dangers of computing tools and methods. The presence of SJSU in the heart of Silicon Valley, SJSU’s role in the U.S. university system as a primarily undergraduate-serving institution, and the success of SJSU at transforming students’ lives through career advancement make it a fitting place to launch these interdisciplinary degree programs. Through this presentation, we also hope to learn more about the best practices and challenges of initiatives and programs at other universities, to help guide the development of these degrees and best meet the needs of students and the future research, public service and private sector workforces.

How to cite: Madden, E. H., Blisniuk, K., Gabet, E., and Ishigaki, G.: Motivations, goals and design of new interdisciplinary Computer Science and Geology degrees at the bachelor’s and master’s levels, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-14682, https://doi.org/10.5194/egusphere-egu26-14682, 2026.

EGU26-16396 | ECS | Posters on site | EOS2.4

Co-developing research-assisting AI for water resources professionals: the Digital Waters Flagship’s digital methods course  

Elizabeth Carter, Mehrdad Rostami, Elsa Culler, Omer Abubaker, Mohammad Imangholiloo, Mia Pihlajamäki, Maija Taka, Harri Koivusalo, Pertti Alo-Aho, Hannu Martilla, Mehdi Rasti, Pyry Kettunen, Marko Keskinen, Ville Mäkinen, Juha Oksanen, Petteri Alho, and Björn Klöve

The accelerating complexity of global water challenges—driven by hydrologic intensification, a growing and urbanizing population, and proliferation of observational data—demands a new generation of water‑domain researchers who are both computationally fluent and capable of critically integrating artificial intelligence (AI) into scientific workflows. Yet, most geoscience doctoral programs provide limited training in open, reproducible computational methods, and generic AI tools often underperform in specialized environmental domains while lacking transparent attribution of sources. To address these gaps, the Digital Waters Flagship initiative designed and implemented an innovative doctoral‑level course that integrates open‑science software training with student‑driven co‑development of a domain‑adapted large‑language model (LLM) for hydrologic research assistance.

The course employs a flipped‑classroom model within the Digital Waters Virtual Research Environment (VRE), where students learn standardized, reproducible workflows using a repository structure composed of six core elements spanning data access, processing, modeling, visualization, and computational environments. Exceptional student repositories are publicly disseminated as open digital water use cases. In parallel, doctoral researchers participate in the co‑design of a hydrology‑focused research chatbot, DIWA ReChat, which is trained on authentic student‑generated workflow components and equipped with automatic knowledge‑source attribution to ensure transparency and proper crediting of contributions.

Course outcomes are evaluated through (1) pre‑/post‑assessment of computational competency, (2) evidence of improved reproducibility enabled by shared VRE infrastructure, and (3) empirical improvements in domain‑adapted LLM performance based on both conventional accuracy metrics and student‑designed AI efficacy criteria. Together, the course and chatbot development process demonstrate a scalable model for integrating open‑science education with responsible, domain‑aware AI tool creation. This work highlights a pathway for cultivating computationally capable researchers who can both leverage and critically evaluate AI in support of robust, transparent hydrologic science.

How to cite: Carter, E., Rostami, M., Culler, E., Abubaker, O., Imangholiloo, M., Pihlajamäki, M., Taka, M., Koivusalo, H., Alo-Aho, P., Martilla, H., Rasti, M., Kettunen, P., Keskinen, M., Mäkinen, V., Oksanen, J., Alho, P., and Klöve, B.: Co-developing research-assisting AI for water resources professionals: the Digital Waters Flagship’s digital methods course , EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-16396, https://doi.org/10.5194/egusphere-egu26-16396, 2026.

EGU26-19756 | Posters on site | EOS2.4

Introduction to Python for Geographic Data Analysis: A new, open resource for teachers and learners 

David Whipp, Henrikki Tenkanen, and Vuokko Heikinheimo

Digital geoscientific and geospatial datasets are rapidly growing in both number and size. These data present powerful new resources for understanding the evolution of the earth, but working with them requires computational skills that are not part of typical geoscience curricula at universities. To leverage the power of these growing geoscientific and geospatial data, students need targeted educational resources that provide basic computational skills.

The new textbook Introduction to Python for Geographic Data Analysis provides a framework for learning to work with (geospatial) datasets of varying size, from loading the data to producing interactive visualizations of processed data. Part 1 of the book covers the basics of programming using the Python language, introducing both programming concepts and their Python syntax. It also covers the analysis of tabular data using the pandas Python library and the basics of data visualization. Part 2 introduces working with geospatial data, including fundamental geospatial concepts, working with vector and raster data, geospatial data visualization, and loading data from online sources. Part 3 includes several case studies that build on the first two parts to demonstrate what can be done with the readers’ new skills. Finally, the appendices provide information about best practices in programming, version control with git and GitHub, and other practical coding tips that promote open, reproducible science.
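
By way of flavour, here is a toy example in the spirit of Part 1 (not an excerpt from the book):

    import pandas as pd

    # Tabular data handling with pandas, as introduced in Part 1.
    stations = pd.DataFrame(
        {
            "station": ["Helsinki", "Tampere", "Rovaniemi"],
            "temp_c": [5.3, 4.1, -2.7],
        }
    )
    print(stations["temp_c"].mean())       # simple aggregation
    print(stations.sort_values("temp_c"))  # sorting a table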

The book materials are freely available online at https://pythongis.org, and we anticipate that hard copies of the book will be available later in 2026. We hope the book will appeal to a broad range of “geo” scientists, including teachers who provide courses on introductory programming or data analysis for geology and geography students, those interested in learning to interact with and batch process large datasets, and those interested in finding open-source alternatives to commercial GIS software packages.

How to cite: Whipp, D., Tenkanen, H., and Heikinheimo, V.: Introduction to Python for Geographic Data Analysis: A new, open resource for teachers and learners, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-19756, https://doi.org/10.5194/egusphere-egu26-19756, 2026.

EGU26-20494 | ECS | Posters on site | EOS2.4

Textbook and code: AI for climate scientists 

Simon Driscoll, Kieran Hunt, Laura Mansfield, Ranjini Swaminathan, Hong Wei, Eviatar Bach, and Alison Peard

We introduce a textbook for climate modellers and scientists seeking to learn AI.

Weather and Climate: Applications of Machine Learning and Artificial Intelligence provides a comprehensive exploration of machine learning in the context of weather forecasting and climate research. The authors begin with an introduction to the fundamentals and statistical tools of machine learning, followed by an overview of various machine learning models. Emulation and machine learning of sub-grid scale parametrizations are discussed, along with the application of AI/ML in weather forecasting and climate models. Next, the book delves into the concept of explainable AI (XAI) methods for understanding ML and AI models, as well as the use of generative AI in weather and climate research. It explores the interface of data assimilation and machine learning for weather forecasting, showcasing case studies of machine learning applied to environmental monitoring data. The book concludes by looking ahead to the future of ML and AI in climate and weather-related research, providing references for further reading. This comprehensive guide offers valuable insights into the intersection of machine learning, artificial intelligence, and atmospheric science, highlighting the potential for innovation and advancement in weather and climate research.
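
To give a flavour of the emulation topic the book covers, here is a toy sketch on purely synthetic data (not material from the book), in which a regression model learns to reproduce a hypothetical sub-grid parametrization:

    import numpy as np
    from sklearn.ensemble import RandomForestRegressor

    # Learn the mapping from resolved-scale inputs to a parametrized
    # tendency (all data synthetic; columns loosely stand for T, q, shear).
    rng = np.random.default_rng(1)
    X = rng.uniform(-1.0, 1.0, size=(5000, 3))
    y = np.sin(3.0 * X[:, 0]) * X[:, 1] + 0.1 * X[:, 2]  # "true" scheme
    emulator = RandomForestRegressor(n_estimators=100, random_state=0)
    emulator.fit(X[:4000], y[:4000])
    print("held-out R^2:", emulator.score(X[4000:], y[4000:]))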

How to cite: Driscoll, S., Hunt, K., Mansfield, L., Swaminathan, R., Wei, H., Bach, E., and Peard, A.: Textbook and code: AI for climate scientists, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-20494, https://doi.org/10.5194/egusphere-egu26-20494, 2026.

ESSI4 – Advanced Technologies and Informatics Enabling Transdisciplinary Science

Accurate estimation of Origin-Destination (O-D) matrices is fundamental to effective transportation planning. Conventional approaches based on the four-step travel demand model are often time-consuming, data-intensive, and costly, primarily due to their reliance on extensive demographic and socio-economic data. Integrating Remote Sensing (RS), Geographic Information Systems (GIS), and the Global Positioning System (GPS) offers a more efficient and spatially explicit framework for travel demand analysis. This study presents an approach for estimating O-D matrices by establishing a relationship between land-use characteristics and traffic demand. High-resolution CARTOSAT-1 satellite imagery was used to generate updated ward-wise land-use maps for Tiruchirappalli city, Tamil Nadu, India, in the absence of recent land-use data. Using GIS-based spatial analysis, land-use categories were quantified and linked to trip generation and trip attraction patterns across sixty wards. Trip production and attraction were estimated from residential and non-residential land-use proportions, and these estimates were incorporated into a base-year O-D matrix to derive an updated matrix. The resulting O-D matrix was validated through link-level traffic volume comparisons on selected critical road segments. The findings demonstrate that wards with higher residential land use exhibit greater trip production, while wards dominated by commercial, educational, industrial and public land uses show higher trip attraction. The study highlights the effectiveness of integrating 3S technology in simplifying O-D matrix estimation, reducing data requirements, and supporting cost-effective and reliable urban transportation planning.
Keywords: Land use; Travel demand modelling; O-D matrix; Trip generation
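
A minimal numerical sketch of the production/attraction logic described above (all numbers are hypothetical; the study's actual zoning, land-use data, and validation are not reproduced here):

    import numpy as np

    # Trip production proportional to residential area, attraction to
    # non-residential area (three illustrative wards, areas in hectares).
    residential = np.array([120.0, 80.0, 40.0])
    non_residential = np.array([30.0, 90.0, 60.0])
    total_trips = 10_000.0

    P = total_trips * residential / residential.sum()          # productions
    A = total_trips * non_residential / non_residential.sum()  # attractions

    # Gravity-type seed with a simple distance-decay impedance, balanced to
    # match P and A by iterative proportional fitting (Furness method).
    cost = np.array([[1.0, 2.0, 3.0], [2.0, 1.0, 2.0], [3.0, 2.0, 1.0]])
    od = np.outer(P, A) * np.exp(-0.3 * cost)
    for _ in range(50):
        od *= (P / od.sum(axis=1))[:, None]  # match row totals
        od *= (A / od.sum(axis=0))[None, :]  # match column totals
    print(od.round(1))  # toy O-D matrix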

How to cite: Rema, A.: Integration of Remote Sensing and GIS for Origin–Destination matrix estimation in urban areas, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-2687, https://doi.org/10.5194/egusphere-egu26-2687, 2026.

EGU26-3523 | Posters on site | ITS1.20/ESSI4.3

The ESA-NASA Joint EO Mission Quality Assessment Framework – Towards a standardized data quality assessment process 

Melissa Martin, Dana Ostrenga, Leonardo De Laurentiis, Peggy Fischer, and Philippe Goryl

The increasing availability of EO data products from a growing number of Commercial EO data providers within the New Space domain represents a great opportunity towards the implementation of a concept that has been discussed at CEOS-level for several years: a Global Earth Observation System of Systems (GEOSS).

A key element in the implementation of a GEOSS is undoubtedly data interoperability, in all its forms. This calls for new frameworks and tools to enable data quality and interoperability assessments, allowing a clear, standardized process and presentation of outputs.

On the ESA side, the development of a quality and suitability assessment framework started within the Earthnet Third Party Missions (TPM) realm, where candidate missions are assessed by the Earthnet Data Assessment Project (EDAP) team prior to integration, with a view to checking whether the missions’ stated requirements are met. The EDAP team has developed a successful reference set of guidelines, further instantiated by domain (Optical, SAR, DEM, Atmospheric Composition and others), with a view to harmonizing and standardizing data quality assessments on a per-domain basis.

The EDAP Cal/Val Maturity Matrix and framework have been presented at international forums and conferences, to great success.

On the NASA side, data quality assessments are carried out to support the integration of commercial satellite data into Earth science research and applications at NASA. The commercial data evaluation process of NASA’s Commercial Satellite Data Acquisition (CSDA) Program provides critical benefits by ensuring that all acquired datasets meet rigorous scientific standards for accuracy, reliability, and interoperability. Through comprehensive assessments of radiometric and geometric quality, validation against trusted reference data, and transparent documentation requirements, NASA ensures that commercial data can be confidently integrated into research and applications. This approach builds trust in commercial partnerships, accelerates scientific progress, reduces duplication of effort, and promotes cost efficiency by leveraging existing high-quality data. Continuous monitoring further supports long-term integrity and fosters innovation within the Earth observation community.

Within the framework of ESA-NASA international cooperation and collaboration, through joint groups and attendance at international workshops (mainly JACIE and VH-RODA), agreement has been reached on the joint development and maintenance of the EDAP framework, officially establishing the activity and the framework as an official ESA-NASA Framework. A first official signature of the ESA-NASA guidelines for the SAR domain took place in 2024, and further signatures of guidelines covering the other domains are planned over the coming years.

The aim of the ESA-NASA guidelines is to maintain an official, transparent and public framework dedicated to data quality assessments of candidate missions to both the TPM and CSDA programmes. At ESA, the guidelines are also used to carry out an operational assessment of missions within the Copernicus Contributing Missions scheme.

The presentation will focus on the joint guidelines, their usage, their main output (the Cal/Val Maturity Matrix), and future evolutions.

How to cite: Martin, M., Ostrenga, D., De Laurentiis, L., Fischer, P., and Goryl, P.: The ESA-NASA Joint EO Mission Quality Assessment Framework – Towards a standardized data quality assessment process, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-3523, https://doi.org/10.5194/egusphere-egu26-3523, 2026.

Deprived urban areas (e.g., slums and informal settlements) are characterized by substandard housing, inadequate services, and insecure tenure. They represent the physical manifestation of socioeconomic inequalities in rapidly urbanizing Low- and Middle-Income Countries (LMICs). Populations residing in these areas face compounding challenges, including elevated exposure to climate and environmental hazards (e.g., extreme heat). Yet these communities are often underrepresented in official censuses, limiting efforts to identify and reach those most in need. Earth Observation (EO) and Machine Learning (ML) offer the potential to address this gap, yet current mapping approaches produce mostly binary slum/non-slum classifications that obscure the continuous, multidimensional nature of deprivation.

This research develops a morphology-based framework for characterising urban deprivation in LMICs, using Zambia as a primary case study. Rather than training supervised models on binary slum boundaries, we leverage EO-derived urban elements including building footprints, heights, street network characteristics, and spatial arrangement patterns to compute a set of morphometrics at fine spatial resolution. Applying unsupervised ML techniques, we identify distinct morphological signatures across urban areas. To assess whether and how these signatures relate to deprivation, we integrate household-level data from accurately (~3m) geo-coded urban household surveys in Zambia in 2023 with EO imagery to examine associations between physical urban form and non-physical dimensions of deprivation, such as service access and socioeconomic status. Preliminary results will highlight which morphometrics demonstrate robust associations with socioeconomic indicators and how these relationships may vary across different urban contexts, as well as the rural-urban continuum.
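
A schematic sketch of the unsupervised step (toy metric values; the study's actual morphometrics, resolution, and clustering choices may differ):

    import numpy as np
    from sklearn.cluster import KMeans
    from sklearn.preprocessing import StandardScaler

    # Rows = grid cells; columns = example morphometrics:
    # mean footprint (m^2), mean height (m), street density (km/km^2).
    X = np.array(
        [
            [45.0, 3.0, 22.0],
            [60.0, 3.5, 18.0],
            [220.0, 9.0, 9.0],
            [250.0, 12.0, 8.5],
        ]
    )
    labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(
        StandardScaler().fit_transform(X)
    )
    print(labels)  # candidate morphological signatures, to be compared
                   # against geo-coded household survey indicators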

The framework responds to the challenge of transforming globally available EO data into locally actionable information. By producing human-interpretable morphological characterizations rather than abstract deep learning features, the approach offers greater transferability across diverse urban settings and facilitates co-creation with local stakeholders who can validate whether outputs align with their understanding of deprivation patterns on the ground.

How to cite: Luo, E. and Tuholske, C.: Characterising Urban Deprivation through Earth Observation: Linking Physical Urban Form to Socioeconomic Conditions in Zambia, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-5768, https://doi.org/10.5194/egusphere-egu26-5768, 2026.

EGU26-9398 | ECS | Orals | ITS1.20/ESSI4.3

A climate impact taxonomy operationalizing IPCC physical driver and risk concepts  

Michaela Werning, Edward Byers, Marina Andrijevic, Carl-Friedrich Schleussner, Seth Monteith, Laura Aldrete Lopez, Valentin Lemaire, Elin Matsumae, Adelle Thomas, and Alexander Nauels

The Intergovernmental Panel on Climate Change (IPCC) provides comprehensive information on the physical science of climate change in Working Group I (WGI), as well as on climate impacts, adaptation, and vulnerability in Working Group II (WGII). The breadth of information in the latest IPCC assessment report (AR6) can be difficult to navigate, in particular for end users looking for tailored outputs directly linking physical climate changes to the resulting risks for natural and human systems. While efforts have been made to facilitate the assessment of climate impacts and risks, prominent and systematically applied cross-Working Group products are still missing.

To address this gap, we have developed a climate impact taxonomy that pairs the 35 Climatic Impact-Drivers (CIDs) assessed in AR6 WGI with the eight Representative Key Risks (RKRs) identified in AR6 WGII. CIDs represent physical climate conditions that directly affect societal and ecological systems, while RKRs are clusters of key climate-related risks projected to become severe in a warming climate. Each RKR–CID combination is enriched with structured metadata describing spatial scale, type of change, temporal character, and the IPCC assessment of relevant subsystems. Additionally, the metadata include examples of identified research needs, adaptation linkages outlining illustrative responses by risk component and associated relevant targets aligned with the United Nations Framework Convention on Climate Change (UNFCCC) Global Goal on Adaptation, mitigation linkages, and critical global warming levels. References to relevant WGI and WGII chapters of IPCC AR6 and approved chapters for AR7 guide users toward the appropriate sources for further information.

By translating abstract physical climate indicators into actionable information, the climate impact taxonomy prototype—implemented as a machine-readable lookup table—supports end users, such as adaptation planners and policymakers, with more holistic impact and risk assessments.
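
A hypothetical record illustrating the lookup-table structure described above (field names follow the abstract; the values are invented placeholders, not actual taxonomy content):

    entry = {
        "rkr": "Risks to coastal socio-ecological systems",
        "cid": "Relative sea level",
        "spatial_scale": "regional",
        "type_of_change": "increase in mean",
        "temporal_character": "long-term trend",
        "ipcc_references": ["AR6 WGI Ch. 12", "AR6 WGII Ch. 16"],
        "adaptation_linkages": ["coastal protection", "managed retreat"],
        "critical_warming_levels": "placeholder",
    }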

How to cite: Werning, M., Byers, E., Andrijevic, M., Schleussner, C.-F., Monteith, S., Aldrete Lopez, L., Lemaire, V., Matsumae, E., Thomas, A., and Nauels, A.: A climate impact taxonomy operationalizing IPCC physical driver and risk concepts , EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-9398, https://doi.org/10.5194/egusphere-egu26-9398, 2026.

EGU26-10771 | Posters on site | ITS1.20/ESSI4.3

Direct and harmonised access to Essential Climate Variable related in-situ observation data from SeaDataNet 

Peter Thijsse, Dick Schaap, Tjerk Krijger, Robin Kooyman, and Paul Weerheim

SeaDataNet is a pan-European infrastructure that manages and provides access to marine datasets collected by European organisations through research cruises and observational activities in coastal waters, regional seas, and the global ocean. It was founded by National Oceanographic Data Centres (NODCs) and major marine research institutes. The network has expanded through successive EU-funded RTD projects and by contributing to major European initiatives such as EMODnet, Copernicus Marine Service, ENVRI, and the European Open Science Cloud (EOSC).

SeaDataNet develops and promotes widely adopted standards, vocabularies, software tools, and services that support FAIR marine data management. Its core service, the CDI (Common Data Index), provides unified online discovery and access to in situ marine observation data managed by more than 115 data centres in 34 countries. The service currently offers access to over 3 million datasets from more than 1,000 European organisations, covering physical, chemical, biological, geological, and geophysical data from European waters and the global ocean. The use of standard metadata, formats, and controlled vocabularies ensures rich, highly FAIR datasets.

SeaDataNet also delivers core data services for EMODnet Chemistry, Bathymetry, and Physics, harmonising large volumes of marine data that support the production of thematic data products, including Essential Climate Variable (ECV) and Essential Ocean Variable (EOV) datasets.

Environmental science increasingly relies on large, heterogeneous, and rapidly growing data collections that must be efficiently accessed, subsetted, and harmonised for use in models, digital twins, AI workflows, and Virtual Research Environments (VREs). The fully open-source Beacon software, developed by MARIS (https://beacon.maris.nl/), addresses these challenges by enabling cloud-native, high-performance data lakes that are fast to deploy and access. Beacon supports parameter harmonisation using metadata annotations based on NERC vocabularies, ECV vocabularies, and the I-ADOPT methodology adopted in ENVRI-HUB Next.

To ease access to subsets of the SeaDataNet CDI data collection, a Beacon instance containing all the open SeaDataNet data was set up. It gives users real-time access to data subsets in multiple formats (NetCDF, Parquet, Zarr) and supports flexible querying from Jupyter Notebooks or from the newly developed Beacon Studio, a user interface for non-technical users. Within ENVRI-HUB Next, this SeaDataNet instance enables on-the-fly access to ECV-related subsets drawn from millions of files via Jupyter Notebooks, ready for use in the Analytical Framework.
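
A sketch of what such on-the-fly notebook access could look like; the endpoint and request schema below are assumptions for illustration, not the actual Beacon API (consult the Beacon documentation):

    import pandas as pd
    import requests

    # Hypothetical Beacon-style subset request for harmonised parameters.
    query = {
        "parameters": ["TEMP", "PSAL"],    # harmonised via NERC vocabularies
        "bbox": [-10.0, 45.0, 5.0, 55.0],  # lon/lat bounds
        "time": ["2010-01-01", "2020-12-31"],
        "format": "parquet",
    }
    resp = requests.post("https://beacon.example.org/api/query", json=query)
    resp.raise_for_status()
    with open("subset.parquet", "wb") as f:
        f.write(resp.content)
    df = pd.read_parquet("subset.parquet")  # ready for analysis in the notebook
    print(len(df), "rows")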

The presentation focuses on this use case, the technical solution, and its potential applicability for other Research Infrastructures supporting EOSC use cases.

How to cite: Thijsse, P., Schaap, D., Krijger, T., Kooyman, R., and Weerheim, P.: Direct and harmonised access to Essential Climate Variable related in-situ observation data from SeaDataNet, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-10771, https://doi.org/10.5194/egusphere-egu26-10771, 2026.

EGU26-11976 | Posters on site | ITS1.20/ESSI4.3

Co-creating a pan-African Research and Knowledge Infrastructure for societal benefit through climate action: The KADI project. 

Matthew Saunders, Emmanuel Salmon, Theresia Bilola, Niina Käyhkö, Abdirahman Omar, Tommy Bornman, Jörg Klausen, Rebecca Garland, Gregor Feig, Lutz Merbold, Patricia Nying'uro, Christine Mahonga, Money Guillaume Ossohou, and Werner Kutsch

Climate change is having an accelerating impact globally and across Africa, through the increased frequency, magnitude and duration of droughts, fires, floods and other extreme climatic events. Our ability to address this crisis requires policy makers, private enterprise, scientists and society at large to converge and co-create the solutions needed to understand, adapt to and mitigate climate impacts. Research infrastructures (RIs) underpin our ability to develop appropriate climate services that address these issues and, through the scientific evidence they deliver aligned with societal priorities, will reduce vulnerability to climate change and promote sustainable development across Africa.

The Horizon Europe funded KADI project (Knowledge and climate services from an African observation and Data research Infrastructure) has developed a conceptual framework for a pan-African RI that will deliver the science-based climate services required to reduce the societal and economic costs of climate change, help address national, regional and international political agendas, and contribute to achieving the UN Sustainable Development Goals. The KADI-RI supports successful co-creation and delivery of climate services that are sector-relevant and user-specific; it is transdisciplinary in nature, integrating academic, non-academic and societal areas; scalable in space and time, producing interoperable and accessible data products; and sustainable in scope, incorporating financial, organisational, technological, social and epistemological longevity.

This presentation will discuss the development of the KADI-RI blueprint, using systems-mapping approaches in the co-creation of climate services, and how these outputs can be used to identify the diverse research networks and interoperable data systems that are essential for understanding climate trends and their associated impacts. KADI pilot studies have: (1) demonstrated how the use of low-cost sensors and citizen-science engagement can address issues of air pollution and heat stress in urban environments; (2) shown how long-term African Union (AU) and European Union (EU) collaborative networks can provide insight into the benefits of long-term meteorological measurements to inform sensor and data-analytical requirements; (3) explored how such exchanges can consolidate African networks measuring ocean biogeochemistry and integrate them into global RIs; and (4) examined the interactions between diverse observation networks and the development of earth system modelling and remote sensing capacity. Knowledge-exchange activities have been central to the development of the KADI-RI blueprint, facilitating the mobility of scientists across Africa and the EU to attend stakeholder workshops and training courses and to develop communities of practice that ensure all stakeholders work together to design solutions reflecting regional priorities.

Key recommendations of the KADI project include the need to minimise observational gaps to ensure better data coverage; to combine in situ, remotely sensed and modelled data to enhance analytical capabilities; to invest in infrastructure and skills; to improve access to data products through open data policies; and to engage and include all communities in data collection and climate service design. This work provides the link between the science-based concept design and the policy cooperation required to develop a functional and collaborative RI that will provide long-term, sustainable support for local ownership and the integration of African climate services into global observation systems.

How to cite: Saunders, M., Salmon, E., Bilola, T., Käyhkö, N., Omar, A., Bornman, T., Klausen, J., Garland, R., Feig, G., Merbold, L., Nying'uro, P., Mahonga, C., Guillaume Ossohou, M., and Kutsch, W.: Co-creating a pan-African Research and Knowledge Infrastructure for societal benefit through climate action: The KADI project., EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-11976, https://doi.org/10.5194/egusphere-egu26-11976, 2026.

EGU26-12348 | ECS | Posters on site | ITS1.20/ESSI4.3

Towards Transformative Climate Services: Community Building, Co-Creation, and Communication. Lessons learned from Climateurope2 project. 

Chiara Calderaro, Simone Taddeo, Arianna Acierno, Ljubica Slavković, Marjana Brkić, Marta Terrado Casanovas, and Inés Martin del Real

Climate services play a critical role in bridging scientific knowledge and societal needs, enabling informed decision-making for climate adaptation and mitigation. However, their effectiveness depends not only on scientific robustness, but also on inclusive co-creation processes, shared standards, and strong communities of practice that connect researchers, practitioners, policy makers, and users. This contribution presents the experience of Climateurope2, a European project coordinated by the Barcelona Supercomputing Center, as a case study demonstrating how community-driven approaches can advance trustworthy, accessible, and impactful climate services.

Climateurope2 aims to strengthen and expand the European climate services ecosystem by developing recommendations and standardisation procedures while fostering uptake of quality-assured climate services. Central to the project is the deliberate cultivation of an open and diverse climate services community, built through a wide range of participatory activities that prioritize bottom-up engagement and transdisciplinary exchange. In particular, the project has organized a series of interactive webstivals and festivals designed as co-creation spaces, where researchers, service providers, policy makers, private sector actors, and local stakeholders collaboratively explore needs, methodologies, tools, and future directions for climate services.

These events have facilitated knowledge integration across multiple domains, including Earth observation data, climate modelling, socio-economic analysis, and local knowledge systems, contributing to the development of more user-relevant and context-sensitive climate services. They also address common challenges in the field, such as fragmented data accessibility, limited dialogue between disciplines, and difficulties in scaling services across sectors and regions.

A distinctive feature of Climateurope2 is its strong emphasis on communication as an enabling mechanism for co-creation and societal impact. The project has invested in innovative communication formats and inclusive language to make climate science and services more accessible to policy makers, practitioners, and wider audiences. This effort includes a Traveling Climate Action Roadshow across Southeast Europe that promotes climate services through the integration of art and science; two dedicated art–science calls designed to foster dialogue between artists and the scientific community, resulting in the creation of artistic works addressing key project themes and translating complex climate service concepts into accessible narratives for wider audiences; and the production of the “Climate at your Service” podcast, which offers an engaging entry point to understanding the role of climate services and their standardisation in supporting climate adaptation and informed decision-making.

By reflecting on lessons learned from community-building, co-creation practices, and communication strategies, this contribution highlights how transdisciplinary collaboration and shared standards can empower a broad range of stakeholders. The Climateurope2 experience offers transferable insights for advancing climate services that are not only scientifically sound, but also socially robust, scalable, and transformative across diverse socio-ecological contexts.

How to cite: Calderaro, C., Taddeo, S., Acierno, A., Slavković, L., Brkić, M., Terrado Casanovas, M., and Martin del Real, I.: Towards Transformative Climate Services: Community Building, Co-Creation, and Communication. Lessons learned from Climateurope2 project., EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-12348, https://doi.org/10.5194/egusphere-egu26-12348, 2026.

EGU26-13040 | Orals | ITS1.20/ESSI4.3 | Highlight

Strengthening the global system for essential climate variables observations: the iClimateAction project  

Paolo Laj, Belén Martin Miguez, Antonio Bombelli, Caterina Tassone, Martyn Clark, Paola De Salvo, Wenbo Chu, Madeeha Bajwa, and Lorenzo Labrador

The landscape of organizations with functions along the climate data value chain is extremely complex. These functions include the collection, curation and exploitation of climate data, which ultimately lead to the production of climate information supporting decision-making. The EU-funded iClimateAction project supports three key organizations in this landscape (GCOS, GEO and WMO) in their common endeavours to strengthen the global system for standardised, open, accessible, usable, and interoperable observations of essential climate variables (ECVs). The project's objective is to assess the current Earth observation value chain for ECVs and to identify the gaps and shortcomings that limit its full exploitation, from observations to services. To that end, it will deliver: (1) a full assessment of in-situ ECV observation systems, covering coverage, gaps, networks at risk, data centres, and best practices for data and metadata stewardship; (2) a review of space-based ECV data availability, addressing limitations, continuity challenges, processing stream improvements, and better coordination among space agencies; and (3) a systemic analysis of the global ECV observation system. The iClimateAction project will foster EO data exploitation and deliver a set of recommendations for sustainable inter-organization coordination to maximize the value and impact of the EO data chain for climate.

How to cite: Laj, P., Martin Miguez, B., Bombelli, A., Tassone, C., Clark, M., De Salvo, P., Chu, W., Bajwa, M., and Labrador, L.: Strengthening the global system for essential climate variables observations: the iClimateAction project , EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-13040, https://doi.org/10.5194/egusphere-egu26-13040, 2026.

EGU26-13413 | Orals | ITS1.20/ESSI4.3

Origin, governance and evolution of the Essential Climate Variables framework managed by the Global Climate Observing System (GCOS) 

Belen Martin Miguez, Peter Thorne, Stephan Bojinski, Carlo Buontempo, Sarah Connors, Carmen García Izquierdo, Isabelle Gartner-Roer, Andreas Güntner, Martin Herold, Stefan Kern, Katrin Schroeder, Blair Trewin, Antonio Bombelli, and Caterina Tassone

The need to understand how climate is changing has never been greater, and we cannot understand what we do not observe.  

This contribution will describe the origin and evolution of a set of essential climate variables (ECVs) that are managed by the Global Climate Observing System (GCOS) programme, through its three expert panels for the atmospheric, oceanic and terrestrial domain.

The ECVs constitute the minimum set of observations required to systematically observe the Earth’s changing climate across three domains: the ocean, land and atmosphere. ECVs have facilitated the implementation of the observing system through a user-driven process, guiding investment decisions and mobilizing climate observing communities. The first set of Essential Climate Variables was developed by GCOS in the late 1990s, and since then the list has grown to the 55 current ECVs.

After 25 years, GCOS has started a process aimed at the rationalization of the ECV list. In this contribution, the main outputs of this rationalization process will be presented: (1) formalization of a governance process to adopt new ECVs; (2) revised definitions for ECVs and ECV quantities; (3) a proposal for an updated set of ECVs. 

The connections between the ECV framework and other frameworks such as the Essential Ocean Variables framework will also be covered.

How to cite: Martin Miguez, B., Thorne, P., Bojinski, S., Buontempo, C., Connors, S., García Izquierdo, C., Gartner-Roer, I., Güntner, A., Herold, M., Kern, S., Schroeder, K., Trewin, B., Bombelli, A., and Tassone, C.: Origin, governance and evolution of the Essential Climate Variables framework managed by the Global Climate Observing System (GCOS), EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-13413, https://doi.org/10.5194/egusphere-egu26-13413, 2026.

EGU26-14063 | Posters on site | ITS1.20/ESSI4.3

Implementing the EU Methane Emissions Regulation through policy-relevant emissions data and a collaborative approach. 

Valeria Di Biase, Daniel Zavala-Araiza, and Léa Pilsner

Methane emissions are a major driver of near-term climate warming across multiple sectors, and the European Union Methane Emissions Regulation (EUMER) represents a critical and timely policy instrument to address methane emissions from fossil fuels produced in the EU as well as those supplied to the EU. As EUMER enters its phased implementation, the operationalization of its wide-ranging and technically complex regulatory requirements necessitates the development of new data workflows, coordination mechanisms, and cross-disciplinary approaches to support actionable and accessible knowledge for a broad set of stakeholders beyond public authorities.

This contribution frames the implementation of EUMER as a data-driven process that requires bringing together distinct perspectives from industry, regulators, and local communities. Drawing on experiences from civil society initiatives that establish networks of organizations assessing and tracking implementation progress across the EU, we examine how empirically based data tools are being used to increase transparency and support effective mitigation. We further analyse emerging institutional configurations and collaborative practices that enable stakeholder engagement and regulatory oversight. We discuss key challenges related to data accessibility, transparency, comparability, and communication in the context of methane reporting and mitigation requirements, including issues arising from diverse emission sources, supply chains, and institutional responsibilities. Particular attention is given to the integration of multiple data streams - including Earth observation products, facility-level reporting, and international datasets such as those developed by the International Methane Emissions Observatory (IMEO) - to support the design and evaluation of affordable and accessible monitoring tools.

Our work will illustrate how this data-driven, multi-stakeholder implementation framework for methane mitigation can serve as a blueprint for similar approaches for emissions from other sectors and gases.

How to cite: Di Biase, V., Zavala-Araiza, D., and Pilsner, L.: Implementing the EU Methane Emissions Regulation through policy-relevant emissions data and a collaborative approach., EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-14063, https://doi.org/10.5194/egusphere-egu26-14063, 2026.

EGU26-14450 | ECS | Posters on site | ITS1.20/ESSI4.3

FOCAL Urban Pilot: Efficient exploration of climate data locally for data-driven decision-support in urban climate adaptation planning 

Jan-Christopher Cohrs, Guy Brasseur, Suyeon Choi, Radovan Hilbert, Eva Klien, Kevin Kocon, Muthu Kumar, Noribeth Mariscal, Jiří Matějka, Elke Moors, Klára Moravcová, Ondřej Podsztavek, Eric Samakinwa, Ingo Simonis, Slavomir Sipina, Tim Tewes, Hendrik M. Würz, and Diana Rechid

Climate change impacts are increasingly manifested at local scales, where mitigation and adaptation strategies are implemented. Despite the growing wealth of available climate data and services, their effective usage in local climate impact assessment and decision-making processes for mitigation and adaptation planning remains limited due to scale mismatches, computational constraints, complexity, and usability barriers for non-domain experts. Addressing these challenges requires both advanced computational methods and improved access to climate data and analysis tools.
 
The EU Horizon project FOCAL bridges the gap between data, services, and their users by implementing an open compute platform that combines intelligent workflow management with high-performance computing (HPC) resources to allow for an efficient exploration of climate data on a local scale. In addition, innovative artificial intelligence (AI) tools are developed and made available to enhance climate data analysis in terms of speed, robustness, pattern detection, and localization, thereby expanding the toolkit of climate data analysis and impact assessment methods.
 
A main objective of FOCAL is to support science-based, actionable decision-making in forestry and urban planning through the tools it provides. In a co-design process involving developers and potential platform users from two forest pilot regions with contrasting ecological and management contexts (Forest Pilots) and from a pilot city (Urban Pilot), web applications for intuitive user–platform interaction, together with workflows grounded in state-of-the-art climate science, have been specified to address concrete user questions in forestry and urban planning. As a result, decision makers can efficiently use climate data for the development of climate adaptation strategies.

This contribution focuses on the Urban Pilot, implemented for the pilot city Constance (Baden-Württemberg, southern Germany), located at the western end of Lake Constance. Three core workflows have been developed:
1) Regional climate change workflow: provision of robust regional climate change information for the past and the future under different global warming levels for urban areas, based on regional climate model output and localized climate data, serving multi-sectoral local climate impact assessments;
2) Urban hot and cool spot workflow: detection and high-spatial-resolution visual exploration of hot and cool spots in urban environments, supporting exposure assessment by integrating additional data (e.g., population or infrastructure data), risk assessment, and the planning of urban heat resilience measures and cooling spaces;
3) Urban blue spot workflow: identification of blue spots (rainfall accumulation hazards) and provision of blue spot data in urban landscapes using processed precipitation data and extreme precipitation scenarios, supporting applications in hydrological modeling, flood risk management, and climate adaptation (a minimal terrain-analysis sketch follows below).
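
As context for the blue spot workflow, the terrain-analysis core can be illustrated with priority-flood depression filling: every cell is raised to its spill elevation, and the difference to the original surface gives the potential ponding depth. The sketch below is a minimal stand-in under that assumption, not the FOCAL implementation; the toy DEM and the Python formulation are purely illustrative.

    import heapq
    import numpy as np

    def fill_depressions(dem):
        """Priority-flood depression filling: raise every interior cell to
        the lowest elevation at which water could spill off the grid."""
        nrows, ncols = dem.shape
        filled = np.full(dem.shape, np.inf)
        visited = np.zeros(dem.shape, dtype=bool)
        heap = []
        for r in range(nrows):
            for c in range(ncols):
                if r in (0, nrows - 1) or c in (0, ncols - 1):
                    # Border cells drain freely; seed the queue with them.
                    heapq.heappush(heap, (float(dem[r, c]), r, c))
                    filled[r, c] = dem[r, c]
                    visited[r, c] = True
        while heap:
            spill, r, c = heapq.heappop(heap)
            for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                rr, cc = r + dr, c + dc
                if 0 <= rr < nrows and 0 <= cc < ncols and not visited[rr, cc]:
                    # A neighbour can drain no lower than the current spill level.
                    filled[rr, cc] = max(float(dem[rr, cc]), spill)
                    visited[rr, cc] = True
                    heapq.heappush(heap, (filled[rr, cc], rr, cc))
        return filled

    dem = np.array([[5., 5., 5., 5.],
                    [5., 2., 2., 5.],
                    [5., 2., 1., 5.],
                    [5., 5., 5., 5.]])
    ponding_depth = fill_depressions(dem) - dem   # blue-spot depth per cell
    print(ponding_depth)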

By leveraging HPC-based data processing and AI-assisted analysis, these workflows translate complex climate data into actionable, locally relevant information. While demonstrated for the pilot city Constance, the methods and workflows are transferable to other urban areas, contributing to scalable and reproducible climate services.

How to cite: Cohrs, J.-C., Brasseur, G., Choi, S., Hilbert, R., Klien, E., Kocon, K., Kumar, M., Mariscal, N., Matějka, J., Moors, E., Moravcová, K., Podsztavek, O., Samakinwa, E., Simonis, I., Sipina, S., Tewes, T., Würz, H. M., and Rechid, D.: FOCAL Urban Pilot: Efficient exploration of climate data locally for data-driven decision-support in urban climate adaptation planning, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-14450, https://doi.org/10.5194/egusphere-egu26-14450, 2026.

EGU26-14538 | Orals | ITS1.20/ESSI4.3

Enabling access to harmonised ECV-related observation datasets from environmental Research Infrastructures 

Alexandra Kokkinaki, Peter Thijsse, Gwenaelle Moncoiffe, Tjerk Krijger, Marta Gutierrez, Maggie Hellström, Claudio Dema, Alessandro Turco, Delphine Dobler, Ulrich Bundke, and Markus Fiebig

In the ENVRI-Hub-NEXT (EHN) project, environmental Research Infrastructures (RIs) collaborate within the European Open Science Cloud (EOSC) to improve access to observation datasets related to Essential Climate Variables (ECVs). The main goal is to enable users from any Virtual Research Environment (VRE) to process and analyse ECV-related data using ENVRI-Hub components. To support this, EHN provides a GUI-based Catalogue of Services (CoS) that describes RI services and datasets using an extension of DCAT (EPOS-DCAT-AP), complemented by a Catalogue of Data based on FAIR Data Points. Despite this, dataset discovery and federation remain challenging due to heterogeneous machine-to-machine services and differing vocabularies for observed variables.

To address these issues, an ECV Working Group was established within EHN to define an approach for matching ECVs as defined by the Global Climate Observing System (GCOS) to the diverse variables managed by RIs across multiple environmental domains. An ECV is defined as a physical, chemical or biological variable, or a group of linked variables, that is critical for characterising Earth’s climate. A key outcome was the publication of ECV concepts linked to the GCOS definitions, in a machine-readable vocabulary in the NERC Vocabulary Server (NVS). This enabled mappings to RI-specific vocabularies using the I-ADOPT approach, and the use of SPARQL queries to establish dynamic “ECV to observable properties” translations. Python notebooks were developed to interact with RI data access services, including a central notebook that translates a single ECV request into multiple RI-specific queries and data access requests. This work exposed limitations in the vocabularies used for observed parameters, as well as in the availability of direct and harmonised data access services.
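
To give a flavour of such a translation, the sketch below queries a SPARQL endpoint for mappings from ECV concepts to RI-specific terms. It is only schematic: NVS does offer a SPARQL service, but the endpoint address, the ECV scheme URI and the mapping predicate used here are assumptions that would need to be checked against the published vocabulary.

    from SPARQLWrapper import SPARQLWrapper, JSON

    # Endpoint and URIs below are illustrative assumptions, not verified
    # addresses of the published ECV vocabulary.
    sparql = SPARQLWrapper("https://vocab.nerc.ac.uk/sparql/sparql")
    sparql.setQuery("""
        PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
        SELECT ?ecv ?label ?observable WHERE {
            ?ecv skos:inScheme <http://vocab.nerc.ac.uk/scheme/ECV/current/> ;
                 skos:prefLabel ?label ;
                 skos:narrowMatch ?observable .
        }
        LIMIT 20
    """)
    sparql.setReturnFormat(JSON)
    for row in sparql.query().convert()["results"]["bindings"]:
        print(row["label"]["value"], "->", row["observable"]["value"])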

In the next phase of EHN, several upgrades are planned to improve data accessibility and usability. All RIs will receive training on describing observational datasets using I-ADOPT-compliant vocabularies following recommended practices. 

Because RI machine-to-machine services rely on different APIs and constraints, they cannot be queried uniformly. The ECV data access library developed earlier in the project translates a single ECV request into the multiple requests required to query relevant RI services, using I-ADOPT mappings to identify RI parameter sets. This library will be further optimised, while RIs work towards more harmonised and direct data access services.

Many RIs still lack direct data access and especially subsetting capabilities, instead offering file-based or aggregated access via metadata search. Experience gained through the notebooks will guide improved integration of available services into the CoS. All notebooks and scripts will be released as open source and integrated into the ENVRI-Hub Analytical Framework, including a JupyterLab extension. As analytical services require harmonised data chunks rather than heterogeneous files, the next stage will test subsetting solutions such as Beacon, ERDDAP and Zarr.
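
To indicate what such subsetting looks like in practice, the snippet below lazily selects a space-time chunk from a cloud-hosted Zarr store with xarray; the store URL, variable and coordinate names are invented for illustration.

    import xarray as xr

    # Hypothetical Zarr store; any ARCO-style URL with consolidated
    # metadata would work the same way.
    ds = xr.open_zarr(
        "https://example.org/envri/ecv/ocean_temperature.zarr",
        consolidated=True,
    )

    # Lazily subset a region and period; only the needed chunks are read.
    subset = ds["sea_water_temperature"].sel(
        time=slice("2015-01-01", "2015-12-31"),
        latitude=slice(60, 80),
        longitude=slice(-10, 30),
    )
    print(subset.mean().compute())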

The presentation will highlight the implemented solutions and opportunities for broader uptake within the EOSC domain.

How to cite: Kokkinaki, A., Thijsse, P., Moncoiffe, G., Krijger, T., Gutierrez, M., Hellström, M., Dema, C., Turco, A., Dobler, D., Bundke, U., and Fiebig, M.: Enabling access to harmonised ECV-related observation datasets from environmental Research Infrastructures, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-14538, https://doi.org/10.5194/egusphere-egu26-14538, 2026.

EGU26-14938 | Orals | ITS1.20/ESSI4.3

EarthCODE: Transforming Earth Observation Research into Action-Ready Information through Open Science 

Deyan Samardzhiev, Krasen Samardzhiev, and Ewelina Dobrowolska

EarthCODE (https://earthcode.esa.int) is a strategic ESA EO initiative to support the implementation of the Open Science and Innovation Vision included in ESA’s EO Science Strategy (2024). 

Collaboration and federation are at the heart of EarthCODE. First, EarthCODE integrates a wide range of available EO cloud computing platforms and services, including engineering support; second, it catalogs and manages the FAIR and open data, code, and documentation from ESA Earth System science studies and experiments so they can be discovered, reused, and adapted to new contexts; third, it builds a community of practice of Open Science in Earth Observation science, supported by targeted community trainings, especially with the ESA Science Clusters - and by providing an open forum for discussion and co-creation. The initiative helps scientists discover, visualize, explore, reuse, modify, and build upon the research of others in a fair and safe way, as well as to create end-to-end reproducible workflows on EO cloud platforms – aiming to maximize the utilization of data products and workflows for Earth Action and to systematically transform scientific data into actionable information usable in downstream applications for decision making. 

EarthCODE actively supports initiatives across the Earth system sciences by providing practical development, code and data management tools, and an overall open science framework. One such example is the creation of analysis-ready data (ARD) cubes for the Antarctica Insync initiative. This involved a FAIRification process to make complex data readily available for modelling (e.g. https://discourse-earthcode.eox.at/t/antartica-insync-data-cubes/107) and visualization (e.g. https://esa-earthcode.github.io/polar-science-cluster-dashboard/). By pre-integrating these diverse datasets, EarthCODE removes the burden of complex data engineering (such as reprojection and resampling), allowing downstream users to immediately apply these inputs to environmental monitoring and decision-making systems. Further examples of this support include published datasets such as WAPOSAL and SMART-CH4, enabling research outputs to be translated into actionable, accessible, and relevant datasets. In addition, a bank of examples was developed to demonstrate good data management practices and encourage collaboration across scientific teams (https://esa-earthcode.github.io/documentation/Community%20and%20Best%20Practices/).

FAIR data collections such as the one above, together with many more value-added geophysical products, are made available by EarthCODE through its Open Science Catalog (https://opensciencedata.esa.int), which provides harmonized access to a wide range of products across all Earth system science domains. While many catalogues prioritise openness and access, EarthCODE goes further by focusing on FAIRness. EarthCODE leverages open-source geospatial technologies such as stac-browser, pycsw, PySTAC and OpenLayers, while also contributing back to these projects in terms of software and standardization. The osc python library complements the OSC by providing a programmatic interface to search for and access catalogued research data for analysis.
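
As a sketch of what programmatic catalogue access can look like, the snippet below walks a STAC catalogue with the generic pystac library; the root URL is a hypothetical assumption, and the project's own osc library provides a comparable, catalogue-specific interface with search functionality.

    import pystac

    # Hypothetical root URL; the Open Science Catalog UI lives at
    # https://opensciencedata.esa.int, but the exact STAC entry point may differ.
    root = pystac.Catalog.from_file("https://opensciencedata.esa.int/catalog.json")

    # Walk the first level of the catalogue tree and list what is there.
    for child in root.get_children():
        print(child.id, "-", child.title or "(no title)")

    # A collection's items could then be inspected in the same way
    # (identifiers below are illustrative):
    # collection = root.get_child("products", recursive=True)
    # for item in collection.get_items():
    #     print(item.id, item.datetime)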

How to cite: Samardzhiev, D., Samardzhiev, K., and Dobrowolska, E.: EarthCODE: Transforming Earth Observation Research into Action-Ready Information through Open Science, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-14938, https://doi.org/10.5194/egusphere-egu26-14938, 2026.

EGU26-15413 | ITS1.20/ESSI4.3

Deep learning–based mapping and spatial patterns of sustainable roofs in high-density urban CBDs: Evidence from Guangzhou and Shenzhen, China

Sustainable roofs, including green roofs (GR) and photovoltaic (PV) roofs, are increasingly used as essential components of urban green infrastructure and building-scale renewable energy systems that support climate resilience and environmental quality in high-density cities. However, large-scale, spatially explicit analyses of sustainable roofs in urban core areas remain limited due to data scarcity and the difficulty of reliably distinguishing roof types. Recent advances in deep learning (DL)-based remote sensing have enabled automatic mapping of sustainable roofs at the city scale, but empirical applications remain scarce in Chinese megacities, and systematic comparisons within and across cities are still rare. To address this gap, we adopted a DL-based framework for sustainable roof identification and applied it to eight representative central business districts (CBDs) in two major cities (Guangzhou and Shenzhen, China). High-resolution satellite imagery was used to automatically detect GR and PV roofs, and spatial statistical analyses were conducted to examine their distribution patterns, compositional characteristics, and differences both within and between cities. The results reveal significant variations in the spatial configuration and composition of sustainable roofs across CBDs, reflecting disparities in development intensity, functional structure, and architectural form. This study highlights intra- and inter-city differences in sustainable roof deployment in high-density urban cores and provides empirical evidence to support context-appropriate planning and implementation strategies for sustainable roofs amid rapid urbanization.

How to cite: Liu, H., Wang, M., and Liu, K.: Deep learning–based mapping and spatial patterns of sustainable roofs in high-density urban CBDs: Evidence from Guangzhou and Shenzhen, China, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-15413, https://doi.org/10.5194/egusphere-egu26-15413, 2026.

EGU26-16592 | Posters on site | ITS1.20/ESSI4.3

Operationalising essential ocean variables through robust and trusted QCV Workflows 

Jérôme Detoc, Virginie Racapé, Marie Jossé, Clément Weber, Delphine Dobler, Catherine Schmechtig, Alban Sizun, and Thierry Carval

Essential Ocean Variables (EOVs) play a central role in global ocean observation frameworks. They support the monitoring of biogeochemical processes, ecosystem dynamics, and long-term environmental change. Among them, nitrate is a key biogeochemical EOV, closely linked to primary production and phytoplankton dynamics. Yet transforming raw observations into reliable, interoperable, and reusable EOV products remains a major operational challenge.

The Argo programme enables unprecedented global monitoring of ocean biogeochemistry through autonomous profiling floats sampling the ocean from 2000 m depth to the surface. At the same time, the sensitivity of biogeochemical sensors to drift, biofouling, and instrumental issues necessitates expert-driven Qualification, Calibration, and Validation (QCV), which currently operates within a fragmented ecosystem of tools, data formats, execution environments, and methodological practices.

This contribution presents a complete nitrate QCV workflow, illustrating in concrete terms how validated EOV products are obtained from raw Argo observations. The workflow integrates global Argo data access, data harmonisation, preparation for visual inspection, expert-driven qualification using Ocean Data View, tracking of manual decisions, nitrate calibration, and delayed-mode data production. Each step is documented, connected, and explicitly handled to ensure traceability of both automated processing and human interventions.

The service implementation relies on the Galaxy platform, which provides an open, web-based, and FAIR-oriented environment to orchestrate independent domain tools together with expert-defined QCV procedures into complete, reusable, and transparent workflows. These workflows are accessible to expert users without advanced programming skills. Rather than replacing existing tools, the approach aims to make them work together in a coherent, unified, traceable, and reproducible way, through fixed processing chains covering the full QCV process.
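
Although the workflows are designed for use without programming, Galaxy also allows programmatic invocation, which the sketch below illustrates with the bioblend client library. The server URL, workflow name, file names and input mapping are all placeholder assumptions, not the project's actual deployment.

    from bioblend.galaxy import GalaxyInstance

    # Connect to a Galaxy server (URL and API key are placeholders).
    gi = GalaxyInstance("https://usegalaxy.eu", key="YOUR_API_KEY")

    # Find a published QCV workflow by name (the name is illustrative).
    wf = gi.workflows.get_workflows(name="argo-nitrate-qcv")[0]

    # Upload a raw profile file into a fresh history and run the workflow.
    history = gi.histories.create_history(name="nitrate-qcv-run")
    upload = gi.tools.upload_file("profiles.nc", history["id"])
    dataset_id = upload["outputs"][0]["id"]
    inputs = {"0": {"src": "hda", "id": dataset_id}}  # step 0 = input dataset
    invocation = gi.workflows.invoke_workflow(wf["id"], inputs=inputs,
                                              history_id=history["id"])
    print("Invocation state:", invocation["state"])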

The QCV service will be deployed within the European Open Science Cloud (EOSC), building on thematic infrastructures coordinated by ENVRI, on platform services provided by NFDI, and on operational deployment ensured by Data Terra, in order to guarantee accessibility, interoperability, and long-term reuse.

Regardless of the selected presentation format, this contribution will introduce the EOV framework and the challenges associated with biogeochemical Argo data, before providing a concrete illustration of a complete nitrate QCV workflow. It will then detail the service implementation through interoperable workflows on the Galaxy platform and its deployment within the European Open Science Cloud (EOSC).

How to cite: Detoc, J., Racapé, V., Jossé, M., Weber, C., Dobler, D., Schmechtig, C., Sizun, A., and Carval, T.: Operationalising essential ocean variables through robust and trusted QCV Workflows, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-16592, https://doi.org/10.5194/egusphere-egu26-16592, 2026.

EGU26-16642 | ECS | Posters on site | ITS1.20/ESSI4.3

Satellite-Derived Trends in Cloud Cover over Bavaria 

Imke Schirmacher, Thomas Popp, Tobias Ullmann, and Tanja Kraus

Cloud cover trends are highly relevant for the energy and health sectors, as clouds affect the radiation balance and thereby influence parameters such as air temperature and UV index. In particular, during heat waves, cloud-induced reductions of nocturnal cooling rates are of considerable interest. For effective climate adaptation and mitigation, cloud cover trends must be assessed at fine spatial scales and with sufficient temporal resolution to distinguish at least between day- and nighttime conditions. Within the Bavarian state-funded EO4CAM (Earth Observation Laboratory for Climate Adaptation and Mitigation) project, which aims to leverage spaceborne Earth observation and model data to support climate change adaptation and mitigation, we derive spatially resolved cloud cover trends over Bavaria from spaceborne observations between 2004 and 2019 for three-hourly time slots at monthly resolution.

The analysis is based on data from the Spinning Enhanced Visible and Infrared Imager (SEVIRI) aboard Meteosat Second Generation (MSG). We apply a Mann-Kendall trend analysis to the Optimal Cloud Analysis Climate Data Record [1], which provides a homogeneous long-term record of cloud properties. The dataset has a temporal resolution of 15 minutes and a spatial resolution of 6x6 km² over Bavaria.
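
For reference, the core of the Mann-Kendall test with Sen's slope can be written in a few lines. The snippet below is a generic textbook formulation (ignoring tie and autocorrelation corrections), not the EO4CAM processing chain, and the sample series is invented.

    import numpy as np
    from scipy.stats import norm

    def mann_kendall(x):
        """Mann-Kendall trend test with Sen's slope.
        Minimal version: no tie correction, no serial-correlation handling."""
        x = np.asarray(x, dtype=float)
        n = len(x)
        # S statistic: sum of signs over all pairs (i < j)
        s = sum(np.sign(x[j] - x[i]) for i in range(n - 1) for j in range(i + 1, n))
        var_s = n * (n - 1) * (2 * n + 5) / 18.0
        z = (s - np.sign(s)) / np.sqrt(var_s) if s != 0 else 0.0
        p = 2.0 * (1.0 - norm.cdf(abs(z)))          # two-sided p-value
        slope = np.median([(x[j] - x[i]) / (j - i)  # Sen's slope, units per step
                           for i in range(n - 1) for j in range(i + 1, n)])
        return slope, z, p

    # 16 Augusts (2004-2019) of noon cloud fraction in %, hypothetical values
    cloud = np.array([78, 75, 80, 72, 74, 70, 73, 69,
                      71, 66, 68, 64, 67, 63, 65, 62])
    slope, z, p = mann_kendall(cloud)
    print(f"Sen's slope: {slope:.2f} %/yr, z = {z:.2f}, p = {p:.4f}")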

Cloud cover trends cannot be generalized, as they depend strongly on location, season, and time of day. Nevertheless, two consistent features emerge: cloud fraction typically increases during daytime due to enhanced convective activity, and the interannual evolution within a given calendar month is similar across different hours of the day.

As an example, cloud cover over Bavaria at noon in August typically ranges between 60 and 80%, exceeding 80% in the Alpine region. Between 2004 and 2019, trends are predominantly negative across Bavaria, reaching values of up to -1.5 percentage points per year, with the strongest statistical significance observed in northern Bavaria. In contrast, cloud cover trends in the Alpine region remain largely neutral. A more detailed classification shows an increase in the number of days with low (<15%) and medium (15–85%) cloud fractions throughout Bavaria, accompanied by a decrease in days with high (>85%) cloud fraction. These changes are most pronounced in northern Bavaria.

[1] EUMETSAT: Optimal Cloud Analysis Climate Data Record (Release 1): MSG, 0°, 2022, doi: 10.15770/EUM_SEC_CLM_0049, https://user.eumetsat.int/catalogue/EO:EUM:DAT:0617/access.

How to cite: Schirmacher, I., Popp, T., Ullmann, T., and Kraus, T.: Satellite-Derived Trends in Cloud Cover over Bavaria, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-16642, https://doi.org/10.5194/egusphere-egu26-16642, 2026.

EGU26-16680 | ITS1.20/ESSI4.3

Co-created climate indicators for assessing the spatio-temporal suitability of key tourist activities in Costa Daurada and Terres de l’Ebre

Climate is a key determinant of tourism patterns and destination viability, and climate services offer a promising pathway to support climate-resilient tourism planning. This contribution presents a set of co-created climate indicators designed to assess the spatio-temporal climate suitability of key tourist activities in Costa Daurada and Terres de l’Ebre, two major coastal destinations in Catalonia that are highly exposed to climate variability and change. Building on previous participatory workshops with tourism stakeholders, the study selects priority activities – such as beach tourism, hiking, and cultural gastronomy – and links them to relevant climate variables, including temperature, precipitation, wind, significant wave height, and sunshine duration, among others.

Using reanalysis climate data and activity-specific thresholds, the indicators are computed for present climate conditions to characterise favourable, acceptable and unfavourable periods and locations for each activity. The results provide a detailed picture of the region's tourism climate potential, highlighting both current strengths and emerging vulnerabilities related to heat stress and changing rainfall patterns. The co-created indicators translate complex climate information into decision-relevant metrics that can be directly used by destination managers, policymakers and tourism businesses to adjust products, marketing and infrastructure, and to design adaptation pathways for coastal tourism. Beyond the case study, the work illustrates how co-created climate indicators can strengthen climate services for tourism, contributing to the implementation of climate-resilient strategies and to broader sustainability agendas at regional and international levels. The results also aim to contribute to the Catalan strategy of climate change adaptation (ESCACC30).
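
As an illustration of threshold-based indicator computation, the sketch below classifies days into the three suitability classes from daily reanalysis values. The thresholds shown are hypothetical placeholders, not the co-created values used in the study.

    import pandas as pd

    # Hypothetical thresholds for beach tourism; the study's co-created
    # values are not reproduced here.
    FAVOURABLE = dict(tmax=(24, 32), precip_max=1.0, wind_max=6.0)
    ACCEPTABLE = dict(tmax=(20, 35), precip_max=5.0, wind_max=9.0)

    def classify_day(tmax, precip, wind):
        """Return 'favourable', 'acceptable' or 'unfavourable' for one day."""
        def ok(th):
            lo, hi = th["tmax"]
            return (lo <= tmax <= hi and precip <= th["precip_max"]
                    and wind <= th["wind_max"])
        if ok(FAVOURABLE):
            return "favourable"
        if ok(ACCEPTABLE):
            return "acceptable"
        return "unfavourable"

    days = pd.DataFrame({
        "tmax":   [27.5, 33.8, 22.1],  # daily maximum temperature, degC
        "precip": [0.0, 0.2, 7.4],     # daily precipitation, mm
        "wind":   [4.2, 8.1, 5.5],     # daily mean wind speed, m/s
    })
    days["suitability"] = [classify_day(r.tmax, r.precip, r.wind)
                           for r in days.itertuples()]
    print(days)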

How to cite: Boqué Ciurana, A. and Aguilar, E.: Co-created climate indicators for assessing the spatio-temporal suitability of key tourist activities in Costa Daurada and Terres de l’Ebre, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-16680, https://doi.org/10.5194/egusphere-egu26-16680, 2026.

EGU26-16912 | Posters on site | ITS1.20/ESSI4.3

Building bridges for sustainable water management - the UNESCO International Initiative on Water Quality 

Moritz Heinle, Philipp Saile, Stephan Dietrich, and Luna Bharati

The International Initiative on Water Quality (IIWQ), established in 2012 by the Intergovernmental Hydrological Programme (IHP) of UNESCO, addresses global water quality issues and combats the degradation of freshwater resources which endangers human health and ecosystems. The IIWQ provides a collaborative network of scientists, practitioners and policymakers for joint research and knowledge exchange on water quality monitoring and management.

During the previous two IHP phases (VII, 2008–2013 and VIII, 2014–2021), the IIWQ contributed to basin-level water quality assessments, for example in the Kharaa and Selenge River Basins in Mongolia and Russia. The IIWQ also investigated the effects of emerging pollutants on freshwater resources and published the open-access book “Emerging pollutants: protecting water quality for the health of people and the environment”. Additionally, the IIWQ developed lake-level and global remote-sensing water quality portals.

During the ongoing ninth phase of the IHP (2022-2029) “Science for a Water Secure World”, the IIWQ is now co-led by the International Centre for Water Resources and Global Change (ICWRGC) and the UNESCO Chairs on Sustainable Water Security and Water, Energy and Disaster Management for Sustainable Development (WENDI).

In order to support the implementation of the IHP-IX strategy, the IIWQ focuses on the following outputs:

  • Enhanced mobilization of remote sensing technologies by water quality management authorities.
  • Simplified planning and implementation of water quality monitoring programmes and water management plans.
  • Increased awareness and predictability of the effects of emerging pollutants and hydrological extremes on water quality.
  • Amplified visibility for the importance of water quality and its relation to the UN system and SDGs.

This conference contribution provides a more detailed introduction to the IIWQ, focusing on activities during the current IHP-IX phase and highlighting associated engagement opportunities.

How to cite: Heinle, M., Saile, P., Dietrich, S., and Bharati, L.: Building bridges for sustainable water management - the UNESCO International Initiative on Water Quality, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-16912, https://doi.org/10.5194/egusphere-egu26-16912, 2026.

EGU26-17602 | ECS | Posters on site | ITS1.20/ESSI4.3

GeoGPT for action ready flood and disaster risk geo-intelligence in Florida 

Nikolaos Tziolas, Golmar Golmohammadi, and Anastasia Kritharoula

Extreme weather in Florida can result in compound impacts (crop damage, prolonged waterlogging, and inundation) that disrupt farm activities and complicate field-scale assessment. Following an event, extension agents and growers typically need, on short timelines, crop damage assessments to prioritize scouting, report impacts, and support recovery decisions, as well as flood-prone area information to anticipate where standing water and access constraints will persist and where follow-up interventions should be targeted. However, producing these products from Earth observation (EO) analysis-ready data (ARD) often requires fragmented geospatial tools, intensive preprocessing, and repeated iterations that delay action.

We present GAIA Bot, a conversational AI-based geospatial assistant piloted in Florida with extension agents and growers to convert EO ARD into action-ready information (ARI) for post-event decision support. In the Florida pilot workflow, users can interact with GAIA Bot through natural-language questions (e.g., “Which fields show likely damage since the storm?”; “Where are the flood-prone low areas that may remain saturated?”; “How does my field compare to the same period in the prior year?”). GAIA Bot translates each request into an executable sequence that integrates publicly available spaceborne observations (e.g., Sentinel-2) with contextual geospatial layers (e.g., terrain and drainage proxies) and AI classifiers to generate field-scale damage indicators and priority scouting hotspots, flood-prone area maps that inform access and recovery planning, and concise explanations for stakeholder communication.
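
One building block of such a sequence can be sketched as follows: a pre/post NDVI difference from Sentinel-2 red and near-infrared bands, thresholded into a crude damage mask. This is a toy stand-in with synthetic arrays and an arbitrary threshold, not GAIA Bot's actual classifier.

    import numpy as np

    def ndvi(red, nir):
        """NDVI from Sentinel-2 red (B04) and NIR (B08) reflectance arrays."""
        return (nir - red) / np.clip(nir + red, 1e-6, None)

    def damage_hotspots(red_pre, nir_pre, red_post, nir_post, drop=0.2):
        """Flag pixels whose NDVI dropped by more than `drop` after the event,
        a crude proxy for likely crop damage."""
        delta = ndvi(red_post, nir_post) - ndvi(red_pre, nir_pre)
        return delta < -drop

    rng = np.random.default_rng(0)
    red_pre = rng.uniform(0.05, 0.15, (64, 64))
    nir_pre = rng.uniform(0.4, 0.6, (64, 64))
    red_post, nir_post = red_pre + 0.05, nir_pre - 0.2  # synthetic storm response
    mask = damage_hotspots(red_pre, nir_pre, red_post, nir_post)
    print(f"{mask.mean():.0%} of pixels flagged for priority scouting")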

Operational testing with growers and extension agents indicates significant time savings relative to traditional multi-tool approaches, enabling faster product generation and more frequent updates as new satellite observations become available. To support trustworthy decisions, we also explore a reasoning mechanism that produces structured evidence trails.

How to cite: Tziolas, N., Golmohammadi, G., and Kritharoula, A.: GeoGPT for action ready flood and disaster risk geo-intelligence in Florida, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-17602, https://doi.org/10.5194/egusphere-egu26-17602, 2026.

EGU26-18690 | Orals | ITS1.20/ESSI4.3

Advances of the ESA Climate Change Initiative (CCI): Progress, Integration, and Future Directions 

Sophie Hebden, Sarah Connors, Simon Pinnock, Eduardo Pechorro, Amy Campbell, Anna Trofaier, Freya Muir, Michael Eisinger, Paul Fisher, Clement Albergel, Susanne Mecklenburg, Klara Gunnarsson, Claire MacIntosh, and Eleanor O'Rourke

Systematic observations are essential for understanding the climate system and the changes that are rapidly unfolding. The ESA Climate Change Initiative (CCI) was established to meet the needs of the UNFCCC, supporting the development of long-term data records of the Essential Climate Variables (ECVs) defined by GCOS that could most easily be addressed by satellite remote sensing.

Since 2009 the CCI programme has built up European expertise by supporting more than 30 projects addressing ECVs, each of which produces multiple data products with detailed documentation to meet the needs of the climate research community and support countries’ goals under the Paris Agreement. Much of this research has been taken up by climate services for the operational production of data, most notably via the Copernicus Climate Change Service (C3S).

Co-developed with C3S, ESA CCI has pioneered common data standards, SI traceability, uncertainty characterisation, validation and evaluation processes and detailed product documentation. Furthermore, the metadata requirements from the World Climate Research Programme’s obs4MIPs effort are met by projects on a case-by-case basis, ensuring suitability for climate model evaluation. To date, interoperability and consistency between ECV data records have been more difficult issues to address, but are the target of the next phase of the programme (2026-2029), informed by recent cross-ECV project work. 

This presentation highlights lessons learnt and future advancements in the CCI programme, with specific examples of how the programme’s integration with strategic partners is supporting improvements for data users, and how the ECV projects are directly working with reporting agencies and contributing to policy needs. With the expansion of the Copernicus Sentinel missions up to 2030, and an increasingly diversified landscape of climate data providers, ESA aims to expand its role as custodian and developer of satellite-based ECVs, ensuring European expertise in this area is leveraged to support policy needs for understanding climate change and for tracking mitigation and adaptation action.

How to cite: Hebden, S., Connors, S., Pinnock, S., Pechorro, E., Campbell, A., Trofaier, A., Muir, F., Eisinger, M., Fisher, P., Albergel, C., Mecklenburg, S., Gunnarsson, K., MacIntosh, C., and O'Rourke, E.: Advances of the ESA Climate Change Initiative (CCI): Progress, Integration, and Future Directions, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-18690, https://doi.org/10.5194/egusphere-egu26-18690, 2026.

EGU26-19527 | Orals | ITS1.20/ESSI4.3

Open data, disruptive technologies and community approaches in co-creation of climate services in urban Africa – The Resilience Academy approach 

Niina Käyhkö, Patricia Nying'uro, Venla Aaltonen, Nelly Babere, and Christine Mahonga

African cities are experiencing rapid growth, with projections indicating that the majority of the continent’s population will become urban dwellers in the near future. However, this urban expansion is largely unplanned, often resulting in development on hazardous lands with limited regulatory controls and insufficient risk information. Consequently, cities are becoming increasingly vulnerable to the impacts of climate change, as climate risks manifest in more complex and multidimensional ways.

A critical challenge faced by African cities is the lack of baseline knowledge and digital data necessary for informed decision-making and effective management of climate-related risks. The fast-paced transformation of urban landscapes drives an urgent need for climate risk information that offers higher resolution, improved timeliness, and greater update frequency. Additionally, there is a need for data that better captures the interactions among socioeconomic factors, environmental conditions, and physical infrastructures.

To support informed and sustainable urban development, digital data production models and future climate services need to be transformative. Climate services that are locally driven, contextually appropriate, low in complexity and fit for purpose ensure that data and decisions are reliable, locally owned, and actionable over time. For wider scalability and transfer, it is important that co-production models and data-driven climate service solutions can be adopted more widely in African cities.

Resilience Academy (RA) is a university-driven partnership model which aims to improve climate resilience in urban Africa through the co-creation of demand-driven, locally sustainable and scalable climate services operating at the nexus of the digital revolution, community engagement and local youth skills. RA is an action-oriented and collaborative ecosystem which thrives on open data, affordable technologies, skills development and the inclusive participation of multiple actors. It builds particularly on the talent and commitment of young-generation scientists, students and local residents, who are changing the ways cities are mapped, designed and managed for the future. The Resilience Academy approach seeks to establish tangible co-benefits around co-created climate services by strengthening youths’ digital skills and future employment opportunities in cities.

Our presentation will discuss experiences of applying Resilience Academy approaches to mapping climate adaptation needs, collecting climate-risk-related digital data, and co-creating urban climate services that address communities’ adaptation to heat, pollution and flooding stressors in Dar es Salaam and Nairobi. We will share challenges, good practices and lessons learnt from using low-cost digital tools and working with local communities and youths in vulnerable urban neighbourhoods, and discuss opportunities and challenges related to the wider adoption and scaling of RA approaches for climate service provision across African cities.

How to cite: Käyhkö, N., Nying'uro, P., Aaltonen, V., Babere, N., and Mahonga, C.: Open data, disruptive technologies and community approaches in co-creation of climate services in urban Africa – The Resilience Academy approach, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-19527, https://doi.org/10.5194/egusphere-egu26-19527, 2026.

EGU26-20251 | ECS | Orals | ITS1.20/ESSI4.3

Blue-Cloud 2026 workbenches for Essential Ocean Variables: advancing harmonization and big-data workflows for eutrophication in marine science. 

Nydia Catalina Reyes Suarez, Robin Kooyman, Gwenaelle Moncoiffe, Sebastian Mieruch, Delphine Leroy, Alessandra Giorgetti, Julie Gatti, Athanasia (Sissy) Iona, Virginie Racape, Lotta Fyrberg, Megan Anne French, Karin Wesslander, and Marine Vernet

Essential Ocean Variables (EOVs) such as temperature, salinity, chlorophyll, nutrients and dissolved oxygen are critical for ocean monitoring and policy, particularly for assessing eutrophication and ocean acidification. These topics are recognized as priorities in global and regional frameworks, including the Sustainable Development Goals (SDGs) and the Marine Strategy Framework Directive (MSFD). Despite their importance, implementation remains fragmented across infrastructures such as EMODnet, Copernicus, and the World Ocean Database (WOD). The resulting datasets are large, vary in metadata standards, and are processed using diverse methodologies, creating significant challenges for interoperability and effective reuse.

Blue-Cloud addresses these challenges by acting as an open science platform for collaborative marine research, contributing to the European Digital Twin of the Ocean (EDITO) and serving as a marine science node of the European Open Science Cloud (EOSC). Built on the D4Science e-Infrastructure, it provides seamless access to services for storing, managing, analyzing, and reusing research data across disciplines. One of the goals of the Blue-Cloud 2026 project is to develop, validate, and document analytical big-data workbenches that produce harmonized and validated data collections for selected EOVs in physics, chemistry, and biology.

These workbenches harmonize, integrate, validate and qualify large, heterogeneous in situ data collections from major European and global infrastructures and expose cloud-based workflows in their virtual research environments (VREs). Specifically, the workbench for eutrophication integrates validated datasets from Copernicus, the WOD and EMODnet Chemistry with Beacon, a high-performance data-lake solution that enables rapid sub-setting and harmonized delivery of multi-source data, and employs webODV for exploration, initial validation and subset extraction, to support quality control and product generation. A crucial step after merging the data is the identification and management of duplicate records. When merging datasets from multiple sources, duplicates can arise from overlapping sampling campaigns, repeated submissions, or variations in metadata. To address this, the Clone Wars tool has been developed to systematically detect, flag and handle duplicates. It applies advanced matching algorithms to compare the metadata, ensuring that duplicate records are found and removed without loss of information.
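
The matching idea can be illustrated with a simple space-time tolerance check on profile metadata. The sketch below is a deliberately reduced stand-in for Clone Wars, whose actual matching rules and tolerances are more elaborate; all values shown are invented.

    import numpy as np
    import pandas as pd

    def flag_duplicates(df, tol_km=1.0, tol_minutes=30):
        """Flag profile pairs whose position and time agree within tolerances."""
        df = df.sort_values("time").reset_index(drop=True)
        dup = np.zeros(len(df), dtype=bool)
        for i in range(len(df)):
            for j in range(i + 1, len(df)):
                dt = (df.time[j] - df.time[i]).total_seconds() / 60.0
                if dt > tol_minutes:
                    break  # sorted by time: no later row can match row i
                # small-angle approximation of horizontal separation in km
                dlat = (df.lat[j] - df.lat[i]) * 111.0
                dlon = (df.lon[j] - df.lon[i]) * 111.0 * np.cos(np.radians(df.lat[i]))
                if np.hypot(dlat, dlon) <= tol_km:
                    dup[j] = True  # keep the first record, flag the later clone
        return df.assign(duplicate=dup)

    profiles = pd.DataFrame({
        "time": pd.to_datetime(["2024-05-01 06:00", "2024-05-01 06:10",
                                "2024-05-01 09:00"]),
        "lat": [43.100, 43.101, 43.500],
        "lon": [9.200, 9.201, 9.900],
        "source": ["WOD", "EMODnet", "Copernicus"],
    })
    print(flag_duplicates(profiles))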

Together, these services enable scalable, semantically harmonized workflows that deliver reproducible analytics and high-quality products, supporting policy-driven monitoring (MSFD, SDG 14) and initiatives such as EDITO, EMODnet and Copernicus.

How to cite: Reyes Suarez, N. C., Kooyman, R., Moncoiffe, G., Mieruch, S., Leroy, D., Giorgetti, A., Gatti, J., Iona, A., Racape, V., Fyrberg, L., French, M. A., Wesslander, K., and Vernet, M.: Blue-Cloud 2026 workbenches for Essential Ocean Variables: advancing harmonization and big-data workflows for eutrophication in marine science., EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-20251, https://doi.org/10.5194/egusphere-egu26-20251, 2026.

EGU26-20567 | ITS1.20/ESSI4.3

Data Terra: a federated research infrastructure transforming Earth system data into knowledge and services for science and public decision-making

Data Terra is the French national research infrastructure dedicated to the observation, understanding and monitoring of the Earth system, with the explicit objective of transforming qualified Earth observation data into knowledge, services and indicators supporting scientific research, Earth system digital twins and public decision-making. It federates several long-standing thematic data and service hubs covering the atmosphere, the ocean, continental surfaces, the solid Earth and biodiversity, complemented by high-resolution Earth observation. Together, these thematic poles curate, qualify and disseminate reference datasets, often based on long, homogeneous time series, that are essential to address major scientific challenges related to climate change, environmental dynamics and natural hazards.

A core objective of Data Terra is to foster interdisciplinarity through enhanced interoperability across disciplines, data types and scientific communities, a prerequisite for integrated Earth system science. This approach is strongly aligned with the international frameworks of Essential Variables (Climate, Ocean, Land and Biodiversity EVs), providing a shared scientific backbone for data production, qualification and reuse. The thematic poles structure their datasets, services and indicators around these Essential Variables, ensuring scientific consistency while enabling cross-domain analyses.

Data Terra relies on a federated, interoperable and scalable model, designed to be deployed and reused at different organisational and geographical scales. Its architecture and governance enable interoperability between national thematic poles, as well as integration with European and international initiatives, notably as a potential thematic or national node within the European Open Science Cloud (EOSC). This multi-scale design allows data, services and workflows developed within Data Terra to be exposed, combined and reused in broader research infrastructures without duplication or loss of semantic coherence.

Beyond core scientific use, Data Terra explicitly targets downstream applications such as Earth system digital twins, environmental services and decision-support tools for public policies. To strengthen the connection between scientific production, territorial needs and decision-makers, Data Terra has established regional and thematic coordination mechanisms (ART – Animations Régionales Thématiques). These ARTs act as interfaces between researchers, public authorities, private stakeholders and end-users, supporting the co-construction of indicators, dashboards and operational products adapted to policy and territorial contexts.

To support the full data-to-decision chain, Data Terra implements a coherent set of technical and semantic solutions. Semantic and machine-actionable interoperability is addressed through a pivot metadata model based on DCAT, combined with a shared repository of semantic artefacts, including controlled vocabularies, concept schemes and mappings. This enables automated discovery, cross-domain navigation and integration across platforms and infrastructures.
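
For illustration, a DCAT description of one dataset and one distribution can be assembled with the rdflib library as below. All identifiers are hypothetical, and the real pivot model adds profile-specific constraints on top of plain DCAT.

    from rdflib import Graph, Literal, URIRef
    from rdflib.namespace import DCAT, DCTERMS, RDF

    g = Graph()
    # Hypothetical identifiers; Data Terra's actual pivot model defines
    # further mandatory fields and controlled vocabulary bindings.
    ds = URIRef("https://example.org/dataterra/dataset/sst-mediterranean")
    dist = URIRef("https://example.org/dataterra/dataset/sst-mediterranean/zarr")

    g.add((ds, RDF.type, DCAT.Dataset))
    g.add((ds, DCTERMS.title,
           Literal("Mediterranean sea surface temperature", lang="en")))
    g.add((ds, DCAT.theme,
           URIRef("https://example.org/vocab/eov/sea-surface-temperature")))
    g.add((ds, DCAT.distribution, dist))
    g.add((dist, RDF.type, DCAT.Distribution))
    g.add((dist, DCAT.accessURL, URIRef("https://example.org/s3/sst-med.zarr")))
    g.add((dist, DCTERMS.format, Literal("application/vnd+zarr")))

    print(g.serialize(format="turtle"))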

Technical interoperability relies on widely adopted standards and protocols, including OGC APIs for data discovery and access, S3-compatible object storage and cloud-optimised data formats such as ARCO. Emphasis is placed on the portability and reproducibility of data processing workflows, enabling execution across heterogeneous and federated computing environments.

Finally, Data Terra simplifies user interaction with complex and heterogeneous datasets to maximise scientific and societal impact. This is achieved through integrated resource catalogues linking datasets with example notebooks and documented use cases, advanced data preview and code generation capabilities, the federation of computing resources, and the development of dashboards and indicators.

How to cite: Bodéré, E., Rizzo, A., Ramage, K., and Detoc, J.: Data Terra: a federated research infrastructure transforming Earth system data into knowledge and services for science and public decision-making, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-20567, https://doi.org/10.5194/egusphere-egu26-20567, 2026.

EGU26-20638 | Orals | ITS1.20/ESSI4.3

Interoperability of Argo Essential Climate Variables 

Delphine Dobler, Thierry Carval, Claire Gourcuff, and Yann-Hervé De Roeck

Argo is an international observation array of approximately 4 000 autonomous profiling floats measuring oceanic Essential Climate Variables (ECVs), comprising physical (pressure, temperature and salinity) and biogeochemical (dissolved oxygen, pH, nitrate, chlorophyll-a, downwelling irradiance and suspended particles) variables, from 2000-meter depth (or from 6000-meter depth for the deep floats) to the surface every 10 days, across the global ocean. More than 3 million vertical profiles have been collected in 25 years.

The Argo array is unique as it samples the global ocean, even in regions or seasons where vessels cannot operate and at depths that satellite sensors cannot probe. Argo is tightly connected to other observation arrays through calibration and cross-validation efforts, for example with the accurate measurements performed onboard research cruises, which are essential for Argo to achieve the accuracy required for climate studies, or with satellite data.

Argo contributes to monitoring and understanding climate change across several key phenomena, including the increase in ocean heat content (and sea level rise), deoxygenation, ocean acidification and the carbon cycle. Because of its importance in science studies, including the carbon cycle, the presentation will focus on the interoperability of the Argo dissolved oxygen data.

To facilitate science studies and support for public policies based on ECVs, FAIR sharing of both data and metadata is essential. The Argo international program has been continuously improving the findability, accessibility, interoperability and reusability of its dataset since its inception. Recently, Argo has increased its interoperability by exposing its vocabulary on the web, specifically on the NERC Vocabulary Server, applying the I-ADOPT framework and thus facilitating the mapping of Argo vocabulary to the vocabularies of other research infrastructures. Argo metadata and data access services have also been improved to match evolving users’ needs. They have been made more accessible by being exposed under federative platforms, such as the ENVRI-Hub, currently developed under the ENVRI-Hub NEXT EU project, or on Galaxy Europe for a biogeochemical calibration collaborative workflow developed under the FAIR-Ease EU project. Interoperability improvement activities are also currently undertaken within the AMRIT EU project for European marine research infrastructure datasets, including oxygen.

The FAIRness challenges faced by the Argo program are multiple: for instance, the need to simplify the dataset for a given audience, addressed through the development of products (e.g. easyOneArgo), or the need to increase the interoperability of measurement conditions, including methods and uncertainties.

Indeed, uncertainties are key information for climate change analyses: foreseeing that the sea level will rise by 10 meters +/- 5 cm does not mean the same to a decision-maker as foreseeing that it will rise by 10 meters +/- 10 meters. Argo uncertainties have the same dimension as the dataset itself (i.e., an error value is associated with each observation point), which means that uncertainties are to be treated as data. For the interoperability of Essential Climate Variables, in science studies in general, and in predictive studies in particular, the FAIR sharing of uncertainties associated with ECVs is crucial.

How to cite: Dobler, D., Carval, T., Gourcuff, C., and De Roeck, Y.-H.: Interoperability of Argo Essential Climate Variables, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-20638, https://doi.org/10.5194/egusphere-egu26-20638, 2026.

EGU26-21411 | Orals | ITS1.20/ESSI4.3

EGU ESSI–WMO–UNESCO Synergies for Interoperable Hydrological Data 

Stephan Dietrich, Philipp Saile, and Sylvain Grellet

Many promising initiatives are advancing FAIR data in hydrology, yet a substantial semantic and technical interoperability gap remains between water data used in operational services and in research, from in situ observations to model-derived products. At present, national hydrological services and the Earth system science community often develop data structures, vocabularies and workflows in parallel, which hampers seamless reuse of water information across mandates and scales. Addressing this fragmentation requires a collaborative effort to co-design shared semantic and ontological standards that can underpin interoperable data exchange for both operational water management and scientific analysis. This effort involves the FAIR water data community, including the OGC Hydrology Domain Working Group (OGC HydroDWG), WMO, UN bodies (UNEP, UNESCO IGRAC and ICWRGC), the DANUBIUS and eLTER research infrastructures, TERENO and the Water4all partnership, among others.

This contribution presents the state of the art in, and a conceptual and practical framework for, connecting the Earth and Space Science Informatics community with the implementation of an emerging international hydrological data exchange standard that serves both operational hydrology and Earth system science. It aligns the objectives of the WMO Plan of Action for Hydrology – in particular the ambitions “high-quality data supports science” and “science provides a sound basis for operational hydrology” – with the development of WIS2-based hydrological data exchange under the WMO Task Team on WIS2 for Hydrology (TT‑W2FH), which is responsible for defining hydrology-specific topic hierarchies, metadata, KPIs and implementation guidance for the WMO Hydrological Observing System (WHOS) within the WMO Information System (WIS). The contribution also supports the strategy process of the UNESCO water programme IHP‑IX, addressing output 3.3 on validating open-access data on water quantity, quality and use, with a focus on the workflows, quality control and governance arrangements required to make such data reliably reusable in transboundary and global assessments.

The presentation discusses concrete pathways to embed FAIR digital object concepts, interoperable metadata and federated workflows from the ESSI community into the WMO and UNESCO implementation processes, thereby fostering cultural change towards open, standards-based data sharing. It also discusses how to reach a high level of FAIRness within the water community in light of existing international standards and best practices (OGC, W3C, INSPIRE, RDA), with the target of producing FAIR Implementation Profiles (FIPs). By explicitly linking EGU ESSI developments in user-centric research data infrastructures with WMO and UNESCO programmes, the contribution aims to strengthen international collaboration and to co-develop sustainable, community-driven practices for a hydrological data exchange standard that equally supports real-time operations, long-term water resources assessment and integrated Earth system modelling.

How to cite: Dietrich, S., Saile, P., and Grellet, S.: EGU ESSI–WMO–UNESCO Synergies for Interoperable Hydrological Data, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-21411, https://doi.org/10.5194/egusphere-egu26-21411, 2026.

EGU26-21539 | Posters on site | ITS1.20/ESSI4.3

Data integration and semantic interoperability framework for the Svalbard Integrated Arctic Earth Observing System. 

Lara Ferrighi, Øystein Godøy, Luke Mardsen, Zoé Brasseur, and Daan Kivits

The Svalbard Integrated Arctic Earth Observing System (SIOS) is an international partnership of research institutions studying the environment and climate in and around Svalbard, with a dedicated Data Management System (DMS) Working Group organising data management activities. A core service of SIOS is its data catalogue, which aims to be the entry point to data discovery, visualisation and integration in the Svalbard region. This is only possible with strong data management organisation across partners and the harmonisation of information from participating data centers. The central node in the SIOS DMS harvests information from these partner repositories through well-established and standardised machine-readable endpoints. SIOS, as an aggregator of metadata assets, is working on semantic interoperability and metadata enrichment to achieve a consistent and harmonised catalogue that can be used not only by researchers and decision-making bodies, but also integrated into data-driven Arctic, polar, European and global initiatives (e.g. SAON Data Portal, WMO GCW, POLARIN, Arctic PASSION, ENVRI, EOSC).

A dedicated effort has been made to establish a list of essential Earth System Science (ESS) variables relevant to determining environmental change in the Arctic, through the SIOS Core Data (SCD) initiative: time series of data with at least a five-year commitment, representing long-lasting observing capabilities maintained by SIOS partners.

Through data publication guidelines, brokering activities, FAIR data and vocabularies, and consistent semantic relations, SIOS aims to continuously improve interoperability within and across relevant domains.

How to cite: Ferrighi, L., Godøy, Ø., Mardsen, L., Brasseur, Z., and Kivits, D.: Data integration and semantic interoperability framework for the Svalbard Integrated Arctic Earth Observing System., EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-21539, https://doi.org/10.5194/egusphere-egu26-21539, 2026.

EGU26-22942 | Posters on site | ITS1.20/ESSI4.3

Closing the climate services skills gap in Ukraine through competency-based education  

Hanna Lappalainen, Svyatoslav Tyuryakov, Enric Aguilar, Jon Xavier Olano Pozo, Alexander Mahura, Inna Khomenko, Tetiana Dyman, Myroslav Malovanyy, Valeriya Ovcharuk, Kostiantyn Talalaiev, Tetiana Tkachenko, and Yuriy Vergeles

The effective development and use of climate services depend on specialists who possess not only scientific knowledge but also clearly defined, practice-oriented competencies that enable the transformation of climate data into actionable information for decision-making. In Ukraine, climate services remain at an early stage of institutional development, and a persistent skills gap exists between climate information providers and users, particularly in climate-sensitive economic sectors and public administration.

The Erasmus+ project “Multilevel Local, Nation- and Regionwide Education and Training in Climate Services, Climate Change Adaptation and Mitigation” (ClimEd; 2020–2026; http://climed.network) addresses this challenge by implementing a competency-based approach to climate education across multiple levels of learning. Rather than focusing on isolated training activities, the project establishes an integrated education pathway that links postgraduate education, professional development, and public climate literacy.

At the academic level, ClimEd has developed PhD and Master’s programmes in Climate Services, alongside a Master’s programme in Climate Change Adaptation and Mitigation. These programmes emphasise competencies related to climate data management, climate model interpretation, climate product development, sectoral application of climate information, and climate communication. In parallel, targeted professional development programmes support decision-makers and practitioners in sectors such as agriculture, healthcare, urban management, water resources, energy, and construction. Massive open online courses further extend climate literacy to broader audiences.

Course content and competency profiles are informed by a structured needs assessment involving 297 stakeholders from climate-dependent sectors and 48 climate service providers, ensuring that identified skills gaps are translated into concrete learning outcomes and assessment criteria. Teaching and learning approaches prioritise applied learning through project-based, case-based, inquiry-based, and experiential methods, supported by blended and online delivery formats. Common quality principles ensure consistency, accessibility, and alignment between competencies, learning activities, and assessment across institutions.

By systematically embedding required competencies into curricula and training programmes at different qualification levels, ClimEd provides a concrete mechanism for reducing the climate services skills gap in Ukraine. The project demonstrates how competency-based education can strengthen human capacity, improve the usability of climate information, and enhance the integration of climate services into sectoral decision-making, offering a model applicable beyond the Ukrainian context.

How to cite: Lappalainen, H., Tyuryakov, S., Aguilar, E., Olano Pozo, J. X., Mahura, A., Khomenko, I., Dyman, T., Malovanyy, M., Ovcharuk, V., Talalaiev, K., Tkachenko, T., and Vergeles, Y.: Closing the climate services skills gap in Ukraine through competency-based education , EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-22942, https://doi.org/10.5194/egusphere-egu26-22942, 2026.

EGU26-2821 | Posters on site | ITS1.21/ESSI4.5

Environmental geoscience research at the Geological Survey of Canada. 

Gilles Cotteret and Sabrina Bourgeois

For almost two decades, the Geological Survey of Canada has been working to understand the effects of geological resource development on the environment. This research is supported by the Environmental Geoscience Program. The goal of this program is to provide leading-edge scientific information to differentiate the effects of natural resource development on the environment from those of natural processes. The development of new geoscientific approaches serves to support the responsible development and use of Canada's natural resources through informed decision-making.

In this presentation, we take a brief look back at key past activities and focus on the series of new projects that began in 2024.

From 2019 to 2024, the program conducted some 15 projects under five themes: Baseline Characterization; Cumulative Effects; Deep Environments; Emerging Issues; and Biosphere, Hydrosphere, Atmosphere. The range of projects included, among others, induced seismicity, oil sands, geological carbon sequestration, and a global mercury assessment with UNEP.

In its current phase (2024–2029), the program comprises twelve projects divided into four themes: impact assessment, regional assessments, processes, and characterization. Current projects cover topics as varied as the national integration of groundwater knowledge, the use of clumped isotopes to characterize nuclear waste disposal sites, the study of metals in the environment of active metalliferous regions, and the study of aquifer contamination by legacy oil and gas wells on Indigenous lands.

The vastness of the Canadian territory, combined with a resource-rich subsoil, provides the opportunity to carry out a multitude of geoenvironmental projects in support of sound environmental stewardship for the benefit of local communities.

How to cite: Cotteret, G. and Bourgeois, S.: Environmental geoscience research at the Geological Survey of Canada., EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-2821, https://doi.org/10.5194/egusphere-egu26-2821, 2026.

EGU26-3919 | Orals | ITS1.21/ESSI4.5

SmartLake: A smart datalake for short and long tail data types 

Jens Turowski, Gunnar Pruß, Christian Erikson, Tobias Jaeuthe, and Hui Tang

The geosciences are a data-heavy discipline, and a wide range of data types and formats are commonly used, even within the same sub-discipline or working group. For example, in hydrology or geomorphology, geospatial data (e.g., satellite imagery, maps, sample locations) are routinely paired with time-series data (e.g., discharge or precipitation monitoring) and laboratory-derived data from individual samples (e.g., isotope chemistry from water samples). For some data types, widely used community standards exist (e.g., seismic or satellite remote sensing data), stipulating data formats, file types, and relevant metadata. These are known as short-tail data types. Yet, for many data types, either such standards do not exist at all, or several competing standards are used in parallel. These are known as long-tail data types. As a result, research and monitoring data are often not managed and archived according to the FAIR principles, or even get lost as researchers move between positions. Yet many funding agencies require a data management plan and a commitment to open data principles already at the proposal stage. We require a flexible digital infrastructure for data management that (1) can handle the entire data management chain from upload to publication, (2) is modular and scalable in the sense that it can be set up for individual projects, a working group or unit, or entire institutes, (3) is customizable in the sense that it can be set up for different types of data, environments, and tasks, (4) allows for the automation of data management tasks, and (5) can associate rich metadata with individual data files. Here, we introduce SmartLake, a datalake application that integrates a storage environment with a modular metadata catalog and a workflow engine. We describe the concept and architecture of SmartLake and demonstrate that it can handle a broad range of data management tasks in a flexible way. The workflow engine allows the integration of customizable workflows to retrieve data and metadata, perform quality checks, file type conversions, and standard analyses, transform the data into a form suitable for machine learning, and generate data publications. Once set up, SmartLake can, in principle, automatically handle the entire data management pipeline, thereby minimizing the effort required for data management, metadata enrichment, archiving, and publication.
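To make the workflow-engine idea concrete, here is a minimal sketch of the kind of step chain the abstract describes; the names (Workflow, step, the example checks) are hypothetical, since SmartLake's actual API is not shown:

    from dataclasses import dataclass, field
    from typing import Callable

    @dataclass
    class Workflow:
        """A chain of data management steps applied to an uploaded record."""
        name: str
        steps: list = field(default_factory=list)

        def step(self, func: Callable) -> Callable:
            self.steps.append(func)          # register a step in order
            return func

        def run(self, record: dict) -> dict:
            for step in self.steps:          # apply the steps sequentially
                record = step(record)
            return record

    wf = Workflow("upload-to-publication")

    @wf.step
    def quality_check(record):
        record["qc_passed"] = record.get("value") is not None
        return record

    @wf.step
    def enrich_metadata(record):
        record.setdefault("license", "CC-BY-4.0")  # metadata enrichment
        return record

    print(wf.run({"file": "discharge_2024.csv", "value": 3.2}))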

How to cite: Turowski, J., Pruß, G., Erikson, C., Jaeuthe, T., and Tang, H.: SmartLake: A smart datalake for short and long tail data types, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-3919, https://doi.org/10.5194/egusphere-egu26-3919, 2026.

EGU26-5938 | Orals | ITS1.21/ESSI4.5

Advancing community workflows, interdisciplinary collaboration, communication, and trust in field based geologic data systems 

Basil Tikoff, Julie Newman, Thomas F. Shipley, Ellen M. Nelson, Drew Davidson, J. Douglas Walker, Bailey K. Srimoungchanh, Sarah F. Trevino, Cristina Wilson, Claire Martin, Christine Regalla, Cailey Condit, and Nick Roberts

StraboField – part of the StraboSpot digital data system – allows researchers to share primary field data and observations, provide context for sampling, and plot geological maps. This presentation details recent developments within StraboField that facilitate multi-disciplinary studies and increase trust in digital data systems. On the basis of community feedback, we have recently introduced Documents, of which there are three types: Outcrop Summaries, Memos, and Models. All of these Documents are designed to establish trust in the digital data by recording why a particular decision was made. Outcrop Summaries put uncertainty evaluation into the workflow of a field-based geologist and allow the researcher to designate a Critical Outcrop, of which there are four types: Exemplar, Confuser, Disambiguator, and Anchor. Further, geologists can report analogous features observed elsewhere in the world that are guiding their interpretation. Memos consist of five types: 1) Idea; 2) Plan; 3) Question; 4) Summary; and 5) Other (user defined). Users can specify an intended audience for each Memo: Anyone, Collaborators, or an individual scientist. Memos both facilitate collaborative work on the same project and enhance communication between practitioners with different expertise working on similar projects. Models allow geologists to describe one or multiple models so that future observations can be tested against them. Memos and Models enable users to link spots together and to add additional context through notes, photos, sketches, and tags. By including this information in digital data systems, future practitioners working with these datasets will have a clear understanding of how the data were collected and where there may be gaps worth researching. Documents are designed to emphasize and summarize important observations and connections in a field area to aid collaborators and other practitioners. Critically, Documents retain a temporal ordering that records the development of a particular idea or model throughout a field project.

How to cite: Tikoff, B., Newman, J., Shipley, T. F., Nelson, E. M., Davidson, D., Walker, J. D., Srimoungchanh, B. K., Trevino, S. F., Wilson, C., Martin, C., Regalla, C., Condit, C., and Roberts, N.: Advancing community workflows, interdisciplinary collaboration, communication, and trust in field based geologic data systems, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-5938, https://doi.org/10.5194/egusphere-egu26-5938, 2026.

EGU26-6823 | ECS | Posters on site | ITS1.21/ESSI4.5

From static hazard maps to an interactive multi-hazard geospatial analysis platform enabling collaborative digital workflows 

Alessio Patanè, Laura Sandri, Danilo Reitano, Letizia Spampinato, and Giuseppe Puglisi

The increasing availability of probabilistic hazard datasets in the solid Earth sciences requires digital environments that go beyond static map visualization, enabling in-depth spatial analysis, data comparison across multiple hazard contexts, and data download in standard formats.

In this work, we present a web-based geospatial platform designed to support interactive exploration, analysis, and dissemination of hazard data (maps and curves), while remaining extensible to any type of GIS layer. The platform is tested using Mount Etna as a case study and integrates volcanic and seismic hazard assessments derived from established probabilistic models for different hazardous events and their metrics, including lava flow invasion, ground load from volcanic ash fallout, and seismic intensity (Cappello et al., 2025; Scollo et al., 2025; D’Amico et al., 2025). Hazard datasets originally provided in NetCDF format are processed and stored in a spatial database, allowing consistent management of both raster and vector representations of exceedance probabilities across different spatial resolutions. Beyond standard spatial queries, the system enables advanced analytical interactions, such as point-based interrogation of hazard layers with on-the-fly visualization of probability percentiles across different hazardous events and, specifically, across different thresholds of their metrics. Users can also extract and download hazard matrices and map products, supporting quantitative comparison and further offline analysis. By combining geospatial data management with interactive analytical tools, the platform allows researchers from different disciplines to explore complex spatial information in a transparent and reproducible manner. The adoption of standardized web services and modular workflows enhances interoperability and facilitates integration with external infrastructures. Designed in accordance with FAIR data principles, the platform is a flexible digital geoscience tool that can be extended to additional hazard domains and GIS workspaces. This work demonstrates how interactive geospatial analysis workflows can enhance the scientific use of probabilistic hazard information and foster collaboration among hazard modelers, Earth scientists, and stakeholders, in line with the objectives of the EPOS European research infrastructure.
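As a schematic of the point-based interrogation described above (not the platform's actual implementation), the sketch below samples a stack of exceedance-probability grids, one per hazard-metric threshold, at a single map location; grid values, thresholds, origin and spacing are all invented:

    import numpy as np

    thresholds_kpa = [1.0, 5.0, 10.0]      # e.g. ash-load thresholds (assumed)
    rng = np.random.default_rng(0)
    grid = rng.random((3, 100, 100))       # P(exceedance) per threshold and cell

    x0, y0, dx = 490000.0, 4170000.0, 500.0  # assumed grid origin and spacing (m)

    def query_point(x, y):
        """Return the probability-vs-threshold curve at one map location."""
        col = int((x - x0) / dx)
        row = int((y - y0) / dx)
        return {t: float(grid[k, row, col]) for k, t in enumerate(thresholds_kpa)}

    print(query_point(x=512345.0, y=4203210.0))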

How to cite: Patanè, A., Sandri, L., Reitano, D., Spampinato, L., and Puglisi, G.: From static hazard maps to an interactive multi-hazard geospatial analysis platform enabling collaborative digital workflows, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-6823, https://doi.org/10.5194/egusphere-egu26-6823, 2026.

EGU26-7667 | Orals | ITS1.21/ESSI4.5

Jupyter Notebooks as a learning tool in European Plate Observing System (EPOS) for multidisciplinary research 

Jan Michálek, Kety Giuliacci, Valerio Vinciarelli, Rossana Paciello, Daniele Bailo, Teuno Hooijer, Ian van der Neut, and Jean-Baptiste Roquencourt and the EPOS Team (IT developers and Jupyter Notebook contributors)

The European Plate Observing System (EPOS) addresses the problem of homogeneous access to heterogeneous distributed digital assets in geoscience within Europe, following the FAIR principles. EPOS has been a European Research Infrastructure Consortium (ERIC) since 2018, with the goal of building long-term and sustainable infrastructure for solid Earth science. The EPOS Platform was launched into the operational phase in April 2023 and is introducing new ways for cross-disciplinary research, especially for data discovery. Currently, the EPOS Platform, a metadata and semantic-driven system for integrating Data, Software and services, provides access to data and data products from ten different geoscientific areas: Seismology, Near Fault Observatories, GNSS Data and Products, Volcano Observations, Satellite Data, Geomagnetic Observations, Anthropogenic Hazards, Geological Information and Modelling, Multi-scale laboratories and Tsunami Research. 

This presentation details the integration of Jupyter Notebooks into the EPOS Platform. EPOS uses the SWIRRL API, which allows Jupyter Notebooks to be deployed on distributed computing facilities. This implementation enables users to perform advanced processing of datasets directly within the Virtual Research Environment (VRE) of the EPOS ecosystem. We showcase multidisciplinary use cases provided by researchers from various domains that demonstrate efficient data processing workflows and visualizations using EPOS services. Furthermore, we position Jupyter Notebooks as dynamic learning tools: they combine methodological descriptions with executable code that users can modify for specific needs. By leveraging parameterized queries to EPOS web services, users can easily customize data retrieval and facilitate reproducibility by sharing workspace snapshots via GitHub. The example notebooks aim to help young researchers understand typical data processing in individual domains, such as earthquakes and seismic hazard, volcanic eruptions, geomagnetic storms, and anthropogenic hazards, and at the same time can assist experienced researchers in fostering cross-disciplinary research.
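The parameterized-query pattern the notebooks rely on is easy to illustrate; in the sketch below the endpoint URL and parameter names are placeholders rather than the actual EPOS API:

    import requests

    BASE = "https://example.org/epos-api/resources/search"  # hypothetical endpoint

    params = {
        "q": "seismic hazard",      # free-text search term
        "startDate": "2016-08-24",  # users edit these values to customise
        "endDate": "2016-10-30",    # the data retrieval for their own study
        "format": "json",
    }

    response = requests.get(BASE, params=params, timeout=30)
    response.raise_for_status()
    print(len(response.json().get("results", [])))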

How to cite: Michálek, J., Giuliacci, K., Vinciarelli, V., Paciello, R., Bailo, D., Hooijer, T., van der Neut, I., and Roquencourt, J.-B. and the EPOS Team (IT developers and Jupyter Notebook contributors): Jupyter Notebooks as a learning tool in European Plate Observing System (EPOS) for multidisciplinary research, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-7667, https://doi.org/10.5194/egusphere-egu26-7667, 2026.

EGU26-7864 | ITS1.21/ESSI4.5

Potential of the Polish Antarctic Station Dobrowolski for international cooperation within the framework of EPOS ERIC 

M. Lewandowski, W. Miloch, and A. Nawrot

Compared to other continents, Antarctica suffers from a scarcity of geophysical data that would allow for a better understanding of its geological structure, as well as of the isostatic response of the continent to volumetric changes of the ice cover. Antarctica, and in particular its rocky uninhabited oases, could be used for the installation of autonomous geophysical devices measuring and recording unique seismic (both body and surface waves), gravimetric, geomagnetic (including magnetotelluric) and ionospheric data, unaffected by anthropogenic noise sources. Such data could significantly contribute to our understanding of the Earth's internal structure, from the core, through the mantle and crust, to the dynamics of glaciers, as well as of ionospheric processes related to space weather effects.

An example is the rocky oasis of Bunger Hills, located in the Australian Antarctic Territory of East Antarctica, several dozen kilometres from the Southern Ocean. During the IVth Polish Antarctic Research Expedition to the Antoni B. Dobrowolski Station (located in the central part of the oasis), test geophysical measurements in seismology, meteorology, geomagnetism, and ionosphere research were carried out in the summer of 2021/2022. The results obtained are of high quality and clearly indicate the potential of the Dobrowolski Station as a site for autonomous and automatic geophysical stations providing measurement data to global databases. Since the Station is equipped with a concrete pillar, built in 1958/59 for gravimetric measurements (see: https://www.ats.aq/devph/en/apa-database/126), it could also be used to monitor isostatic movements of the Antarctic crust as the ice cover recedes.

Given the expansion of EPOS ERIC beyond continental Europe, the Dobrowolski Station could become a strong node in the Antarctic geophysical infrastructure network, providing high-quality recordings to thematic data exchange platforms (Thematic Core Services, TCS) within EPOS ERIC, as well as to other global data centers.

 

How to cite: Lewandowski, M., Miloch, W., and Nawrot, A.: Potential of the Polish Antarctic Station Dobrowolski for international cooperation within the framework of EPOS ERIC, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-7864, https://doi.org/10.5194/egusphere-egu26-7864, 2026.

EGU26-8086 | ITS1.21/ESSI4.5

Cloud Infrastructure and Methodologies for GeoSciences: From Containers to Machine Learning and Artificial Intelligence 

S. Parafina

The availability of open-access, petabyte-scale geophysical data creates new cross-domain analytical capabilities and challenges. Meeting the challenges of working with massive cross-domain data stores requires assessing existing methodologies and reworking them for an on-demand, distributed computing environment. This presentation examines existing data management and computing practices and introduces a framework for scientific cloud computing in the geosciences. Starting with cloud storage, the framework examines effective ways to leverage computing resources, including containers, serverless functions, and databases. In addition to addressing computing infrastructure, the framework also supports the use of computationally efficient software libraries that can parallelize workflows and leverage machine learning and artificial intelligence.
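As one concrete instance of such a computationally efficient library, the sketch below opens a cloud-hosted dataset lazily in chunks, so a reduction runs chunk by chunk in parallel rather than materializing the full array; the store path, engine options and variable name are assumptions for the example:

    import xarray as xr

    # Open a cloud-hosted store lazily; each chunk becomes an independent task
    # that can run in a container or serverless worker.
    ds = xr.open_dataset("s3://bucket/temperature.zarr", engine="zarr",
                         chunks={"time": 1000})

    # The monthly mean is evaluated chunk-wise in parallel, never loading the
    # petabyte-scale array into memory at once.
    monthly = ds["t2m"].resample(time="1MS").mean()
    monthly.to_netcdf("t2m_monthly.nc")  # writing triggers the computation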

How to cite: Parafina, S.: Cloud Infrastructure and Methodologies for GeoSciences: From Containers to Machine Learning and Artificial Intelligence, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-8086, https://doi.org/10.5194/egusphere-egu26-8086, 2026.

EGU26-8849 | ITS1.21/ESSI4.5

Development of Cross-platform Graphical Interface Software for Climate Data Analysis 

M. Shivach and S. Dubey

Open Science initiatives have made enormous volumes of climate data freely available to researchers worldwide. Yet accessing this wealth of information remains challenging for many scientists. Global climate model outputs and reanalysis datasets typically come in multidimensional NetCDF formats that require programming expertise to analyze effectively. Civil engineers, geologists, urban planners, and other domain specialists often find themselves unable to work with these data independently, despite having the scientific knowledge to extract meaningful insights from them.

Existing command-line tools are undeniably powerful. They can perform sophisticated analyses, but their reliance on complex syntax creates substantial barriers for researchers without programming backgrounds. A problematic gap has emerged between those who can manipulate the data computationally and those who understand its real-world implications. Collaborative research suffers when domain experts cannot readily integrate climate information into their own disciplinary work.

NCexplorer was developed to address this accessibility challenge. The software provides a visual, point-and-click interface that makes climate data analysis feasible for researchers regardless of programming experience. Tasks that previously required scripting knowledge can now be accomplished through organized menus and drag-and-drop workflows. Users can explore datasets, perform statistical calculations, and generate spatial visualizations without writing code.

Practical usability guided the design from the start. Researchers can load NetCDF files, define analysis regions, compute various statistics, and examine results on maps within a single application. Initial deployment has shown promising results. The software successfully handles common analytical tasks including extracting temperature trends for specific locations, calculating climate indices, and comparing multiple datasets. Cross-platform compatibility ensures the tool works across Windows, macOS, and Linux environments typically found in research institutions.
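For context, the kind of task the interface wraps, extracting a regional trend from a NetCDF archive, looks roughly like this in script form (file, variable and coordinate names are illustrative only):

    import xarray as xr

    ds = xr.open_dataset("precip_1980_2020.nc")                  # illustrative file
    region = ds["pr"].sel(lat=slice(28, 32), lon=slice(76, 80))  # analysis region
    annual = region.mean(dim=["lat", "lon"]).resample(time="1YS").sum()
    trend = annual.polyfit(dim="time", deg=1)                    # linear trend fit
    print(trend["polyfit_coefficients"].values)

NCexplorer's menus and drag-and-drop workflows stand in for exactly this kind of boilerplate.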

Several preliminary applications have emerged from early testing. A civil engineer analyzed decades of precipitation patterns to inform infrastructure planning, working entirely through the graphical interface. Another study combined climate model outputs with ground observations for regional validation work. These examples suggest potential applications spanning urban climate assessment, environmental impact studies, and integrated analyses that combine atmospheric data with other geoscience disciplines.

The broader contribution lies in making analytical sophistication accessible to researchers focused on scientific questions rather than computational mechanics. Development continues with plans to expand the range of available analytical operators and create domain-focused documentation. Accessible tools become increasingly important as climate data grows more central to geoscience research across disciplines.

How to cite: Shivach, M. and Dubey, S.: Development of Cross-platform Graphical Interface Software for Climate Data Analysis, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-8849, https://doi.org/10.5194/egusphere-egu26-8849, 2026.

EGU26-10092 | ECS | Posters on site | ITS1.21/ESSI4.5

FAIR GNSS for collaborative Solid Earth science in Iceland: metadata, validation workflows and EPOS dissemination 

Hildur M. Fridriksdóttir, Benedikt G. Ófeigsson, Dalia Prizginiene, Halldór Geirsson, Gudbjartur H. Kristinsson, Nadia K. Kompatscher, Kristín Vogfjord, and Ríkey Júlíusdóttir

Digital Solid Earth science relies on GNSS data services that are interoperable, traceable and reusable across institutions. In practice, GNSS station metadata and RINEX files are highly sensitive to manual handling and heterogeneous conventions. Inconsistencies in equipment histories, identifiers and file conventions can delay integration, reduce trust in downstream products, and hinder open dissemination. These challenges are amplified when coordinating across multiple data owners and legacy archives. 

We present an EPOS-aligned workflow for Icelandic GNSS metadata curation and service implementation developed across IMO, NordVulk and NSII. The work establishes a production-grade metadata and data integration pathway using a central integration layer coupled to EPOS services. Icelandic GNSS data are currently exposed via EPOS VOLC-TCS, while integration with the EPOS GNSS Thematic Core Service (GNSS-TCS) via GLASS (Geodetic Linkage Advanced Software System) is in the final stages of implementation to enable dissemination via the EPOS Data Portal. A key focus is reducing manual intervention and inconsistency while retaining necessary expert review. We describe automated validation and correction steps implemented in custom tooling (Tostools) developed in-house to streamline metadata curation and RINEX compliance, including consistency checks between station metadata, equipment change histories and RINEX content, generation of infrastructure-ready site logs for the M3G metadata service (the GNSS station site log standard used in EPOS/GLASS), and near-automated preparation of DOMES (Directory of MERIT Sites) identifier applications. The workflow also supports staged integration of heterogeneous datasets, including rescue and documentation of legacy University of Iceland campaign measurements and controlled dissemination based on data ownership constraints. 
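A minimal sketch of the kind of consistency check described, assuming hypothetical site-log and RINEX header structures (the actual Tostools logic is in-house and not detailed here):

    from datetime import datetime

    site_log_receivers = [  # equipment history from a (hypothetical) site log
        {"type": "TRIMBLE NETR9", "installed": datetime(2015, 3, 1),
         "removed": datetime(2021, 6, 10)},
        {"type": "SEPT POLARX5", "installed": datetime(2021, 6, 10),
         "removed": None},
    ]

    def expected_receiver(epoch):
        """Receiver the site log says should be recording at a given epoch."""
        for rec in site_log_receivers:
            if rec["installed"] <= epoch and (rec["removed"] is None
                                              or epoch < rec["removed"]):
                return rec["type"]
        return None

    # Values assumed to be parsed from a RINEX "REC # / TYPE / VERS" header line.
    rinex_epoch, rinex_receiver = datetime(2022, 1, 15), "SEPT POLARX5"
    if expected_receiver(rinex_epoch) != rinex_receiver:
        print("Mismatch: flag file for expert review")
    else:
        print("Receiver history consistent with RINEX header")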

As a motivating context, we refer to recent Reykjanes Peninsula deformation studies where dense GNSS observations were central to resolving rapid intrusive processes alongside other datasets, illustrating the value of reliable, well-documented and shareable geodetic data services (Sigmundsson et al., 2024). 

We conclude with practical lessons and recommendations for implementing FAIR and open-science aligned, infrastructure-ready GNSS services in a way that improves efficiency, reduces misunderstandings and accelerates collaboration. 

Reference: Sigmundsson, F., et al. (2024). Fracturing and tectonic stress drive ultrarapid magma flow into dikes. Science, 383, 1228–1235. https://doi.org/10.1126/science.adn2838. 

How to cite: Fridriksdóttir, H. M., Ófeigsson, B. G., Prizginiene, D., Geirsson, H., Kristinsson, G. H., Kompatscher, N. K., Vogfjord, K., and Júlíusdóttir, R.: FAIR GNSS for collaborative Solid Earth science in Iceland: metadata, validation workflows and EPOS dissemination, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-10092, https://doi.org/10.5194/egusphere-egu26-10092, 2026.

EGU26-11410 | Posters on site | ITS1.21/ESSI4.5

Climate-driven surface mass loading and stress modulation in global subduction zones 

Yiting Cai, Roland Bürgmann, and Karine Le Bail

Surface mass redistribution driven by hydrological, oceanic, and atmospheric processes produces time-varying loads on the solid Earth, generating stress perturbations that may influence seismicity. Quantifying how these surface processes interact with tectonic stress accumulation in subduction zones, where the largest earthquakes and associated cascading hazards occur, requires an interdisciplinary integration of Earth system and solid Earth observations, yet remains insufficiently understood. The periodic nature of surface mass loading provides a natural probe of fault sensitivity to modest stress perturbations, enabling the detection of spatially coherent and seasonally varying stress modulation patterns across major subduction zones. Here, we present a global, data-driven framework that integrates GRACE/GRACE-FO satellite gravimetry–derived mass variations, global earthquake focal-mechanism catalogs, and tectonic stress models to investigate how time-dependent surface loads modify fault stress states across subduction margins in the upper 50 km and near the seismogenic plate interface. Using openly available and independently curated datasets within a high-performance computing framework, we compute load-induced stress perturbations at depth and evaluate their orientations relative to the prevailing tectonic stress field to identify conditions under which surface-driven stresses may promote or inhibit fault failure. Our results reveal systematic spatial and temporal patterns linking climate-driven surface processes with megathrust and upper-plate fault behavior, while demonstrating that the seismic response to loading is strongly controlled by tectonic setting. This study also highlights both the opportunities and challenges of interdisciplinary research based on heterogeneous open datasets.
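The abstract does not name the metric used to judge whether a load promotes or inhibits failure; a common choice in such studies is the Coulomb failure stress change, sketched here for a single receiver fault with invented numbers:

    import numpy as np

    def coulomb_change(d_stress, normal, slip_dir, mu_eff=0.4):
        """dCFS = d_tau + mu' * d_sigma_n (tension positive), in Pa."""
        traction = d_stress @ normal
        d_sigma_n = float(normal @ traction)  # normal stress change (unclamping > 0)
        d_tau = float(slip_dir @ traction)    # shear stress change along slip
        return d_tau + mu_eff * d_sigma_n

    # Example: ~1 kPa of unloading resolved on a plane dipping 10 degrees.
    d_stress = np.diag([0.0, 0.0, 1.0e3])     # stress perturbation tensor (Pa)
    dip = np.radians(10)
    normal = np.array([0.0, -np.sin(dip), np.cos(dip)])
    slip_dir = np.array([0.0, np.cos(dip), np.sin(dip)])
    print(f"dCFS = {coulomb_change(d_stress, normal, slip_dir):.0f} Pa")

A positive dCFS brings the fault closer to failure under the assumed effective friction; evaluating such a quantity per focal mechanism and per time-varying load field yields the kind of seasonal stress-modulation pattern the study examines.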

How to cite: Cai, Y., Bürgmann, R., and Le Bail, K.: Climate-driven surface mass loading and stress modulation in global subduction zones, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-11410, https://doi.org/10.5194/egusphere-egu26-11410, 2026.

EGU26-14727 | Posters on site | ITS1.21/ESSI4.5

SEISMO-VRE: a tool to perform an automatic multiparametric investigation of earthquake, volcano eruption and other natural or artificial hazards 

Dedalo Marchetti, Daniele Bailo, Giuseppe Falcone, Jan Michalek, Rossana Paciello, and Alessandro Piscini

The study of the preparation phase of earthquakes is essential to better understand our planet and to assess seismic hazard (Marchetti et al., 2024). However, different approaches are generally developed in isolation, without the possibility of direct comparison. Consequently, even the nature of the anomalies extracted before an earthquake remains a matter of scientific discussion.

In this work, we present SEISMO-VRE, a freely available tool hosted on GitHub (https://github.com/dedalomarchetti/SEISMO-VRE), developed as Jupyter Notebooks with Python or MATLAB kernels to serve a broad user community following the open-science paradigm. The tool performs several analyses of the lithosphere, atmosphere and ionosphere, extracting anomalies and trends for each geo-layer. It produces graphs and tables of the extracted anomalies and, in particular, a summary graph comparing the trends in the lithosphere, atmosphere and ionosphere, suggesting potential interactions between Earth’s geo-layers.
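The notebooks' exact detection rules are not spelled out here; a simple and common choice, shown purely as an illustration, flags samples that fall outside an inter-quartile band around the bulk of a geophysical time series:

    import numpy as np

    rng = np.random.default_rng(1)
    series = np.sin(np.linspace(0, 20, 500)) + 0.1 * rng.standard_normal(500)
    series[300] += 4.0                       # injected anomaly for illustration

    q1, q3 = np.percentile(series, [25, 75])
    iqr = q3 - q1
    k = 1.5                                  # band width, tunable per geo-layer
    anomalies = np.where((series < q1 - k * iqr) | (series > q3 + k * iqr))[0]
    print(anomalies)                         # -> [300]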

Data are retrieved from the European Plate Observing System (EPOS) Platform and integrated with NASA and ESA atmospheric and ionospheric data using a dedicated notebook that we provided in the same public GitHub repository.

Given the ease of reproducibility of SEISMO-VRE across different case studies, we will present results for earthquakes and volcanic eruptions. Case studies will include the 2016 Amatrice-Norcia seismic sequence in Italy (with M6.0 and M6.5 as the largest events), the M6.2 earthquake of 23 April 2025 in the Marmara region, Turkey, the Etna eruption of 3 December 2015 (Volcanic Explosivity Index, VEI = 3), and other cases.

Whenever possible, we will also compare SEISMO-VRE results with published studies to discuss similarities and differences. Overall, we will provide a scientific discussion of the possible reasons for differences in the trends identified for different earthquakes. In fact, epicentre location (at sea or on land), focal mechanism, and magnitude appear to play a major role in the preparation phase of earthquakes.

Finally, we propose this tool not only to provide a universal and shared framework for the multiparametric investigation of earthquakes, volcanic eruptions and other natural and anthropogenic hazards, but also to highlight the advantages of using the EPOS pan-European research Infrastructure platform.

 

References:

  • Marchetti, D., Bailo, D., Falcone, G., Michalek, J., Paciello, R., and Piscini, A. (2025a). SEISMO-VRE, GitHub repository. https://github.com/dedalomarchetti/SEISMO-VRE
  • Marchetti, D., Bailo, D., Michalek, J., Paciello, R., Falcone, G., and Piscini, A. (2025b). A Multiparametric Investigation of an Earthquake by a Jupyter Notebook: The Case Study of the Amatrice-Norcia Italian Seismic Sequence 2016-2017. In Proceedings of the Computational Science and Its Applications – ICCSA 2025 Workshops. https://doi.org/10.1007/978-3-031-97657-5_19
  • Marchetti, D., Yuan, Y., and Zhu, K. (2024). Editorial of Special Issue “Remote Sensing Observations to Improve Knowledge of Lithosphere–Atmosphere–Ionosphere Coupling during the Preparatory Phase of Earthquakes”. Remote Sensing, 16, 1064. https://doi.org/10.3390/rs16061064

How to cite: Marchetti, D., Bailo, D., Falcone, G., Michalek, J., Paciello, R., and Piscini, A.: SEISMO-VRE: a tool to perform an automatic multiparametric investigation of earthquake, volcano eruption and other natural or artificial hazards, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-14727, https://doi.org/10.5194/egusphere-egu26-14727, 2026.

EGU26-16433 | Posters on site | ITS1.21/ESSI4.5

Acquiring, sharing and exploiting long-period and multi-parametric instrumental data for landslide science: the French EPOS-France vision, and its implementation at the European level. 

Jean-Philippe Malet, Séverine Bernardie, Catherine Bertrand, Muriel Gasc, Stéphanie Gautier-Raux, Clément Hibert, Pascal Lacroix, Thomas Lebourg, Mathilde Radiguet, and Maurin Vidal

Documenting landslide activity over long periods using monitoring standards (sensors, acquisition rates, quality control) is critical for understanding landslide forcing factors, validating process-based models, identifying the effect of climate change on landslide behavior, and ultimately defining warning thresholds.

These goals underpin the mission of the Thematic Group “Landslides” currently being set up by several French institutes (CNRS, BRGM, CEREMA, IRD) within the national Solid Earth research infrastructure EPOS-France.

The thematic group has two objectives organized in two Specific Actions (SAs): 

  • SA#1 - setting up a permanent observatory of continuously moving large landslides using a multi-instrumented approach
  • SA#2 – setting up an observatory of landslide hot moments, corresponding to monitoring campaigns of specific landslide acceleration periods in order to learn from such specific extreme events.

The two SAs are built upon the sensor technologies and information system of the French Landslide Observatory OMIV (Observatoire Multi-Disciplinaire des Instabilités de Versants), a service of the French National Centre for Scientific Research (CNRS) in charge of deploying, acquiring, exploiting and disseminating multi-parametric sensor data over several large landslides. For nearly 20 years, OMIV has developed standards for sensor types, using both high-grade and low-cost sensing to construct reference, spatially dense monitoring time series. The service provides open access to records of landslide kinematics, landslide micro-seismicity, landslide hydro-meteorology and landslide hydro-geophysics; combined, these categories of observations are unique worldwide for long-term landslide observation. OMIV currently supervises the acquisition and dissemination of sensor data on 8 permanent unstable slopes (Avignonet/Harmallière, La Clapière, Séchilienne, Super-Sauze/La Valette, St-Eynard, Pégairolles, Vence, Villerville) and on unstable slopes currently experiencing gravitational crises (Viella, Marie-sur-Tinée, Aiguilles). The service is organized around the dissemination of qualified data (in international reference file formats) and products for 5 categories of observation (Geodesy, Seismology, Hydrology, Meteorology, Hydrogeophysics). For each category of observation, dedicated FAIR data repositories and access portals are available, and automated processing methods have been proposed to meet the needs of the landslide research community. The products being generated are time series of GNSS and total station positions, catalogues of endogenous landslide micro-seismicity, resistivity tomography datasets, and hydro-meteorological parameters. These observations aim to contribute to identifying the key controlling parameters of different landslide types (e.g. soft/hard rock, cohesion/friction, slip/fracture, localized/diffuse damage) and to monitoring their evolution in time and space (deceleration or acceleration according to the triggering factors, sliding-flowing transition).

The objective is to present the EPOS-France landslide thematic group’s strategy for data acquisition, qualification, dissemination and exploitation, and to discuss ideas for setting up, in the mid-term, a European landslide Thematic Core Service (Landslide-TCS) within EPOS ERIC.

How to cite: Malet, J.-P., Bernardie, S., Bertrand, C., Gasc, M., Gautier-Raux, S., Hibert, C., Lacroix, P., Lebourg, T., Radiguet, M., and Vidal, M.: Acquiring, sharing and exploiting long-period and multi-parametric instrumental data for landslide science: the French EPOS-France vision, and its implementation at the European level., EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-16433, https://doi.org/10.5194/egusphere-egu26-16433, 2026.

EGU26-17928 | ECS | Orals | ITS1.21/ESSI4.5

From PTHA to Probability of Safe Evacuation: An Agent-Based Modelling (ABM) Use Case for EPOS Integrated Core Services - Distributed 

Saeed Soltani, Fatemeh Jalayer, Stefano Lorito, Manuela Volpe, Julie Dugdale, Hossein Ebrahimian, Saman Ghaffarian, and Alice Abbate

This work introduces a methodology that links Probabilistic Tsunami Hazard Analysis (PTHA) with social-behavioral simulation to support risk-informed decision-making, where safe evacuation probabilities are evaluated by integrating hazard likelihood with agent-based estimates of evacuation success or failure. The proposed model integrates heterogeneous digital assets, including probabilistic hazard scenarios, exposure datasets and statistically derived behavioral parameters, within a unified workflow, with several inputs already available or foreseen through the EPOS data portal. The methodological core follows a Probabilistic Tsunami Risk Assessment (PTRA) formulation and is embedded within a Monte Carlo integration scheme, where evacuation metrics are derived from repeated simulations under random realizations of uncertain inputs.
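A Monte Carlo sketch of that formulation, with run_abm() standing in for the real agent-based model and all numbers invented, might look as follows:

    import random

    scenarios = [   # (annual rate, tsunami arrival time in minutes), illustrative
        (1e-3, 35.0),
        (1e-4, 18.0),
    ]

    def run_abm(arrival_time, reaction_time, walking_speed):
        """Placeholder for an ABM run; returns True if evacuation succeeds."""
        travel_minutes = 1200.0 / walking_speed / 60.0   # 1200 m to safety
        return reaction_time + travel_minutes <= arrival_time

    n = 10_000
    rate_failure = 0.0
    for rate, arrival in scenarios:
        failures = sum(
            not run_abm(arrival,
                        reaction_time=random.gauss(10.0, 3.0),   # minutes
                        walking_speed=random.uniform(0.8, 1.4))  # m/s
            for _ in range(n)
        )
        rate_failure += rate * failures / n   # hazard rate times P(failure)

    print(f"Annual rate of failed evacuation: {rate_failure:.2e}")

Sampling the behavioral uncertainties inside the loop, rather than fixing them at nominal values, is what turns the ABM output into a probabilistic evacuation metric.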

From a digital infrastructure perspective, this work explores the set-up and requirements for developing a prototype model as a use case for the EPOS Integrated Core Services-Distributed (ICS-D). Executing PTRA with ABM requires access to distributed computing resources to support large numbers of large-scale agent-based simulations, server-side storage for hazard scenarios and exposure datasets, and statistical analysis and machine learning tools for the semi-automated transformation of raw information into meaningful behavioral inputs. Simulation outputs and geospatial products are then generated through Python- and GIS-compatible visualization workflows.

Looking ahead, such a configuration offers a basis for a probabilistic tsunami evacuation modelling service capable of incorporating cascading multi-hazard effects, such as earthquake-induced damage that directly influences evacuation dynamics. By explicitly representing individuals, infrastructure, and their interactions within a shared environment, it enables iterative updating of time-dependent conditions. This provides a natural pathway toward a Digital Twin-oriented framework for tsunami evacuation, supporting adaptive, decision-relevant risk analysis, and aligns with the objectives of the Tsunami Thematic Core Service of EPOS.

How to cite: Soltani, S., Jalayer, F., Lorito, S., Volpe, M., Dugdale, J., Ebrahimian, H., Ghaffarian, S., and Abbate, A.: From PTHA to Probability of Safe Evacuation: An Agent-Based Modelling (ABM) Use Case for EPOS Integrated Core Services - Distributed, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-17928, https://doi.org/10.5194/egusphere-egu26-17928, 2026.

EGU26-18235 | ECS | Posters on site | ITS1.21/ESSI4.5

Earthquake Preparatory Process in pull-apart systems: reservoir-triggered seismicity in Castanhão, NE Brazil 

Helena Ciechowska, Beata Orlecka-Sikora, Łukasz Rudziński, Aderson do Nascimento, José Fonseca, and Alessandro Vuan

Man-made changes to the environment can pose a risk of triggering seismic activity in the affected regions. Seismicity can be triggered by many factors, such as underground mining, hydrocarbon extraction, CO2 sequestration, and wastewater injection. One such factor is reservoir impoundment, related to artificially created bodies of water.

Artificial reservoirs play a significant role in the modern environment. They are built for flood prevention, water storage, irrigation, hydropower, and other purposes. Damming a river, however, can put local communities at risk. Such a case took place in 1967 with the Koyna earthquake, which was triggered by reservoir impoundment and caused over 200 fatalities, leaving a few thousand people injured.

In this study, we investigate Reservoir-Triggered Seismicity (RTS) at the biggest artificial lake in the State of Ceará, NE Brazil: the Castanhão Reservoir. For this purpose, we employ tools available within the EPOS EPISODES Platform, the PyMPA template-matching package, Hypo71, HypoDD, and KIWITool. We observed 227 earthquakes at the study site between August 2008 and December 2009, with moment magnitudes ranging from 0.0 to 2.7. We also investigate the role of pore-pressure variation in triggering earthquakes within the reservoir.
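As background, the template matching implemented by packages such as PyMPA rests on normalized cross-correlation of a known event waveform against continuous data; the sketch below shows the bare operation on synthetic data and is not PyMPA's actual code:

    import numpy as np

    def norm_xcorr(template, trace):
        """Normalized cross-correlation of a template along a longer trace."""
        t = (template - template.mean()) / (template.std() * len(template))
        scores = np.empty(len(trace) - len(template) + 1)
        for i in range(len(scores)):
            w = trace[i:i + len(template)]
            scores[i] = np.sum(t * (w - w.mean())) / (w.std() + 1e-12)
        return scores

    rng = np.random.default_rng(0)
    trace = rng.standard_normal(2000)        # stand-in for continuous data
    template = trace[700:800].copy()         # a "known event" buried in noise
    cc = norm_xcorr(template, trace)
    print(cc.argmax(), round(float(cc.max()), 3))   # -> 700 1.0

Detections are then declared where the correlation exceeds a threshold, which is how events below the conventional detection level of a network can be recovered.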

Our results show that pore-pressure changes in the underlying medium can cause swarm-like seismicity in the vicinity of reservoirs located within pull-apart systems. We compare our results with other known sites in similar tectonic settings. The study shows that pull-apart basins are prone to reservoir-triggered seismicity and should be treated as such during seismic hazard assessment. Such an investigation requires a multidisciplinary approach spanning Solid Earth disciplines such as seismology, geology and tectonics, as well as hydrology and modelling. The results of this study are being prepared for publication as one of the new EPISODES on the EPOS EPISODES Platform.

How to cite: Ciechowska, H., Orlecka-Sikora, B., Rudziński, Ł., do Nascimento, A., Fonseca, J., and Vuan, A.: Earthquake Preparatory Process in pull-apart systems: reservoir-triggered seismicity in Castanhão, NE Brazil, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-18235, https://doi.org/10.5194/egusphere-egu26-18235, 2026.

EGU26-18909 | Orals | ITS1.21/ESSI4.5

A brief overview of the open and interoperable National Satellite Ground Motion Service for the Italian territory developed through the GeosciencesIR initiative 

Claudio De Luca, Manuela Bonano, Francesco Casu, Carlo Cipolloni, Maria Pia Congi, Barbara Dessì, Marco Gerardi, Luca Guerrieri, Riccardo Lanari, Gabriele Leoni, Michele Manunta, Francesco Menniti, Giovanni Onorato, Daniele Spizzichino, and Ivana Zinno

The GeoSciencesIR project, funded through the Italian PNRR initiative, aims at establishing a dedicated research infrastructure for the Italian Network of Geological Surveys (RISG), enhancing collaboration between national and regional geological services. Coordinated by ISPRA and involving universities and research institutions across Italy, GeoSciencesIR focuses on harmonizing geological information and services within a cloud-based infrastructure designed in agreement with FAIR principles and INSPIRE standards. This infrastructure seeks to improve access to interoperable data and analysis tools for end users and fosters sustained capacity building and knowledge exchange within the Earth science community. Within this framework, the Institute for Electromagnetic Sensing of the Environment of the Italian National Research Council (IREA-CNR) is leading the implementation of a national Satellite Ground Motion Service (SGMS) aimed at supporting Italian regional authorities, autonomous provinces and other institutional stakeholders.

This work presents the SGMS, which has been designed to routinely generate ground displacement time series from the SAR images acquired by the European Copernicus Sentinel-1 constellation (and, in the future, by other SAR missions such as NISAR and ROSE-L). SGMS operates over the Italian territory with a three-month latency. By utilizing dedicated computing and storage resources, it achieves an update frequency for displacement time series three times higher than that of the European Ground Motion Service, while providing a spatial resolution of the final products of about 30 meters. Moreover, starting from the radar Line-of-Sight (LOS) deformation measurements retrieved from ascending and descending orbit SAR acquisitions, SGMS will provide displacement time series and mean velocity maps for the vertical and East-West deformation components. Finally, all SGMS products are conceived to be openly accessible and fully compliant with the FAIR principles.
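The vertical/East-West decomposition mentioned above reduces, per pixel, to a small linear inversion of the ascending and descending LOS measurements; in the sketch below the viewing geometry, sign conventions and displacement values are illustrative only, not the service's actual parameters:

    import numpy as np

    def los_unit(incidence_deg, heading_deg):
        """(east, up) components of the LOS direction; the near-zero N-S
        sensitivity of right-looking SAR is neglected here."""
        inc, head = np.radians(incidence_deg), np.radians(heading_deg)
        return np.array([-np.sin(inc) * np.cos(head), np.cos(inc)])

    A = np.vstack([los_unit(39.0, -10.0),     # ascending geometry (assumed)
                   los_unit(39.0, -170.0)])   # descending geometry (assumed)
    d_los = np.array([-0.012, 0.004])         # measured LOS displacements (m)

    d_east, d_up = np.linalg.solve(A, d_los)  # invert the 2x2 system per pixel
    print(f"East: {d_east * 1000:.1f} mm, Up: {d_up * 1000:.1f} mm")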

A key aspect of the SGMS is its strong complementarity and interoperability with European research infrastructures, particularly with the EPOS Satellite Data Thematic Core Service and its EPOSAR component. Both SGMS and EPOSAR are based on the P-SBAS DInSAR approach, ensuring methodological consistency and comparability of the derived deformation products. Within this framework, EPOSAR provides validated ground deformation products over selected areas of interest around the world, supporting detailed scientific analyses, while SGMS delivers continuous and regular updates over the entire Italian territory, addressing operational and institutional monitoring needs at national and regional scales.

Furthermore, the adoption of FAIR principles and INSPIRE-compliant standards in the design of the SGMS, building on the EPOSAR experience, enables full interoperability between the two services, allowing products to be shared, accessed and reused across platforms. This synergy not only enhances the overall informational content, by enlarging the availability of satellite-derived deformation measurements and other geoscientific data, but also significantly broadens the user base, extending it from the scientific community to public bodies and national authorities.

 

This research was partially funded by HE EPOS-ON (GA 101131592) and by the European Union-NextGenerationEU through the GeoSciencesIR project – PNRR M4C2 Investimento 3.1 - IR00000037.

How to cite: De Luca, C., Bonano, M., Casu, F., Cipolloni, C., Congi, M. P., Dessì, B., Gerardi, M., Guerrieri, L., Lanari, R., Leoni, G., Manunta, M., Menniti, F., Onorato, G., Spizzichino, D., and Zinno, I.: A brief overview of the open and interoperable National Satellite Ground Motion Service for the Italian territory developed through the GeosciencesIR initiative, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-18909, https://doi.org/10.5194/egusphere-egu26-18909, 2026.

EGU26-19798 | Orals | ITS1.21/ESSI4.5

Enhancing and Sustaining European Geosphere Services: Geo-INQUIRE’s Final Phase of Training, Transnational Access, and FAIR Data Integration 

Angelo Strollo, Fabrice Cotton, Mateus Litwin Prestes, Elif Tuerker, and Stefanie Weege and the Geo-INQUIRE project management board

Geo-INQUIRE* (Geosphere INfrastructures for QUestions into Integrated REsearch) is an EU-funded project running from October 2022 to September 2026. The project aims to enhance access to geoscientific data, products, services, and computing resources, thereby enabling open, interdisciplinary, and data-driven research across the geosphere. By integrating and strengthening European and pan-European research infrastructures, Geo-INQUIRE addresses some of the challenges posed by heterogeneous data formats, new data types, and disciplinary silos that increasingly accompany modern, data-intensive science.

A central objective of Geo-INQUIRE is to foster cross-fertilization and long-term collaboration among major European research infrastructures and initiatives, including EPOS ERIC, EMSO ERIC, ECCSEL ERIC, ChEESE CoE, and the ARISE infrasound community. Through coordinated European efforts, the project promotes the harmonisation of data policies, interoperability frameworks, and service provision. This contributes to the development and adoption of global standards for FAIR (Findable, Accessible, Interoperable, Reusable) geoscientific data and services. Geo-INQUIRE also addresses the rapidly evolving data management policies across communities, and the definition of common Key Performance Indicators for infrastructure governance.

Geo-INQUIRE leverages complementary strengths across solid Earth, marine, atmospheric, and subsurface research domains, combining observational data, advanced modelling, and high-performance computing resources. Users benefit from integrated FAIR-compliant data collections, interoperable workflows, scalable computing services, and a dedicated Transnational Access programme providing hands-on access to key testbeds, facilities, and HPC-demanding computational workflows. This enables users to perform advanced experiments, simulations, and methodological developments. Training, workshops, and summer schools provide further support for capacity building and the adoption of open and reproducible research practices, with a strong focus on promoting Equity, Diversity, and Inclusion (EDI). 

Now, in its final implementation year, Geo-INQUIRE is consolidating and assessing the outcomes of its activities, including new multidisciplinary datasets generated via Transnational Access, enhanced data services, interoperable workflows, and training materials. These results are being finalised, evaluated, and progressively handed over to long-term, sustainable European research infrastructures to ensure continuity, reuse, and lasting impact beyond the project lifetime. This final phase demonstrates how time-limited collaborative projects can deliver durable contributions to the European and global geoscience research landscape by embedding innovation within established, sustainable infrastructures.

* Geo-INQUIRE is funded by the European Union (GA 101058518)

How to cite: Strollo, A., Cotton, F., Litwin Prestes, M., Tuerker, E., and Weege, S. and the Geo-INQUIRE project management board: Enhancing and Sustaining European Geosphere Services: Geo-INQUIRE’s Final Phase of Training, Transnational Access, and FAIR Data Integration, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-19798, https://doi.org/10.5194/egusphere-egu26-19798, 2026.

EGU26-19888 | Posters on site | ITS1.21/ESSI4.5

How open and restricted data can coexist in a Data Space 

Ivette Serral, Joan Masó, Raul Palma, and Berta Giralt

A Data Space is a digital environment that enables the reliable exchange of data while retaining sovereignty and ensuring trust and security under a set of mutually agreed rules. While Spatial Data Infrastructures (SDIs) focused on providers opening their data to everybody, Data Spaces focus on a more symmetrical and distributed data exchange among participants. They are specifically designed for sharing restricted or sensitive data, respecting privacy, and supporting private companies in the digital economy.

The European Commission is promoting the creation of up to 15 common European Data Spaces that are expected to bring together relevant data infrastructure and governance frameworks in strategic sectors as part of the European Strategy for Data. The aim is to face global challenges and overcome legal and technical barriers to data sharing by combining the necessary tools and services in an interoperable and reusable way. Among these, the Green Deal Data Space (GDDS) supports the Green Deal priority actions in terms of sharing high value and high quality datasets for biodiversity preservation, zero pollution, circular economy, climate change mitigation, deforestation reduction, smart mobility and environmental compliance.

Most environmental data within the GDDS originates from public administrations and is mainly open, except for GDPR-protected or sensitive species data. However, data from commercial activities, such as soil markets, farming, and textile recycling, is considered proprietary and therefore restricted. The GDDS should be designed and built respecting European values and applying the FAIR principles (Findable, Accessible, Interoperable, Reusable). It is also a goal to interconnect fragmented and cross-domain data from public and private sectors, as well as citizen-generated sources, while maintaining a balance between open and restricted data. This contribution explores how SDI fundamentals for open data can be combined with Data Space technologies for restricted data, ensuring the interests of all actors.

The architecture initially adopted by the GDDS is based on a piece of software called the “data space connector”, which follows standards defined by the International Data Space Association. The connector provides access to restricted data based on traditional authentication or a Decentralized Claims Protocol system complemented by digital contracts. Only authorized actors can use the data. In SAGE, we are also using this architecture to share open data served through APIs.
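
To make the coexistence concrete, the following minimal Python sketch is purely illustrative: it does not use the IDSA connector API, and every name in it is hypothetical. It shows a single catalogue serving open assets to anyone while releasing restricted assets only against a presented contract claim.

    # Purely illustrative, hypothetical names throughout; a real GDDS connector
    # implements the IDSA protocols with signed contracts and verifiable claims.
    from dataclasses import dataclass, field

    @dataclass
    class Asset:
        uri: str
        open_access: bool
        payload: dict
        contracts: set = field(default_factory=set)  # contracts granting access

    CATALOGUE = {
        "soil-moisture": Asset("urn:gdds:soil-moisture", True, {"unit": "m3/m3"}),
        "farm-yield": Asset("urn:gdds:farm-yield", False, {"unit": "t/ha"},
                            contracts={"contract-42"}),
    }

    def request_asset(asset_id: str, presented_claims: set) -> dict:
        """Serve open assets to anyone; restricted ones only against a claim."""
        asset = CATALOGUE[asset_id]
        if asset.open_access or (asset.contracts & presented_claims):
            return asset.payload
        raise PermissionError(f"no contract covers {asset.uri}")

    print(request_asset("soil-moisture", set()))          # open: served
    print(request_asset("farm-yield", {"contract-42"}))   # contract: served
    # request_asset("farm-yield", set()) would raise PermissionError

The single catalogue is the point of the sketch: openness becomes a per-asset policy rather than a property of the whole infrastructure, which is how open SDI-style data and restricted commercial data can coexist behind one interface.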

Due to the heterogeneous nature of data in the GDDS, a precise understanding of the meaning of this data is of paramount importance. Thus, semantics and well-known ontologies play an important role in Data Spaces. In SAGE, we propose to use Essential Variables (EVs) as a common language to describe data. Previous work in AD4GD used Essential Biodiversity Variables together with the I-ADOPT ontology framework for metadata and data description. This work will be expanded to the remaining EVs, helping to break down the silos among data domains.

This research is conducted in the SAGE project, co-funded by the European Union from the Digital Europe Programme (DIGITAL) under grant agreement Nº 101195471; some parts were initiated under the AD4GD EC HORIZON.2.6 project (Nº 101061001).

How to cite: Serral, I., Masó, J., Palma, R., and Giralt, B.: How open and restricted data can coexist in a Data Space, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-19888, https://doi.org/10.5194/egusphere-egu26-19888, 2026.

EGU26-20612 | ECS | Posters on site | ITS1.21/ESSI4.5

The EMOTION Web-portal: Geochemical Data for Geothermal Exploration. 

Antonio Randazzo, Giancarlo Tamburello, Alessandro Frigeri, Manuela Sbarra, Barbara Cantucci, Daniele Cinti, Dmitri Rouwet, Emanuela Bagnato, Dino Di Renzo, Giovannella Pecoraino, Nunzia Voltattorni, Francesca Zorzi, Carmine Apollaro, Donato Belmonte, Carlo Cardellini, Franco Tassi, Stefania Venturi, Giovanni Vespasiano, and Monia Procesi

Fluid geochemistry constitutes a cost-effective and fairly reliable tool in the first phase of geothermal exploration, providing insights into several characteristics of geothermal systems, such as typology, temperature, and the extent of water recharge areas. Accordingly, a free-access web portal offering detailed geochemical and isotopic data on geothermal fluids (thermal springs, mineral waters, gas-rich waters, and gas vents) can facilitate and accelerate preliminary reconnaissance stages of geothermal exploration. At the same time, a public data portal can promote data sharing and reuse. However, the intrinsic heterogeneity and variability in format and provenance of geochemical data hinder the development and widespread adoption of web portals for geochemical data. As a consequence, geochemical data web portals for geothermal fluids are currently limited in scope and not comprehensive.

In the framework of the EMOTION project (funded by the INGV-MUR Pianeta Dinamico Project), we have created the EMOTION web portal, an innovative free-access national portal designed to centralise, standardise, visualise and distribute geochemical and isotopic data of geothermal manifestations in Italy. The EMOTION web portal has been specifically designed to: 1) harmonise existing geochemical and isotopic data and upload newly acquired and updated data through dedicated data and metadata structures; 2) standardise and homogenise both geochemical and isotopic data with geospatial parameters; 3) store data and metadata in a centralised, structured and dedicated database; 4) provide interactive maps for intuitive spatial data visualisation and customisable geochemical plots. The web portal assembles about 4,500 samples of fluids of geothermal interest that can be visualised geographically by type category and compositional information, as the user prefers.

This web portal upholds Open Science and the FAIR principles, i.e., Findable, Accessible, Interoperable, and Reusable. It has been developed using the Free and Open Source Software (FOSS) R project for statistical computing and specific modules for interactive exploratory data analysis, such as “Shiny”, “Plotly” and “Leaflet”. To encourage accessibility, openness and repeatability, all source code as well as (meta)data are publicly available, and access to the web portal is free of charge. (Meta)data use a formal, well-known and accessible language, promoting interoperability and reusability with applications for storing, analysing and processing. Interoperability with existing data infrastructures, such as the European Plate Observing System (EPOS), can also be ensured through a dedicated web service that provides standardised access to data and metadata.

Besides supporting and facilitating geothermal exploration, the EMOTION web portal serves as a scalable model for analogous initiatives. Moreover, the web portal promotes geochemistry as a bridge to other disciplines in the investigation of all properties of geothermal reservoirs, and can serve as an essential component of monitoring to ensure efficient and sustainable energy extraction, encouraging stakeholders and researchers to create increasingly holistic infrastructures and platforms to achieve shared goals.

 

How to cite: Randazzo, A., Tamburello, G., Frigeri, A., Sbarra, M., Cantucci, B., Cinti, D., Rouwet, D., Bagnato, E., Di Renzo, D., Pecoraino, G., Voltattorni, N., Zorzi, F., Apollaro, C., Belmonte, D., Cardellini, C., Tassi, F., Venturi, S., Vespasiano, G., and Procesi, M.: The EMOTION Web-portal: Geochemical Data for Geothermal Exploration., EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-20612, https://doi.org/10.5194/egusphere-egu26-20612, 2026.

EGU26-20646 | ECS | Orals | ITS1.21/ESSI4.5 | Highlight

How to Train Your Tsunami Emulator: From Open Science and Research Infrastructure to Stakeholders' Needs 

Naveen Ragu Ramalingam, Alice Abbate, Erlend Briseid Storrøsten, Gareth Davies, Andrea Di Stefano, Stefano Lorito, Manuela Volpe, Steven Gibbons, Fabrizio Romano, and Finn Løvholt

Modern tsunami hazard assessment requires moving beyond slow high-fidelity simulations toward scalable hybrid frameworks that integrate physics-based numerical modelling with machine learning (ML) emulation. To ensure these "tsunami emulators" are trusted by stakeholders for tasks like hazard assessment, evacuation planning and real-time forecasting, they must be developed through transparent, reproducible, yet tailored workflows. We present our approach to building and testing tsunami inundation emulators designed for rapid probabilistic inundation assessment.

This work utilises a large simulation dataset, derived from European research projects and computing infrastructure, for training our emulator. The dataset will be made available on the CINECA-hosted Simulation Data Lake (SDL) linked to the Geo-INQUIRE and EPOS projects, along with code in an open repository, to allow other researchers to reproduce results, test the emulator, and benchmark new ML models against it.

Through rigorous testing and benchmarking for an application at inundation sites in Sicily, we demonstrate the emulator's performance against full ensembles of numerical simulations and importance-sampling Monte Carlo methods. Our emulation framework enables uncertainty quantification of the emulator itself, which is essential for trust and reliability in operational settings. The resulting products include probabilistic hazard maps, evacuation maps, and inundation forecasts, which are directly actionable for stakeholders. This example showcases a scalable path for integrating AI into solid Earth science using upcoming research infrastructures, helping bridge the gap between open science and real-world disaster resilience.
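
The abstract does not prescribe an architecture, so the following PyTorch sketch is only a generic stand-in: a small multilayer perceptron mapping assumed earthquake source parameters to inundation depths at fixed coastal points, trained on a synthetic placeholder for the simulation ensemble, with a deep ensemble providing a crude uncertainty proxy.

    # Generic stand-in (assumptions, not the authors' architecture): an MLP
    # emulating inundation depth at fixed coastal points from source parameters,
    # with a deep ensemble as a crude uncertainty proxy. Data are synthetic.
    import torch
    import torch.nn as nn

    N_SRC, N_PTS = 5, 64        # assumed source parameters / output points

    def make_emulator():
        return nn.Sequential(
            nn.Linear(N_SRC, 128), nn.ReLU(),
            nn.Linear(128, 128), nn.ReLU(),
            nn.Linear(128, N_PTS))          # inundation depth per point

    X = torch.rand(2000, N_SRC)             # placeholder simulation inputs
    y = torch.rand(2000, N_PTS)             # placeholder simulated depths

    ensemble = [make_emulator() for _ in range(5)]
    for model in ensemble:
        opt = torch.optim.Adam(model.parameters(), lr=1e-3)
        for _ in range(200):                # abridged training loop
            opt.zero_grad()
            nn.functional.mse_loss(model(X), y).backward()
            opt.step()

    x_new = torch.rand(1, N_SRC)            # a new earthquake scenario
    preds = torch.stack([m(x_new) for m in ensemble])
    mean, spread = preds.mean(dim=0), preds.std(dim=0)  # prediction + UQ proxy

A real emulator would of course train on the SDL-hosted simulation database and validate the ensemble spread against held-out simulations before any operational use.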

How to cite: Ragu Ramalingam, N., Abbate, A., Storrøsten, E. B., Davies, G., Di Stefano, A., Lorito, S., Volpe, M., Gibbons, S., Romano, F., and Løvholt, F.: How to Train Your Tsunami Emulator: From Open Science and Research Infrastructure to Stakeholders' Needs, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-20646, https://doi.org/10.5194/egusphere-egu26-20646, 2026.

EGU26-21580 | ECS | Posters on site | ITS1.21/ESSI4.5

Sub-bottom controls on sulphurous seepage in the Mangalia Marine Protected Area, Western Black Sea 

Irina-Marilena Stanciu, Adrian Popa, Valentin Poncoş, Gabriel Ion, Constantin Lazăr, Andrei Rareş Stoian, and Adrian Teacă

The underwater sulphurous seeps in the Mangalia Natura 2000 Marine Protected Area create a distinctive habitat in the Western Black Sea, driven by interactions between subsurface geology, fluid migration, sediment dynamics, and benthic biological communities.

Building on previous geophysical and geochemical investigations carried out by the authors in the Mangalia area, we present newly acquired high-resolution sub-bottom profiler (SBP) data, integrated with regional solid Earth datasets accessed through the European Plate Observing System (EPOS) Data Portal, aiming to provide new insights into the shallow subsurface characterization of the seep field.

The SBP profiles, with penetration depths of up to 20-25 m, reveal continuous, well-stratified sedimentary units overlying a high-amplitude acoustic boundary interpreted as consolidated substrata. These stratified successions are locally affected by reflector roughening, subtle flexuring, minor truncations, and zones of reduced acoustic penetration, indicating gas-charged intervals and shallow fluid escape processes, including pockmark development.

By integrating our previous geological, geophysical and geochemical investigations and the newly acquired high‑resolution SBP profiles with related datasets (geological, tectonic, geodynamic) accessed via the EPOS Data Portal, we placed the Mangalia seep field within a broader, interoperable geoscientific framework that enhanced interpretation. This cross‑disciplinary linkage allowed us to test hypotheses about the structural controls on seep localization and to correlate our surveys’ findings with deeper geology and tectonics.

Acknowledgements:

Early-stage research on the geology and tectonics, and on habitat mapping, of the study area was carried out by Irina-Marilena Stanciu and Adrian Popa as part of their PhD studies (both now completed) at the Doctoral School of Geology, Faculty of Geology and Geophysics, University of Bucharest.

In-situ data acquisition was carried out within the PN23300202 (Development of ecosystem-based approaches for the sustainability of marine biological resources (jellyfish, macrophyte algae, mollusks) and production methods to expand their biotechnological use) and PN23300101 (Management and Monitoring of the Marine Environment, as Part of the National Strategy for Assessing Regional and Global Climate Change on the Romanian Continental Shelf of the Black Sea: A Comprehensive Analysis Based on the Development of Geological, Geophysical, Biological, and Geochemical Maps at a Scale of 1:50000) research projects of the CORE Program of the National Institute of Marine Geology and Geo-ecology – GeoEcoMar (Contract No. 4N/30.12.2022), financed by the Romanian Ministry of Research, Innovation and Digitization.

Research employing the European Plate Observing System (EPOS) Data Portal was initiated within the framework of the Integrated Thematic Services in the Field of Earth Observation - A National Platform for Innovation (SETTING) project, co-financed by the European Regional Development Fund (ERDF) through the Operational Programme Competitiveness 2014-2020 (Contract No. 336/390012), and the PN-III-P3-3.6-H2020-2020-0027 project, funded by the Romanian Ministry of Research, Innovation and Digitization, CCCDI-UEFISCDI (Contract No. 8/2021), further continued in the framework of EPOS-RO.

How to cite: Stanciu, I.-M., Popa, A., Poncoş, V., Ion, G., Lazăr, C., Stoian, A. R., and Teacă, A.: Sub-bottom controls on sulphurous seepage in the Mangalia Marine Protected Area, Western Black Sea, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-21580, https://doi.org/10.5194/egusphere-egu26-21580, 2026.

EGU26-595 | Orals | ESSI4.7

From Observation to Understanding: The Lithotectonic Framework as Foundation for Europe's Digital Geological Infrastructure 

Kris Piessens, Kristine Asch, Isabelle Bernachot, Paul Heckmann, Esther Hintersberger, Hans-Georg Krenmayr, Benjamin Le Bayon, Stefan Luth, María J. Mancebo Mancebo, Sandra Mink, Maxime Padel, Ondrej Pelech, José Rodriguez, Francisco J. Rubio Pascual, Jørgen Tulstrup, and Jan Walstra

Geological mapping stands at a methodological crossroads. While traditional chronostratigraphic and lithostratigraphic approaches are effective at documenting observable rock patterns and temporal sequences, modern geological applications increasingly demand maps that directly relate to geological processes and events. The Lithotectonic Framework (LTF), developed within the GSEU project (grant 101075609), revisits lithotectonic concepts from the 1970s with the first rigorous theoretical framework. Complementing parallel European initiatives (doi.org/10.1051/bsgf/2022017; doi.org/10.31223/X5RT28), it organizes geological knowledge based on our understanding of Earth's history, rather than on observed rock age or lithological composition alone.

The LTF's boundary-first principle defines geological units based on the events that created them, producing maps that reflect uniform geological histories. Consider the Paris Basin and the North Sea Basin, which are chronostratigraphically continuous but lithotectonically distinct: the former is linked to post-Variscan subsidence, the latter to Atlantic rifting. This event-based approach complements traditional mapping methods: chronostratigraphy provides robust temporal correlation, lithostratigraphy captures compositional variation, while the LTF reveals the tectonic and sedimentary processes that shaped Europe's geology. The framework is equally applicable to polydeformed basement and sedimentary sequences, offering a systematic treatment of overprinting relationships through a hierarchical structure.

Beyond cartographic advantages, LTF's conceptual foundation unlocks transformative digital capabilities. By describing geology conceptually rather than descriptively, its hierarchical structure translates directly into semantic knowledge systems. Unlike traditional geological databases that catalogue and describe map features, LTF knowledge bases formally encode the theoretical relationships between geological entities. This enables dynamic visualizations, such as temporal "undressing" to expose deeper or earlier geological levels, thematic extraction for applied research, and crucially, machine-assisted geological reasoning. Preliminary testing demonstrates that LTF's conceptual structure enables AI systems to reason correctly about novel geological questions, outperforming geologists unfamiliar with the framework.
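
The abstract does not specify how such a knowledge base is implemented, so the deliberately simplified Python sketch below is hypothetical in every detail; it only illustrates the general idea that units encoded by their bounding events become queryable rather than merely drawable.

    # Hypothetical sketch: units are encoded by the event that created them
    # (the boundary-first principle) and can then be queried programmatically.
    from dataclasses import dataclass
    from typing import Optional

    @dataclass(frozen=True)
    class Event:
        name: str
        age_ma: float            # illustrative onset age in Ma

    @dataclass(frozen=True)
    class LTFUnit:
        name: str
        created_by: Event        # the event defining the unit
        overprints: Optional[str] = None  # hierarchical overprinting relation

    subsidence = Event("post-Variscan subsidence", 300.0)  # illustrative age
    rifting = Event("Atlantic rifting", 200.0)             # illustrative age

    UNITS = [
        LTFUnit("Paris Basin fill", subsidence),
        LTFUnit("North Sea Basin fill", rifting),
    ]

    def units_younger_than(age_ma: float) -> list[str]:
        """Answer a question from encoded relations, not from a static map."""
        return [u.name for u in UNITS if u.created_by.age_ma < age_ma]

    print(units_younger_than(250.0))    # -> ['North Sea Basin fill']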

The paradigm shift is profound: geological mapping evolves from producing static maps with implicit knowledge to dynamic knowledge bases, where maps become interactive visualizations of deeper insights. Traditional geological mapping discovered that rocks form traceable patterns across continents, leading to the realization that geology records Earth's history. The LTF builds upon that foundation, introducing a self-organizing framework – where structure emerges from conceptual principles – that transforms geological knowledge from implicit expertise into explicit, queryable infrastructure. For Europe's geological community, this is not a replacement but an evolution: a digital geological infrastructure that preserves the strengths of traditional mapping while unlocking computational capabilities essential for modern Earth science applications.

How to cite: Piessens, K., Asch, K., Bernachot, I., Heckmann, P., Hintersberger, E., Krenmayr, H.-G., Le Bayon, B., Luth, S., Mancebo Mancebo, M. J., Mink, S., Padel, M., Pelech, O., Rodriguez, J., Rubio Pascual, F. J., Tulstrup, J., and Walstra, J.: From Observation to Understanding: The Lithotectonic Framework as Foundation for Europe's Digital Geological Infrastructure, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-595, https://doi.org/10.5194/egusphere-egu26-595, 2026.

EGU26-2291 | ECS | Orals | ESSI4.7

Global Localization of the Perseverance Rover via Orbiter-UAV-Rover Collaborative Matching 

Zilong Cao, Xiong Xu, Qipeng Chen, Changjiang Xiao, Chao Wang, Yongjiu Feng, Huan Xie, and Xiaohua Tong

The precise global localization of the Mars rover serves as a fundamental prerequisite for long‑distance scientific traverses and in‑situ geological investigation. As Mars represents a typical GNSS‑denied environment, accurate positioning is typically accomplished through the registration of rover‑acquired imagery with orbital maps. Mainstream methodologies address the substantial perspective and scale differences between ground‑level and orbital images by first generating orthophotos from rover imagery, which are then aligned with satellite‑based imagery for localization.

The successful deployment of the Mars Helicopter (Ingenuity) enables the use of acquired UAV imagery as an intermediate bridge for the rapid and accurate global localization of the Perseverance rover. Accordingly, this study proposes an orbiter-UAV-rover collaborative matching framework, as illustrated in Fig. 1. This framework sequentially performs three core steps: (1) cross-perspective matching between rover and UAV imagery, (2) cross-scale matching between UAV and orbiter imagery, and (3) a matching connection strategy that integrates the two matching sets to establish a continuous geometric transformation chain.

Figure 1. Schematic diagram of the proposed global localization framework.

Specifically, the rover-UAV image matching procedure is implemented through the following sequential steps, and the efficacy of this approach is demonstrated in Fig. 2; a minimal code sketch of steps (1) and (2) follows the list.

(1) Horizon-based Pose Estimation: The visual horizon within the rover image is segmented using a Mask R-CNN model. This horizon line is then analytically processed to derive the pitch and roll angles of the rover camera.

(2) Cross-Perspective Image Rectification and Matching: Leveraging the estimated orientation angles, the rover image is orthographically rectified to approximate a nadir view, thereby aligning its perspective with that of the UAV imagery. A deep learning-based feature matching network is subsequently applied between the rectified rover image and the UAV image to establish dense, pixel-wise correspondences.

(3) Correspondence Projection: The matched feature points from the rectified image pair are back-projected onto their original coordinates in the raw rover image.
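
As referenced above, the following NumPy/OpenCV sketch illustrates the geometry behind steps (1) and (2) under strong simplifying assumptions: a straight, distant horizon already extracted as two image points, known pinhole intrinsics, and arbitrarily chosen sign conventions. It is a sketch of the idea, not the authors' pipeline, and all numeric values are placeholders.

    # Sketch of steps (1)-(2) under strong simplifications; illustrative only.
    import numpy as np
    import cv2

    fx = fy = 1200.0                      # assumed focal lengths (pixels)
    cx, cy = 640.0, 360.0                 # assumed principal point
    K = np.array([[fx, 0.0, cx], [0.0, fy, cy], [0.0, 0.0, 1.0]])

    def pose_from_horizon(p1, p2):
        """Approximate camera roll and pitch from two horizon points."""
        (u1, v1), (u2, v2) = p1, p2
        roll = np.arctan2(v2 - v1, u2 - u1)              # horizon slope
        v_mid = v1 + (cx - u1) * (v2 - v1) / (u2 - u1)   # horizon row at centre
        pitch = np.arctan2(cy - v_mid, fy)               # offset from c_y
        return roll, pitch

    def compensate_tilt(img, roll, pitch):
        """Warp with the rotation homography H = K R K^-1 to level the view;
        a full nadir-like view would add a further rotation toward the ground."""
        Rx, _ = cv2.Rodrigues(np.array([pitch, 0.0, 0.0]))
        Rz, _ = cv2.Rodrigues(np.array([0.0, 0.0, -roll]))
        H = K @ (Rx @ Rz) @ np.linalg.inv(K)
        return cv2.warpPerspective(img, H, (img.shape[1], img.shape[0]))

    roll, pitch = pose_from_horizon((100.0, 300.0), (1180.0, 340.0))
    print(np.degrees([roll, pitch]))      # estimated camera angles in degrees

In the actual framework the horizon comes from a Mask R-CNN segmentation and the warped image is matched to UAV imagery by a learned feature matcher; the homography step above is the same rotation-compensation idea in miniature.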

Figure 2. Comparison of cross-view feature matching results before and after orthographic rectification.

Following the establishment of correspondences between rover and UAV imagery, the matching results between the UAV and orbital data are subsequently derived using our previously proposed method [1]. This process culminates in the formation of a two-tier correspondence chain, effectively linking the rover, UAV, and orbiter, as visually summarized in Fig. 3.

Figure 3. Visualization of cross-platform feature matching results.

Figure 4. Results of collaborative matching and localization.

Table 1. Localization error of the Perseverance rover for different sites.

Localization experiments were conducted at multiple sites along the Perseverance rover's traverse. As shown in Fig. 4 and Table 1, multi-platform images were well associated, achieving an average accuracy of 0.4 m (the orbital image resolution is 0.25 m). High-precision rover positioning enables the precise fusion of multi-site local geological mapping products and ensures the accurate integration of rover- and orbital-scale geological mapping products.

 

Reference:

[1] Cao, Z., Fu, H., Xu, X., et al.: A Novel Template Matching Method Incorporating a Multi-Candidate Region Optimization Strategy for the Initial Localization of Mars Helicopter, Transactions in GIS, 29(2), e70052, 2025.

 

How to cite: Cao, Z., Xu, X., Chen, Q., Xiao, C., Wang, C., Feng, Y., Xie, H., and Tong, X.: Global Localization of the Perseverance Rover via Orbiter-UAV-Rover Collaborative Matching, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-2291, https://doi.org/10.5194/egusphere-egu26-2291, 2026.

EGU26-4181 | Posters on site | ESSI4.7

Multi-Source Data Fusion and Multi-Scale Constraints for Continuous Surface-Subsurface 3D Geological Modeling 

Kuo-Jen Chang, Mei-Jen Huang, Chuan-Chi Wang, and Kaiyi Haung

The inherent uncertainty of subsurface geological conditions remains a primary challenge in underground spatial planning and rock engineering. The rationality of engineering design is fundamentally dictated by the spatial distribution and continuity of geological structures. However, in complex environments—characterized by intense tectonic fracturing or rapid lithological transitions—traditional 2D projections often fail to capture the anisotropic nature and spatial evolutionary trends of the rock mass, leading to significant interpretative gaps. Discrepancies between predicted and encountered geology frequently stem from a 2D conceptual framework that oversimplifies the 3D connectivity of fault planes, shear zones, and joint sets.

This study addresses these limitations by utilizing the Zhaishan Tunnel system in Kinmen, characterized by its granitic basement, as a research platform. By integrating UAV LiDAR, Terrestrial Laser Scanning (TLS), and SLAM technologies, we established a high-resolution 3D spatial database that bridges the gap between surface and subsurface geological data. The core research focus is the development of a workflow for continuous surface-subsurface 3D geological modeling. By incorporating surface topography, outcrop mapping, and in-situ structural measurements into a unified 3D coordinate system, the study employs multi-scale data constraints to enhance the reliability of geological interpretations. Macro-scale surface terrain data are utilized to constrain the meso-scale structural interpretations within the tunnel, ensuring that the model maintains structural consistency across different depths.

The significance of this research lies in transforming geological outputs from static, post-survey records into dynamic, 3D interpretative engines. This approach allows for the visualization of discontinuity extensions in three dimensions, providing a data-driven framework for anticipating geological hazards. Ultimately, this shift ensures that geological interpretations are no longer fragmented, providing a high-integrity information base for modern underground space development and structural stability analysis.

How to cite: Chang, K.-J., Huang, M.-J., Wang, C.-C., and Haung, K.: Multi-Source Data Fusion and Multi-Scale Constraints for Continuous Surface-Subsurface 3D Geological Modeling, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-4181, https://doi.org/10.5194/egusphere-egu26-4181, 2026.

EGU26-4801 | ECS | Orals | ESSI4.7

Building a large-scale 3D geological model of the Swiss Alps: First results 

Ferdinando Musso Piantelli, Eva Kurmann, Lukas Nibourel, Philip Wehrens, Pauline Baland, and Herwig R. Müller

From 2024 to 2030, the Swiss Geological Survey (swisstopo) leads the Swiss Alps 3D (SA3D) project as part of the swisstopo National Geological Model (NGM) program. The project brings together eight modelling and research teams from several universities with the objective of developing a coherent, large-scale 3D geological model of the Swiss Alps subsurface. The model targets the major structural and lithostratigraphic boundaries of the region and will serve as a regional geological reference framework for future high-resolution studies. It will support a wide range of applications, including infrastructure planning, groundwater management, georesource assessment, natural hazard analysis, as well as education and research.

This contribution presents results from the first two years of SA3D modelling in the Subalpine Molasse, Prealps, Helvetic, and Western Penninic tectonic domains, with emphasis on practical solutions developed to address key methodological challenges. The SA3D models are structured around four core components: (i) input datasets, (ii) 2D geological maps, (iii) reference cross-sections, and (iv) 3D meshes. Ensuring internal consistency among these elements, both at the surface and at depth, represents a primary challenge. This challenge is amplified by sparse subsurface data, limited seismic profiles and boreholes, the large extent of the study area, and the extreme structural complexity of the Alpine Orogen. These constraints limit the range of applicable modelling approaches (implicit versus explicit) and require rigorous integration of all components. Coordinating eight independent projects to produce a unified, technically and conceptually consistent model demands close collaboration and methodological harmonization across the different modelling teams.

By addressing these challenges, SA3D provides unprecedented insight into the largely unexplored Alpine subsurface. Reconstruction of the three-dimensional network of lithostratigraphic contacts and structures reveals large-scale structural and lithological patterns down to depths of 10 km, significantly improving our understanding of regional tectonic evolution. Beyond the resulting 3D model and its scientific outcomes, SA3D promotes a collaborative community of Alpine geologists and 3D geological modellers, setting the stage for continued high-level research and exploration of the Alpine subsurface.

How to cite: Musso Piantelli, F., Kurmann, E., Nibourel, L., Wehrens, P., Baland, P., and Müller, H. R.: Building a large-scale 3D geological model of the Swiss Alps: First results, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-4801, https://doi.org/10.5194/egusphere-egu26-4801, 2026.

EGU26-6514 | ECS | Orals | ESSI4.7

Unravelling the tectono-stratigraphic link of buried and exposed structural fronts of the Northern Apennines through integrated geological mapping (CARG Sheet 160 Pavia, Italy) 

Paola Bellotti, Francesca Stendardi, Daniel Barrera, Andrea Di Giulio, and Giovanni Toscani

Understanding the structural and stratigraphic connection between the exposed sectors of collisional belts and their external thrust front, buried beneath foreland basin sediments, remains a long-debated issue, largely due to the need to integrate heterogeneous surface and subsurface datasets. This research focuses on the Northern Apennines front, at the transition between the exposed Oltrepo Pavese hillslopes and the buried thrust front beneath the Po Plain, investigated within the Italian Geological Mapping Project (CARG, Pavia Sheet 160). Detailed field mapping resulted in an original 1:10,000-scale geological map of the exposed belt, supported by petrographic and biostratigraphic analyses. These surface data were integrated with seismic profiles and deep well data from a regional 3D model of the central Po Plain, which reconstructs the geometry of the buried thrust system displacing the Plio-Pleistocene sequence in the Po Plain, in order to link the exposed and buried stratigraphic units. The integration of surface and subsurface data allows the recognition of structures otherwise hidden beneath vegetation and Quaternary colluvial cover on the hillslopes. In particular, seismic interpretation allows buried structures to be localized and constrains their geometric relationships with respect to the attitude of the outcropping units.

The stratigraphic record in the exposed area includes the Paleocene to Lower Eocene turbiditic succession of the Val Luretta Formation, which is thrust over a Tortonian to Piacenzian sequence. The latter records a progressive shallowing-upward trend, from a deep-sea setting to shallow-marine, deltaic and then continental environments associated with the Messinian Salinity Crisis, followed by the Pliocene marine transgression.

Interpretation of the subsurface dataset allows the recognition of a south-dipping, north-verging thrust system affecting both the exposed and the buried units, with multiple splays and blind thrusts active until the Lower Pleistocene.

These results provide new constraints on the geometry and evolution of this sector of the Northern Apennines front, demonstrating the effectiveness of combining field-based geological data with subsurface data to link the outcropping and buried portions of orogenic belts.

How to cite: Bellotti, P., Stendardi, F., Barrera, D., Di Giulio, A., and Toscani, G.: Unravelling the tectono-stratigraphic link of buried and exposed structural fronts of the Northern Apennines through integrated geological mapping (CARG Sheet 160 Pavia, Italy), EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-6514, https://doi.org/10.5194/egusphere-egu26-6514, 2026.

EGU26-6560 | Posters on site | ESSI4.7

Optimization of Modeling Accuracy for Mobile Mapping Systems in Large-Scale Environments 

Kai-Yi Huang, Chuan-Chi Wang, Jung Chiang, and Kuo-Jen Chang

Geospatial data acquisition technology has been widely integrated into geological and engineering geology research, significantly enhancing the spatial precision and structural integrity of topographical interpretations. In the era of high-performance computing, 3D geological modeling has emerged as a pivotal trend for engineering applications. However, the practical depth of these models is often constrained by challenges in accuracy and reliability, arising from varying data resolutions and the complexities of integrating multi-source information. These issues complicate model validation, particularly in large-scale or high-complexity engineering environments.

As data collection methods become increasingly diverse, Simultaneous Localization and Mapping (SLAM) technology has revolutionized traditional surveying by offering superior operational flexibility and mobility. Unlike static terrestrial laser scanning (TLS), handheld or mobile LiDAR systems (MMS) can efficiently traverse indoor spaces, narrow urban corridors, and densely vegetated areas, facilitating the construction of comprehensive, blind-spot-free 3D spatial datasets. Despite these advantages, achieving and maintaining engineering-grade precision in GNSS-denied or signal-unstable environments remains a critical technical bottleneck.

This study aims to investigate a robust workflow for large-scale field model construction using a "batch processing and stitching fusion" strategy. Using the National Taipei University of Technology (NTUT) campus as an experimental field, high-density point cloud data were collected using the mobile mapping system. The research methodology focuses on optimizing geometric fidelity by rigorously analyzing two key variables: first, a comparative evaluation of trajectory adjustment modes, specifically contrasting loop-closure correction with Post-Processed Kinematic (PPK) technology; and second, an assessment of how the quantity and spatial distribution of Ground Control Points (GCPs) influence the model's global stability and absolute accuracy.

The experimental results demonstrate that through optimized GCP deployment and refined trajectory adjustment, the absolute accuracy of the point cloud model can be maintained within an RMSE of 5 cm, with the relative accuracy on ground surfaces controlled within 2 cm. Furthermore, in the measurement of high-rise structures, the ghosting effect (layering) is restricted to within 4 cm at a 30-meter operational radius, while an average point spacing of 4 cm is maintained to ensure the geometric integrity of model details. These findings confirm that mobile LiDAR systems, when supported by optimized workflows, can meet the stringent precision requirements of engineering-grade projects while retaining high flexibility.
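
As a concrete illustration of how such accuracy figures can be computed, the short Python sketch below is an assumed workflow, not the authors' processing chain, and the data are random stand-ins: it evaluates absolute RMSE at ground control points and the relative accuracy of a planar ground patch.

    # Assumed accuracy-check workflow with random stand-in data: absolute RMSE
    # at ground control points (GCPs) and planar residuals on a ground patch.
    import numpy as np
    from scipy.spatial import cKDTree

    rng = np.random.default_rng(0)
    cloud = rng.random((100_000, 3)) * 100.0   # stand-in SLAM point cloud (m)
    gcps = rng.random((12, 3)) * 100.0         # surveyed control points (m)

    # Absolute accuracy: distance from each GCP to its nearest cloud point
    d_abs, _ = cKDTree(cloud).query(gcps)
    rmse_abs = np.sqrt(np.mean(d_abs ** 2))

    # Relative accuracy: RMS of residuals about a best-fit plane on flat ground
    patch = cloud[cloud[:, 2] < 1.0]
    centred = patch - patch.mean(axis=0)
    normal = np.linalg.svd(centred, full_matrices=False)[2][-1]  # plane normal
    rmse_rel = np.sqrt(np.mean((centred @ normal) ** 2))

    print(f"GCP RMSE: {rmse_abs:.3f} m, planar RMSE: {rmse_rel:.3f} m")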

In conclusion, this research establishes a high-precision 3D digital foundation for the campus. This methodology is highly extensible to geological fields, including outcrop geometric measurement, quantitative analysis of landslide volumes, and structural surveys in GNSS-denied environments such as tunnels and caves.

How to cite: Huang, K.-Y., Wang, C.-C., Chiang, J., and Chang, K.-J.: Optimization of Modeling Accuracy for Mobile Mapping Systems in Large-Scale Environments, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-6560, https://doi.org/10.5194/egusphere-egu26-6560, 2026.

EGU26-6674 | Posters on site | ESSI4.7

Precision and Accuracy Evaluation of 3D Modeling in Indoor Confined Environments: Integrating Mobile Mapping System and BIM 

Chuan-Chi Wang, Kai-Yi Huang, Jung Chiang, and Kuo-Jen Chang

With the increasing demand for 3D spatial data in engineering and geological applications, constructing practical 3D models efficiently and effectively has become a critical challenge in geology, underground engineering, and architectural documentation. In recent years, Simultaneous Localization and Mapping (SLAM) technology has been widely adopted in complex environments to collect high-density point clouds with high efficiency. However, the reliability and applicability of SLAM-derived results in geological and engineering contexts still require verification through practical case studies. This research utilizes the Civil Engineering Building at National Taipei University of Technology as the primary experimental site. A mobile SLAM system was employed to collect 3D point cloud data, which was subsequently integrated into the Building Information Modeling (BIM) framework—a standard in Taiwan's engineering industry—to assist in model construction and application. Furthermore, the study extends to several representative engineering and geological sites, including the Zhaishan Tunnel in Kinmen, the Kinmen Ceramics Factory, and the coastal rock outcrops at Qixingtan in Hualien, to explore the feasibility of SLAM-based 3D modeling under diverse environmental conditions.

Regarding engineering applications, this study compares different positioning modes, including pure SLAM, SLAM combined with PPK, and SLAM integrated with RTK. Both absolute and relative accuracy at the architectural scale were analyzed using control points. Additionally, the impact of control point distribution on the geometric consistency of the models was investigated. These findings serve as a technical reference for selecting SLAM positioning strategies and operational workflows in engineering practice.

In terms of geological and underground engineering applications, the research focuses on using SLAM point clouds for the 3D reconstruction and visualization of tunnel morphology, rock-wall geometric features, and coastal outcrops. The results demonstrate the potential of this technology in tunnel geological recording, engineering planning, and outcrop preservation, providing a foundation for geological modeling in analytical tasks.

In conclusion, this study proposes a practice-oriented workflow that integrates SLAM point clouds with BIM. By balancing engineering precision analysis with geological modeling applications, this research provides a high-efficiency 3D modeling solution with significant practical value for the architectural, tunneling, and geological sectors.

How to cite: Wang, C.-C., Huang, K.-Y., Chiang, J., and Chang, K.-J.: Precision and Accuracy Evaluation of 3D Modeling in Indoor Confined Environments: Integrating Mobile Mapping System and BIM, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-6674, https://doi.org/10.5194/egusphere-egu26-6674, 2026.

EGU26-6989 | Orals | ESSI4.7

In search of lithological truth – sceptical non-geologists in the non-English speaking world 

Urszula Stępień, Daniel Zaszewski, Aleksandra Fronczak, and Wiktor Witkowski

The main objective of the study was to check how popular, free versions of AI chatbots cope with questions related to lithology. The assumption of the study was that a potential user is not a geologist, does not know how to formulate prompts correctly, and is sceptical enough about new technologies to avoid logging in. Lithological issues may occur, for example, in descriptions of educational paths. The entire study was conducted in Polish. To shorten the study time, all prompts were formulated in advance and presented in a fixed order. The aim was, among other things, to see how the answers would differ depending on how precisely the question was formulated. In addition, the prompts were deliberately designed not to comply with the rules for asking questions, as we assumed that potential users would lack such knowledge. We asked people with geological knowledge to participate in the study so that they could assess the substantive value of the results they received.

The rapid expansion of large language models (LLMs) into scientific workflows raises important questions concerning their reliability, transparency, and suitability for specialised disciplines such as the geosciences. This contribution presents the results of a survey-based assessment of selected AI-powered tools conducted in Polish between February and May 2025. The study involved 202 respondents, including professional geologists, academic staff, and students of geosciences, who evaluated AI-generated responses to seven tasks of varying complexity.

The study confirmed that the precise formulation of queries, especially those specifying source requirements and an expert-level perspective, substantially improves the quality of AI-generated content. This effect was particularly evident in questions involving linguistically ambiguous terms, where models often addressed only one interpretation while omitting alternative meanings relevant to geological sciences. Such omissions may result in incomplete or misleading answers when the user lacks sufficient domain knowledge to identify inaccuracies.

The opinions expressed in the Polish-language survey present an ambivalent picture. While the functional benefits and efficiency gains offered by AI tools are widely recognised, substantial methodological, substantive, and ethical limitations remain. The competence and awareness of the user were identified as pivotal factors in determining whether the adoption of AI creates genuine value or disseminates errors and misinformation. The study emphasises the need for better citation practices, the prioritisation of peer-reviewed literature, an increase in the number of high-quality non-English open geological publications, improved semantic understanding of specialised terminology, and the development of regionally adapted language models. These measures are considered essential for ensuring the transparent, reliable, and responsible use of AI in geoscientific research and communication.

How to cite: Stępień, U., Zaszewski, D., Fronczak, A., and Witkowski, W.: In search of lithological truth – sceptical non-geologists in the non-English speaking world, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-6989, https://doi.org/10.5194/egusphere-egu26-6989, 2026.

EGU26-7419 | Posters on site | ESSI4.7

Strong or weak? What controls magnetic anomalies in the Admiralty igneous complex, northern Victoria Land, Antarctica 

Antonia Ruppel, Barbara Ratschbacher, Nikola Koglin, and Andreas Läufer

The Devonian-Carboniferous Admiralty igneous complex (i.e. Admiralty plutonites and Gallipoli volcanics) of northern Victoria Land, Antarctica, forms part of a widespread magmatic system comprising felsic volcanic, subvolcanic and plutonic lithologies. Due to extensive snow and ice coverage, aeromagnetic data has been used to interpret the extent of igneous bodies where surface exposure of igneous rocks is limited. However, some exposures generate strong positive magnetic anomalies, while others produce weak or negligible responses, raising questions about the factors controlling magnetic susceptibility and interpretation of aeromagnetic data where exposure is absent.

We focus on several key locations with exposed Admiralty igneous rocks showing strong positive anomalies (Everett, Salamander and southern Alamein ranges, Mariner Plateau), negligible anomalies (Tucker Glacier region), and a combination of weak and strong anomalies (Yule Bay) to explore how variations in rock properties and geochemical composition relate to observed magnetic anomalies.

Combining aeromagnetic surveys and in-situ susceptibility measurements with detailed petrology, modal mineralogy, whole-rock geochemistry (major, minor, and trace elements) and ongoing age dating allows a better understanding of the causes of low versus high magnetic anomalies in rocks previously ascribed to a single magmatic event. In particular we are testing whether (a) multiple, compositionally distinct magmatic pulses, (b) variable degrees of alteration, and/or (c) different levels of exposure can account for the observed discrepancies in magnetic anomalies.

Magnetic and susceptibility data, when combined with petrological and geochemical analyses, provide a powerful tool to investigate the origin of variations in magnetic susceptibility, particularly in regions with limited outcrops.

How to cite: Ruppel, A., Ratschbacher, B., Koglin, N., and Läufer, A.: Strong or weak? What controls magnetic anomalies in the Admiralty igneous complex, northern Victoria Land, Antarctica, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-7419, https://doi.org/10.5194/egusphere-egu26-7419, 2026.

EGU26-7588 | Orals | ESSI4.7

Differentiable Geomodelling: Towards Geomodel Insight and Task-Oriented Sensitivity Analysis 

Florian Wellmann and Miguel de la Varga

Structural geological models are widely used for the prediction of geological structures and properties in science and engineering tasks. These predictions are often related to specific questions, for example the reservoir depth at a target location, unit thickness along a planned well trajectory, or distance-to-fault for safe subsurface storage. However, understanding which input parameters most strongly influence these task-specific quantities of interest (QoIs) remains challenging, particularly when models involve hundreds to thousands of input parameters.

In this contribution, we evaluate how automatic differentiation techniques, implemented in modern machine-learning frameworks, can help.
While automatic differentiation and adjoint methods have become established tools in geophysical inversion and reservoir simulation, their systematic application to structural geological modeling with sensitivities to geometric features such as depth, thickness, or distance-to-fault remains limited. In this work, we introduce "differentiable geomodelling" as a practical pathway to task-oriented sensitivity analysis. Building on implicit structural modelling concepts and the open-source geomodelling library GemPy, we formulate QoIs that remain differentiable with respect to geological inputs and compute local sensitivities via automatic differentiation using modern machine-learning frameworks (PyTorch).
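
As a minimal sketch of the pattern (not GemPy's actual co-kriging interpolator; the Gaussian radial-basis interpolation, point values, and locations below are illustrative stand-ins), a horizon-depth QoI can be made differentiable in PyTorch, and its sensitivities to every input are obtained in a single reverse-mode pass:

    # Toy stand-in (not GemPy's co-kriging): a horizon interpolated from control
    # points; the QoI is depth at a target location, and reverse-mode autograd
    # returns its sensitivity to every input depth in one backward pass.
    import torch

    xy = torch.tensor([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
    z = torch.tensor([-500.0, -520.0, -480.0, -510.0], requires_grad=True)

    def horizon_depth(q, eps=1.0):
        """Differentiable Gaussian-RBF interpolation of depth at query q."""
        w = torch.exp(-((xy - q) ** 2).sum(dim=1) / eps ** 2)
        return (w * z).sum() / w.sum()

    target = torch.tensor([0.3, 0.6])
    qoi = horizon_depth(target)     # task-specific quantity of interest
    qoi.backward()                  # automatic differentiation

    print(qoi.item())               # predicted horizon depth at the target
    print(z.grad)                   # d(QoI) / d(each control-point depth)

The same pattern scales to hundreds or thousands of inputs, since one backward pass yields the full gradient of a scalar QoI regardless of the number of parameters, which is precisely what makes task-oriented screening cheap.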

The approach is tested in simplified settings and in a realistic scenario with tens of input points and orientation measurements. The results show that, rather than replacing global sensitivity analysis or uncertainty quantification, the proposed approach complements existing methods by providing an efficient screening and structuring tool for additional insight.

How to cite: Wellmann, F. and de la Varga, M.: Differentiable Geomodelling: Towards Geomodel Insight and Task-Oriented Sensitivity Analysis, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-7588, https://doi.org/10.5194/egusphere-egu26-7588, 2026.

EGU26-7883 | ECS | Posters on site | ESSI4.7

Modern applications for basin-wide revision mapping in the Old Red Sandstone, Scotland 

Theodore Reeves, Katie Whitbread, Timothy Kearsey, Tara Stephens, Sarah Arkley, Holly Unwin, Ben Murphy, Eileen Callaghan, and Torin Hughes

The Strathmore Basin is an extensive Silurian-Devonian basin which spans the entire width of Scotland. This basin has had a long and complex tectonic history, including periods of significant volcanic activity, faulting, basin folding, and several movements along the basin-bounding Highland Boundary Fault. Today, the basin is largely covered by substantial glacial deposits; bedrock exposure is limited.

Some areas of the basin were last mapped in the 1880s (i.e., before aerial photography, and nearly a century before the theory of plate tectonics). Progressive mapping of adjacent map sheets up to the 1970s has led to mismatches at sheet boundaries, significant inconsistencies in structural interpretation, and irregularities in stratigraphic relationships. Addressing these legacy issues in geological maps is critical for ensuring suitability for 21st-century applications; these data are used to inform the management of the regional aquifer within the Devonian sandstones, and for the evaluation of potential geothermal energy resources.

A novel basin-wide approach has been taken to revise the geological mapping to improve map quality and consistency across the Strathmore Basin. This has involved a range of techniques, including digital terrain analysis, targeted field visits, the integration of published geochronological data, and the compilation of basin-wide datasets of over 4,000 structural measurements and more than 20,000 observation points from multiple BGS data sources. This approach has allowed for a new large-scale structural interpretation of the fold and fault systems, particularly related to the Highland Boundary Fault, as well as a new understanding of key stratigraphic markers and a more coherent representation of the geology across the basin. This approach highlights the value of using both modern and historic datasets, and crucially, revisiting targeted outcrops in the field.

As traditional survey styles become less affordable, and the need for seamless maps more acute, regional approaches provide an important methodology, helping to maximise the value of existing data and targeting areas for new data collection. Understanding these strengths and limitations is essential for the future of resurvey, especially in countries such as the UK with a long surveying history and high demand for accurate and consistent geological information to manage energy, water, and mineral resources.

How to cite: Reeves, T., Whitbread, K., Kearsey, T., Stephens, T., Arkley, S., Unwin, H., Murphy, B., Callaghan, E., and Hughes, T.: Modern applications for basin-wide revision mapping in the Old Red Sandstone, Scotland, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-7883, https://doi.org/10.5194/egusphere-egu26-7883, 2026.

EGU26-8046 | Posters on site | ESSI4.7

3D geological modelling at the Swiss Geological Survey: Development of national-scale models 

Eva Kurmann-Matzenauer, Philip Wehrens, Ferdinando Musso Piantelli, Salomè Signer, Anina Ursprung, and Lance Reynolds

Within the framework of the National Geological Model (NGM), a long-term federal program (2022–2030), the Swiss Geological Survey (swisstopo) is developing a series of three-dimensional geological models at the national scale. The primary objective is to achieve full spatial coverage of Switzerland with harmonized 2D and 3D geological models representing the geometry of major tectonic structures, lithostratigraphic units, and the bedrock surface. These models form a consistent geological framework that supports sustainable subsurface use and long-term spatial planning.

Switzerland comprises three principal geological domains with contrasting structural styles and stratigraphic architectures: the Jura fold-and-thrust belt, the Foreland Plateau, and the Alpine orogenic domain. These domains differ significantly in terms of deformation mechanisms, lithological complexity, data availability, data type and depth of geological investigation. This requires domain-specific modelling strategies and tailored approaches to uncertainty management. In addition, subsurface utilization and associated societal demands, such as infrastructure development, groundwater management and hazard assessment, vary markedly between regions.

The 3D modelling group at swisstopo has implemented a domain-based modelling strategy by subdividing Switzerland into three regional modelling areas corresponding to the main geological domains. For each domain, regional-scale 3D geological models are constructed through the integrated interpretation of surface geological maps, borehole and geophysical data, cross-sections and geological concepts and constraints. These models provide a consistent structural and stratigraphic framework that translates traditional geological mapping into digital, reproducible subsurface representations suitable for national-scale applications.

This contribution presents an overview of the current status of four complementary modelling projects developed by the 3D Group at the Swiss Geological Survey: swissBEDROCK, Jura3D, GeoMol, and swissAlps 3D.

swissBEDROCK provides a nationwide 3D bedrock model of Switzerland based on an automated and reproducible workflow with explicit uncertainty representation and regular versioned updates. Jura3D focuses on high-resolution structural and stratigraphic modelling of the folded and thrust-faulted sedimentary sequences of the Jura fold-and-thrust belt. GeoMol addresses the Foreland Plateau at regional scale, emphasizing stratigraphic architecture and basin geometry. swissAlps 3D targets the structurally complex Alps, with a strong emphasis on the tectonic development of the main lithostratigraphic and structural units supported by scientific argumentation. This contribution further highlights the importance of collaborative workflows involving federal and cantonal authorities, academia, and private partners in the development of consistent national 3D geological models.

These projects together illustrate how diverse geological modelling approaches are integrated within a coherent national framework. Moreover, they bring together geological knowledge and 3D modelling workflows across contrasting geological domains.

How to cite: Kurmann-Matzenauer, E., Wehrens, P., Musso Piantelli, F., Signer, S., Ursprung, A., and Reynolds, L.: 3D geological modelling at the Swiss Geological Survey: Development of national-scale models, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-8046, https://doi.org/10.5194/egusphere-egu26-8046, 2026.

EGU26-8372 | ECS | Posters on site | ESSI4.7

From Deterministic to Probabilistic: Quantifying Layer Boundary Uncertainty in Hydrostratigraphic Models 

Rasmus Bødker Madsen, Ingelise Møller, Frederik Falk, Lars Troldborg, and Anne-Sophie Høyer

Hydrostratigraphic models are commonly used as structural frameworks for groundwater and subsurface studies. Traditionally, these models are treated as deterministic representations, providing a single “best estimate” of subsurface structure. While practical, this approach conceals the inherent uncertainty in geological interpretation, particularly in the spatial placement of layer boundaries, and limits the transparency and robustness of subsequent modelling workflows. Recognising and quantifying this uncertainty is a necessary step towards more probabilistic approaches to hydrostratigraphic modelling.

This contribution presents GDM (geology-driven modelling), a method for explicitly quantifying interpretation uncertainty in the placement of hydrostratigraphic layer boundaries through ensembles of 3D subsurface realisations. GDM operates on existing hydrostratigraphic models, assuming a fixed framework in terms of layer definition and conceptual interpretation, while focusing on the spatial variability of layer interfaces. The method is computationally efficient, enabling application at regional or national scales. Its national-scale implementation allows interpretation uncertainties to be assessed across entire hydrostratigraphic frameworks, providing a consistent basis for revisiting legacy models.
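
The abstract does not detail GDM's sampling scheme; purely to illustrate the generic idea of an ensemble of equally plausible boundary geometries, the following NumPy sketch perturbs an interpreted layer interface with spatially correlated Gaussian noise (the amplitude and correlation length are assumed placeholders, not GDM parameters).

    # Generic illustration only (not the GDM algorithm): an ensemble of layer-
    # boundary realisations from spatially correlated Gaussian perturbations.
    import numpy as np

    x = np.linspace(0.0, 10_000.0, 200)        # profile distance (m)
    base = -30.0 - 10.0 * np.sin(x / 2_000.0)  # interpreted boundary depth (m)
    sigma = 3.0                                # assumed uncertainty amplitude (m)
    corr_len = 1_000.0                         # assumed correlation length (m)

    # Gaussian-correlation covariance, factorised once for fast sampling
    d = np.abs(x[:, None] - x[None, :])
    cov = sigma ** 2 * np.exp(-((d / corr_len) ** 2)) + 1e-8 * np.eye(x.size)
    L = np.linalg.cholesky(cov)

    rng = np.random.default_rng(42)
    ensemble = base + (L @ rng.standard_normal((x.size, 50))).T

    print(ensemble.shape)   # (50, 200): equally plausible boundary geometries

In a real application the perturbation amplitude would shrink toward boreholes and other hard data, so that realisations honour observations while spreading where the interpretation is weakly constrained.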

As an illustration, we demonstrate how GDM was used to quantify interpretation uncertainties in the national-scale hydrostratigraphic model of Denmark and how the resulting ensemble of subsurface realisations was incorporated into the hydrological modelling workflow. The ensemble describes the range of equally plausible geometries supported by the available data and assumptions, providing a structured way to explore how interpretation uncertainty propagates through geological models.

This example serves as a starting point for reflecting on broader implications. In particular, it illustrates how approaches that explicitly quantify interpretation uncertainty can help bridge the gap between established deterministic models and future strategies that increasingly embrace probabilistic representations. At the same time, these approaches introduce new considerations for both modellers and end-users of geological models.

How to cite: Madsen, R. B., Møller, I., Falk, F., Troldborg, L., and Høyer, A.-S.: From Deterministic to Probabilistic: Quantifying Layer Boundary Uncertainty in Hydrostratigraphic Models, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-8372, https://doi.org/10.5194/egusphere-egu26-8372, 2026.

EGU26-8376 | Orals | ESSI4.7

3D geological and geotechnical subsurface model for the Einstein Telescope study area in Sardinia (Italy) 

Lorenzo Lipparini, Matteo Cagnizi, Flavia Ferranti, Peppe Junior Valentino D'Aranno, Giuseppe Sappa, Wissam Wahbeh, Quintilio Napoleoni, and Maria Marsella

The Einstein Telescope (ET) research infrastructure is envisioned as Europe’s pioneering next-generation underground observatory for gravitational-wave detection.

Its engineering design requires a multi-criteria approach capable of identifying and addressing geological, geotechnical, environmental, and landscape challenges. To manage these complexities, the ET-3G Lab at Sapienza University of Rome (as part of the ETIC PNRR project) has produced an advanced digital multi-scale 3D model for the Sardinia site identified as a potential location.

The model integrates surface and subsurface data at both regional and local scales, consolidating all available geological, geophysical, and geotechnical datasets to support a coherent reconstruction of key subsurface features, including lithotypes, faults, and fracture networks. It incorporates data from surface observations and drilled calibration wells, encompassing geological and petrophysical information, laboratory tests on undisturbed samples, fracture analyses, and geophysical investigations conducted by the ET scientific community. This integrated representation strengthens the linkage between surface and subsurface information.

As a result, a comprehensive 3D geological model of the ET Sardinia site has been developed, enabling visualization of the subsurface down to a depth of approximately one kilometer.

This advanced modeling approach is intended to support the minimization of geotechnical risks, the optimization of construction strategies and associated costs, and the implementation of scenario-based design analyses.

How to cite: Lipparini, L., Cagnizi, M., Ferranti, F., D'Aranno, P. J. V., Sappa, G., Wahbeh, W., Napoleoni, Q., and Marsella, M.: 3D geological and geotechnical subsurface model for the Einstein Telescope study area in Sardinia (Italy), EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-8376, https://doi.org/10.5194/egusphere-egu26-8376, 2026.

EGU26-8462 | ESSI4.7

Three-Dimensional Geologic Modelling Beyond Earth: Challenges and Perspectives

Alessandro Frigeri

Three-dimensional geologic modeling is a well-established technique developed over the last twenty years and currently applied in terrestrial mining, environmental management, and hydrogeology [1,2]. It also represents a critical frontier for planetary exploration, from fundamental scientific research and the search for subsurface life to operational applications such as mission planning or In-Situ Resource Utilization (ISRU). As missions increasingly target the shallow subsurface of the Moon and Mars, reconstructing subsurface architectures from available observations has become essential.

The primary challenge in planetary subsurface modeling lies in the extreme scarcity of direct subsurface data compared to the abundance of orbital remote sensing observations. Consequently, geologic mapping becomes the foundational prerequisite, providing the primary spatial and qualitative data needed to interpolate and propagate geologic contacts through three-dimensional volumes.

This work explores modeling approaches through experiments designed to test their applicability to planetary science. These include a volumetric model of Tempe Terra on Mars based solely on geological map information, and a benchmark study of a terrestrial impact crater using sparse drilling data to define the contact between bedrock and impact ejecta. Key findings relate to uncertainty evaluation and the importance of defining modeling objectives, which directly affect model complexity.

This research emphasizes avoiding "black box" solutions by adopting Free and Open Source Software workflows to ensure interoperability, traceability, and reproducibility—critical requirements in the demanding operational context of space exploration. Current results and modeling environments are promising for extraterrestrial applications. By integrating scientific reasoning with advanced interpolation algorithms, three-dimensional geologic modeling can generate robust predictive models essential for planning future robotic and human exploration missions.

References: [1] P. Calcagno et al., Physics of the Earth and Planetary Interiors 171(1–4), pp. 147–157 (2008). [2] F. Wellmann and G. Caumon, Advances in Geophysics, Vol. 59, Elsevier, pp. 1–121 (2018).

Acknowledgements: This study is carried out within the Space It Up project funded by the Italian Space Agency, ASI, and the Ministry of University and Research, MUR, under contract n. 2024-5-E.0 - CUP n. I53D24000060005.

How to cite: Frigeri, A.: Three-Dimensional Geologic Modelling Beyond Earth: Challenges and Perspectives, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-8462, https://doi.org/10.5194/egusphere-egu26-8462, 2026.

EGU26-8882 | Orals | ESSI4.7

Geological Maps and Data Gaps Assessment: The Key factors for a Solid Geological Background 

Urszula Stępień, Hans-Georg Krenmayr, Kristine Asch, Paul Heckmann, Kris Piessens, Dana Capova, Pavla Kramolisova, and Maria Mancebo

In 2007, the INSPIRE Directive became a catalyst for re-examining fundamental geological data represented on maps. A major milestone was the OneGeology-Europe project (2008–2010), which for the first time approached 1:1,000,000-scale geological maps as structured datasets. With the involvement of nearly all EuroGeoSurveys member surveys, GeoSciML, the OGC standard for geological data exchange, was adopted and tested, providing feedback that helped consolidate the standard. In parallel, datasets were documented using metadata compliant with ISO 19115 and the INSPIRE metadata profile.
These initiatives encouraged geological surveys to intensify efforts towards the development of geological vector maps at larger scales. However, such work is highly time-consuming and labour-intensive, and despite significant progress over the years, substantial challenges and data gaps remain. To address them effectively, gaps need to be identified and assessed to provide a clear basis for coordinated action. Advances in geoscientific knowledge frequently require renewed field investigations and the revision of existing maps and datasets to improve their accuracy and quality.
The GSEU project aims to identify gaps not only in terms of missing data, but also with respect to completeness and consistency, the nature of attributes describing geological units, as well as issues of semantic and geometric harmonisation across map series. Such harmonisation challenges often reflect the evolution of scientific knowledge, classification schemes and mapping best practice over time.
A robust foundation, provided by fundamental geological maps ranging from continental-scale overview products such as IGME5000 to highly detailed maps depicting local geological structures, is essential for guiding future research and development. Geological maps form the foundation for a wide range of applied and scientific activities, including mineral resource exploration, geo-energy assessments, groundwater modelling, geoengineering, vulnerability assessments, spatial planning, and subsurface management.
This contribution presents initial assessments of the current state of geological data coverage across Europe and highlights the importance of comprehensive, harmonised and well-structured  geological map databases for emerging applications, including artificial intelligence (AI) and large language models (LLM).
The GSEU project will also provide an organisational, technical and semantic framework for the digitisation, harmonisation and presentation of datasets describing Europe’s fundamental geology at multiple scales.

How to cite: Stępień, U., Krenmayr, H.-G., Asch, K., Heckmann, P., Piessens, K., Capova, D., Kramolisova, P., and Mancebo, M.: Geological Maps and Data Gaps Assessment: The Key factors for a Solid Geological Background, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-8882, https://doi.org/10.5194/egusphere-egu26-8882, 2026.

Exploration datasets such as borehole logs and geophysical profiles form the fundamental basis of geological modeling. Among these, borehole records are particularly influential, as they typically include detailed descriptions and interpretations of petrography and stratigraphy. Such information is essential for constructing three-dimensional representations of lithostratigraphic units, which can be compromised by inconsistencies or errors in the underlying borehole interpretations. Distinguishing reliable borehole data from problematic records is therefore critical, but becomes increasingly challenging when dealing with large datasets. Although visual assessment of the resulting geological models can help identify questionable boreholes, this approach typically requires many iterative modeling steps, making the process inefficient and costly.
To improve the efficiency of borehole data quality assessment, we developed B-QualMT, a Python-based borehole quality management tool with a graphical user interface that enables automated filtering of borehole records using both user-defined quality checks and a purely data-driven approach. The software applies a suite of deterministic tests that incorporate auxiliary information such as existing 3D geological models and regional geological knowledge, including expected stratigraphic successions, to identify anomalous borehole logs within geologically similar areas. Furthermore, spatial outliers can be identified using a combination of borehole similarity analysis, various clustering techniques, and a Bayesian novelty detection system. To evaluate the functionalities and edge cases of these methods, both synthetic and real borehole data were used. Different test scenarios were utilized to systematically control and test the outlier detection approaches, enabling workflow optimization and a detailed assessment of their performance, limitations, and sensitivity under controlled synthetic conditions. The limitations identified during testing with synthetic data are subsequently used to inform and improve the interpretation of results derived from more complex real borehole logs.
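As a minimal illustration of the deterministic, knowledge-based tests described above (a Python sketch with assumed unit names, not the B-QualMT implementation), an expected-succession check could look like this:

# Expected regional succession, youngest to oldest (assumed example units).
EXPECTED_ORDER = ["Quaternary", "Neogene", "Paleogene", "Cretaceous"]
RANK = {unit: i for i, unit in enumerate(EXPECTED_ORDER)}

def violates_succession(log):
    """Flag a borehole whose depth-ordered units break the expected order."""
    ranks = [RANK[u] for u in log if u in RANK]
    return any(lower < upper for upper, lower in zip(ranks, ranks[1:]))

boreholes = {
    "BH-001": ["Quaternary", "Neogene", "Cretaceous"],   # consistent
    "BH-002": ["Neogene", "Quaternary", "Paleogene"],    # Quaternary below Neogene
}
flagged = [bh for bh, log in boreholes.items() if violates_succession(log)]
print(flagged)  # ['BH-002']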

How to cite: Schönfeldt, E., Hiller, T., and Giese, J.: How to find the baddies - a borehole quality management and outlier detection software for 3D-model data selection, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-9273, https://doi.org/10.5194/egusphere-egu26-9273, 2026.

EGU26-10293 | Posters on site | ESSI4.7

Geological maps of the Future: Leveraging on the methodology of the 1:5M Map to construct a 1:1 Geologic Map of the World 

Manuel Pubellier, Harvey Thorleifson, Yang Song, Benjamin Sautter, James Ogg, Francois Robida, Matthew Harisson, Pierre Nehlig, and Jorge Gomez Tapias

The efficiency of having a simple scheme for creating small-scale international geological maps and offering them in a simple, usable and standardised format has been showcased by the international collaboration of the Commission for the Geological Map of the World (CGMW), the Deep Time Digital Earth (DDE) and CAGS (Chinese Academy of Geological Sciences) programmes, and by some Geological Surveys. The success of the World 1:5M map pilot project and its follow-up toward multi-layer products has given us the confidence to pursue a unified World Geological Map at the scale of 1:1M, a dream initially envisaged by the OneGeology project.

The first phase achieved a spectacular milestone: the global 1:5M map, the largest seamless digital geological map ever compiled. A following phase of this programme is to create the first “basement map” of the world, by simply removing the youngest sediments from sedimentary basins and continental shelves.

While layering techniques such as basement mapping are accelerating, a vivid new vision is to compile a rigorous 1:1M global bedrock geology under protocols for sharing and regularly updating databases from willing Surveys. Compiling data into a harmonized Geological Map of the World at 1:1M scale is now the new ambitious objective of CGMW. The endeavour poses scientific, technical and geopolitical challenges, and will require the participation and efforts of partners from as many countries as possible, who must be willing to openly share information, as well as the active involvement of experts. Building on the robust methodology used for the 1:5M, we are exploring options to foster the harmonization, including using AI tools.

However, not all the national source maps are available in digital format and in English, use the same coordinate system, or rest on comprehensive databases. Therefore, we anticipate the necessity to digitize or vectorize some geological data and to arrange a standardized database for all the maps. In some cases, contrasts of resolution at map boundaries will require additional work. Another time-consuming task will be the cross-border correlation of geological structures and units by applying high-quality digital terrain models (DTMs), multi-spectral satellite data, or larger scale regional maps. Finally, the validation of the data by experts and Geological Surveys will be necessary. This initial digital mapping will be completed in 2D as a first step toward a future 3D geological map and a powerful Digital Twin. The multi-layer 3D version will be developed in the long term as data availability, priority, and partnerships allow. Our EGU2026 poster and associated discussions are an ideal opportunity to present the 1:1M project and to foster collaborations, for example with the CGI and OneGeology.

How to cite: Pubellier, M., Thorleifson, H., Song, Y., Sautter, B., Ogg, J., Robida, F., Harisson, M., Nehlig, P., and Gomez Tapias, J.: Geological maps of the Future: Leveraging on the methodology of the 1:5M Map to construct a 1:1 Geologic Map of the World, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-10293, https://doi.org/10.5194/egusphere-egu26-10293, 2026.

EGU26-10753 | ECS | Posters on site | ESSI4.7

EMODnet Geology continues to advance marine geological data for Europe  

Anu M. Kaskela, Susanna Kihlman, Aarno T. Kotilainen, Joonas Wasiljeff, and EMODnet Geology network

EMODnet Geology, one of the thematic pillars of the European Marine Observation and Data Network (EMODnet), harmonises and delivers pan-European geoscientific data to support sustainable marine management. Since its launch in 2009, EMODnet Geology has successfully integrated diverse marine geological datasets covering seabed substrate, sedimentation rates, seafloor geology, coastal behaviour, geological events, marine minerals, and submerged landscapes into harmonised data products accessible via the EMODnet Portal: https://emodnet.ec.europa.eu/en. The thematic network spans the European regional seas and extends into the Caribbean Sea.

EMODnet Geology focuses on delivering harmonised data products (e.g., thematic maps) while providing metadata links to original data providers. By transforming fragmented datasets into standardized, interoperable products, it supports maritime spatial planning, environmental assessments, and sustainable resource management. The project also facilitates third-party data contributions via direct submission or through EMODnet Data Ingestion, engaging both public and private sector data holders.

A new project phase (September 2025–September 2027), coordinated by the Geological Survey of Finland GTK and executed by a consortium of 39 organisations from EuroGeoSurveys and other expert institutions, introduces significant enhancements in thematic coverage and data quality. These developments include the compilation of novel datasets on the organic carbon content of seabed sediments, carbon-14 measurements of strata, geotechnical properties of the seabed, and flora and fauna of submerged landscapes. In addition, the network continues updating its existing data products with new and refined data. EMODnet Geology also contributes to the European Digital Twin Ocean (EDITO) by supporting the development of a shared, cloud-based data lake and enabling next-generation digital ocean applications.

EMODnet Geology, along with other EMODnet thematics: bathymetry, biology, chemistry, human activities, physics, and seabed habitats, provides open-access, FAIR in situ marine data and data products all accessible via the EMODnet Portal. These datasets support a wide range of scientific, policy, and industrial applications.

The current EMODnet Geology phase is funded by The European Climate, Environment and Infrastructure Executive Agency (CINEA) through contract CINEA/EMFAF/2024-25/3.6/4500124305 for European Marine Observation and Data Network (EMODnet) - Lot2/CINEA/2024/OP/0006 (Geology).

How to cite: Kaskela, A. M., Kihlman, S., Kotilainen, A. T., Wasiljeff, J., and network, E. G.: EMODnet Geology continues to advance marine geological data for Europe , EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-10753, https://doi.org/10.5194/egusphere-egu26-10753, 2026.

Building implicit 3D geological models requires the detailed integration of diverse data sources, including legacy drill logs, technical reports, and stratigraphic descriptions. While this process is fundamental to understanding the subsurface, the manual translation of unstructured text into quantitative model inputs is a time-intensive task. Large Language Models (LLMs) offer promising capabilities to assist in processing data presented as text, but their application requires rigorous control to ensure geological validity. We present an ongoing research project developing a Human-in-the-Loop (HITL) workflow that leverages a collaborative human-AI approach to structure raw descriptions into inputs that will be used for implicit modeling.

The proposed workflow grounds the LLM in a formal axiom-based reasoning framework designed to minimize hallucinations and ensure consistency. The process begins with entity extraction, where the LLM parses depths and lithological descriptions from raw logs, followed by an axiomatic reasoning phase where units are categorized based on standardized rules (e.g., the Lithotectonic Framework). Crucially, the workflow integrates a dedicated validation interface that empowers geologists to go beyond simple verification. Experts use this environment to contextualize interpretations, test different stratigraphic hypotheses, and inject external knowledge such as fault definitions or regional correlations, before the structured output is finalized. This effectively translates text into the specific geometric parameters and interface points required to initialize the GemPy modeling engine.
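To make the entity-extraction and axiom-checking steps concrete, here is a minimal Python sketch (a hypothetical log format and a simple contiguity axiom; the actual workflow uses an LLM rather than a regular expression, and feeds GemPy rather than printing):

import re

RAW_LOG = """0.0-3.2 m: yellowish fine sand
3.2-10.5 m: clay with gravel
10.5-24.0 m: chalk, weathered at top"""

# Entity extraction: depth intervals and lithological descriptions.
pattern = re.compile(r"(?P<top>\d+\.?\d*)-(?P<base>\d+\.?\d*) m: (?P<description>.+)")
units = [m.groupdict() for m in pattern.finditer(RAW_LOG)]

# Axiomatic check: intervals must be contiguous and depth-increasing.
for upper, lower in zip(units, units[1:]):
    assert float(upper["base"]) == float(lower["top"]), "gap or overlap in log"

# Structured records, ready for expert validation and conversion to interface points.
for u in units:
    print(u["top"], u["base"], u["description"])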

We are applying this workflow to legacy data from the Campine Basin. The objective is to demonstrate how AI can function as a reliable assistant for data structuring, potentially reducing the time required for model initialization. Our workflow shifts the priority from slow data processing to critical validation; we aim to allow geologists to focus more on conceptual definitions and uncertainty analysis rather than data management. Ultimately, this research seeks to facilitate the creation of self-updating geological models that can continuously ingest and interpret new textual data as it becomes available. 

How to cite: Welkenhuysen, K., Rodriguez, J. D., and Piessens, K.: From Unstructured Geological Data to 3D Models: A Human-in-the-Loop LLM assisted Workflow for Automated Geological Model Building, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-10918, https://doi.org/10.5194/egusphere-egu26-10918, 2026.

EGU26-12149 | Posters on site | ESSI4.7

Revealing the Subsurface in 3D: A Series of Short Films Focusing on Recent Applications 

Philippe Calcagno, André Burnol, Séverine Caritg, Thomas Janvier, Simon Lopez, Marc Saltel, Anne-Sophie Serrand, Jean Fauquet, Bertrand Groc, Elsa Lievin, and Pierre Vassal

BRGM – the French Geological Survey – has launched an intriguing video series comprising seven episodes that reveal the subsurface in 3D, offering a unique perspective on its applications across various fields. Topics include water resources, geothermal energy, natural risk, mineral resources, anthropic risk, and geological knowledge and training, along with significant insights into methodologies and tools that have been developed.

Each episode is designed to provide perspectives into how these different areas benefit from advanced geological modelling. Scenarios highlight the value of the 3D approach both to describe geology and as a framework for simulating real-world processes. The stories are narrated from the perspective of practical applications, which makes them accessible and engaging for viewers. The collaboration with L’Esprit Sorcier TV enhances the production quality and ensures that complex information is presented in an engaging and accessible manner. Viewers can expect to see a blend of expert insights, practical applications, and captivating visuals, making the content both informative and enjoyable.

The episodes provide an essential resource for scientists, students, professionals and stakeholders in relation with the presented topics, and anyone interested in expanding their understanding of geology. By delving into real-world applications and contemporary issues, this series provides perspectives on how geological knowledge can inform better decision-making in various sectors.

Don’t miss the opportunity to explore these engaging episodes and renew your view of subsurface geology and its implications in our everyday lives.

This engaging series is freely available in French language with English subtitles on the BRGM’s YouTube channel:
https://www.youtube.com/playlist?list=PLfgMUGQz1vBPClcglLDF74GZrJQ0u6qrA.

Selection of references for the applications depicted in the series; more are available in the end credits of each episode:

  • Audion, A.S. BRGM report BRGM/RP-62718 (2013)
  • Burnol, A. et al. Remote Sensing 15, 2270 (2023), doi: 10.3390/rs15092270
  • Calcagno, P. et al. Phys. Earth Planet. Inter. 171, 147–157 (2008), doi: 10.1016/j.pepi.2008.06.013
  • Courrioux, G. et al. 17th IAMG Conf. proc., pp. 59–66 (2015)
  • Janvier, T. BRGM report BRGM/RP-73278 (2023)
  • Mas, P. et al. Sci. Data 9, 781 (2022), doi: 10.1038/s41597-022-01876-4
  • Saltel, M. et al. Hydrogeol. J. 30, 79–95 (2021), doi: 10.1007/s10040-021-02410-3

How to cite: Calcagno, P., Burnol, A., Caritg, S., Janvier, T., Lopez, S., Saltel, M., Serrand, A.-S., Fauquet, J., Groc, B., Lievin, E., and Vassal, P.: Revealing the Subsurface in 3D: A Series of Short Films Focusing on Recent Applications, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-12149, https://doi.org/10.5194/egusphere-egu26-12149, 2026.

EGU26-12405 | Posters on site | ESSI4.7

EMODnet Seafloor Geology: The wavy cruise towards a hierarchical, machine-readable geomorphology vocabulary 

Kristine Asch, Anett Blischke, Verner B. Ernsten, Bartal Hojgaard, Teresa Medialdea, Lis Mortensen, Dimitris Sakellariou, Paul Heckmann, Maike Schulz, Alexander M. Müller, and the EMODnet Geology network

The European EMODnet Geology project started in 2009. One of its aims is to provide geological map data of the European seas, harmonised as far as possible and made available according to FAIR data principles.

The EMODnet Geology Workpackage “Seafloor Geology” is not only compiling map layers of the geology of the seafloor (Quaternary and pre-Quaternary) but is also mapping layers of the geomorphology of the European seas and beyond. Semantic and geometric harmonisation is essential to understand geological information across administrative (EEZ) boundaries. The main means of providing semantically harmonised data layers is a set of common, agreed-upon terms to describe a unit: a vocabulary.

To describe the characteristics of the seafloor geology, the vocabularies of the European INSPIRE Directive Data Specifications Geology (INSPIRE Thematic Working group Geology 2013) could be applied to describe the age, lithology and genesis (event environment, event process) of the marine geology.

While the INSPIRE vocabularies are comprehensive, they nevertheless lack terms to describe the marine geomorphological features. EMODnet Geology fills that gap and is developing hierarchical scientific vocabularies for marine geomorphology to describe the concepts to which geometrical descriptions (lines and polygons) can be linked. This controlled vocabulary consists of a hierarchical, machine-readable list of terms and definitions needed to describe the European seafloor geomorphological units.
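As an illustration of what “hierarchical and machine-readable” can mean in practice, a term and its broader concept could be expressed with SKOS via rdflib (a sketch with placeholder terms and a placeholder namespace; the abstract does not specify which encoding the project actually adopts):

from rdflib import Graph, Literal, Namespace
from rdflib.namespace import RDF, SKOS

GEOMORPH = Namespace("https://example.org/geomorphology/")  # placeholder namespace

g = Graph()
g.add((GEOMORPH.SandBank, RDF.type, SKOS.Concept))
g.add((GEOMORPH.SandBank, SKOS.prefLabel, Literal("sand bank", lang="en")))
g.add((GEOMORPH.SandBank, SKOS.definition,
       Literal("Elongate, submerged ridge built predominantly of sand.", lang="en")))
# One of possibly several broader concepts in a multiple hierarchy.
g.add((GEOMORPH.SandBank, SKOS.broader, GEOMORPH.DepositionalFeature))

print(g.serialize(format="turtle"))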

The process to set up vocabularies for the marine domain faces considerable challenges, such as:

  • Finding suitable terms and definitions
  • Avoiding duplication
  • Agreeing internationally on the terms and descriptions
  • Coping with obsolete and/or strictly regional terms
  • Considering multiple hierarchies

The presentation demonstrates the project’s approach to building pan-European applicable vocabularies to describe marine geomorphological features and presents use cases for their application.

How to cite: Asch, K., Blischke, A., Ernsten, V. B., Hojgaard, B., Medialdea, T., Mortensen, L., Sakellariou, D., Heckmann, P., Schulz, M., Müller, A. M., and network, T. E. G.: EMODnet Seafloor Geology: The wavy cruise towards a hierarchical, machine-readable geomorphology vocabulary, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-12405, https://doi.org/10.5194/egusphere-egu26-12405, 2026.

EGU26-13012 | ECS | Posters on site | ESSI4.7

Unravelling a fault-related footwall canyon feature along the Roer Valley Graben System, the Netherlands 

Selçuk Aksay, Maryke den Dulk, Johan ten Veen, and Susanne Nelskamp

The sedimentary basin fill of the Cenozoic Roer Valley Graben System (the Netherlands) has gone through multiple phases of tectonic deformation during the Alpine orogeny, resulting in a variety of extensional and compressional structures, syn-tectonic sedimentary features and a complex and multidirectional fault pattern. The characteristics of these features, such as lithological properties, associated faults and their geometries, are crucial in geological investigations that focus on energy transition studies and/or water management. The present study seeks to enhance the geological understanding of a complex syn- and post-kinematic sedimentary feature, resembling a canyon-shaped collapse structure that formed on a relay ramp along the northern graben shoulder. Particular emphasis will be on methodology, mapping results and understanding the role of inherited faults on its development.

Since the late twentieth century, the Geological Survey of the Netherlands (GDN-TNO) has played an important role in advancing scientific understanding of the country’s subsurface geology. A major accomplishment of the GDN-TNO is the creation of comprehensive, country-wide subsurface models, using numerous 2D and 3D seismic surveys of various vintages as well as a substantial number of exploratory wells and, more recently, the results of the SCAN (Seismic Campaign for Accelerating Geothermal Energy) program.

Past and recent systematic inspection of the GDN’s legacy data enables us to examine both the geometry (i.e. the shape and spatial arrangement) and mechanisms of faults and associated specific sedimentary features, such as hanging-wall collapse and accretionary channel infill structures, as well as a plausible sedimentary wedge downslope of the hanging wall. Combining this with the results of our subsurface geological models, we present the potential relevance of inherited tectonics and fault reactivation to the development of these syn- and post-kinematic sedimentary features in the subsurface of the Netherlands. 2D seismic data may not always be sufficient to constrain fault orientation and length. To address these challenges and improve the accuracy of our geological modelling approach, we will incorporate and present our findings from adjacent 3D seismic datasets combined with conceptualized tectonic diagrams and real-world analogues.

How to cite: Aksay, S., den Dulk, M., ten Veen, J., and Nelskamp, S.: Unravelling a fault-related footwall canyon feature along the Roer Valley Graben System, the Netherlands, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-13012, https://doi.org/10.5194/egusphere-egu26-13012, 2026.

When large areas of the UK were mapped over 100 years ago, priority was given to the identification of mineral resources. Many such ‘drift’ maps are therefore not consistent with modern scientific understanding, nor do they reflect current stakeholder interests. Surface and groundwater flooding represent a major hazard to homes, infrastructure, and land management across the Tweed catchment. Recent work by BGS Groundwater has indicated that slope deposits are far more widespread than previously identified and play a significant role in groundwater connectivity. Updating the superficial geology map across the ~5000 km² catchment is therefore critical for improving flood forecasting and for the design of a major baseline monitoring project, the Flood-Drought Research Infrastructure funded by NERC.

The Tweed Mapping Project applies spatial Random Forest models using DTM derivatives at 25 m resolution to predict twelve different deposit classes (e.g. till, alluvium, regolith, talus). Model training data are derived from detailed mapping surveys dated 2005, 2009 and 2012.   
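A minimal sketch of such a spatial Random Forest setup is given below (synthetic stand-in data; the feature names and class handling are assumptions, not the project’s configuration):

import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(42)

# One row per 25 m cell: assumed DTM derivatives (elevation, slope, curvature, TPI).
X = rng.normal(size=(5000, 4))
y = rng.integers(0, 12, size=5000)   # twelve deposit classes (e.g. till, alluvium)

# class_weight="balanced" is one (imperfect) answer to the sampling
# imbalance discussed below.
clf = RandomForestClassifier(n_estimators=300, class_weight="balanced", random_state=0)
clf.fit(X, y)
print(clf.predict(X[:3]))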

Initial results indicate that slope deposits have been under-mapped, with till being the dominant deposit predicted. Both over- and under-sampling are significant issues; sample-adjustment methods are unable to compensate. Minor deposits are therefore under-represented in model outputs.

Model outputs have been checked in the field in the Cheviot, Tweedsmuir and Galashiels areas during 2025. Geomorphological mapping, section logging, and bulk sampling of deposits are being used to provide up-to-date training data to enable more reliable and accurate model predictions. Outstanding issues include: (i) the absence of LiDAR data away from major river channels and settlements, (ii) over-representation of specific field observations, and (iii) limited geomorphological inputs to the model.

How to cite: Roberson, S.: The Tweed Mapping Project: machine learning methods for rapid Quaternary mapping, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-14282, https://doi.org/10.5194/egusphere-egu26-14282, 2026.

EGU26-14705 | ECS | Orals | ESSI4.7

Integrating geology-informed constraints into machine learning–based borehole interpretations for subsurface modelling: A case study from the Netherlands 

Sebastián Garzón, Willem Dabekaussen, Eva De Boever, Freek Busschers, Siamak Mehrkanoon, and Derek Karssenberg

Geological mapping and 3D subsurface modelling require consistent geological interpretations across large datasets with heterogeneous spatial coverage and information density. In the Netherlands, several subsurface models rely heavily on borehole lithological descriptions to map lithostratigraphic units and geological structures. Automated interpretation approaches based on machine learning (ML) are being developed to transfer expert geological interpretations to previously unseen boreholes, thereby increasing the number of interpreted boreholes that can be incorporated into subsurface models. However, existing neural network-based approaches for borehole interpretation often struggle to consistently respect the stratigraphic and spatial relationships derived from expert geological knowledge.  In practice, automated interpretations can produce stratigraphically inconsistent successions, with younger units incorrectly predicted to occur below older ones, or units appearing outside their known regional extent. This limitation stems from ML training objectives that prioritise local classification accuracy (e.g., categorical cross-entropy loss) over regional geological plausibility. 

To improve the geological plausibility of ML-generated interpretations, we introduce geology-informed loss functions that account for stratigraphic consistency and the spatial extent of lithostratigraphic units. The proposed loss functions are combined with a standard classification loss during model training on expert-interpreted boreholes and evaluated on previously unseen boreholes drawn from the same national dataset, comprising 7,500 boreholes in total. By varying the relative weight of each loss function during model training, we found that ML models trained with a combination of geology-informed loss functions and standard categorical cross-entropy substantially reduce geologically implausible stratigraphic transitions, increasing the proportion of stratigraphically consistent transitions from approximately 90% to up to 95%, and making fewer predictions of lithostratigraphic units outside their known regional extent.  These improvements in geological plausibility do not lead to a noticeable change in overall classification accuracy (≈ 75% across different loss-weight combinations). Incorporating geology-informed training objectives, therefore, provides a practical way to improve the plausibility and consistency of automated borehole interpretations used in large-scale subsurface modelling workflows.
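A minimal PyTorch sketch of one such geology-informed term is given below (our reconstruction of the idea, not the authors’ exact loss): for a depth-ordered borehole with unit indices running from youngest (0) to oldest, it penalises the predicted probability that a deeper sample belongs to a younger unit than the sample above it.

import torch
import torch.nn.functional as F

def stratigraphic_order_penalty(logits):
    # logits: (n_samples, n_units); samples ordered top->bottom,
    # unit indices ordered youngest->oldest.
    p = logits.softmax(dim=-1)
    upper, lower = p[:-1], p[1:]
    n = p.shape[-1]
    # mask[a, b] = 1 where the deeper unit b is younger than the unit a above it
    younger_below = torch.tril(torch.ones(n, n), diagonal=-1)
    joint = upper.unsqueeze(2) * lower.unsqueeze(1)   # (n_samples - 1, n, n)
    return (joint * younger_below).sum(dim=(1, 2)).mean()

def geology_informed_loss(logits, targets, weight=0.5):
    # Standard categorical cross-entropy plus the weighted consistency term.
    return F.cross_entropy(logits, targets) + weight * stratigraphic_order_penalty(logits)

logits = torch.randn(20, 8, requires_grad=True)   # 20 depth samples, 8 units
targets = torch.randint(0, 8, (20,))
print(geology_informed_loss(logits, targets))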

How to cite: Garzón, S., Dabekaussen, W., De Boever, E., Busschers, F., Mehrkanoon, S., and Karssenberg, D.: Integrating geology-informed constraints into machine learning–based borehole interpretations for subsurface modelling: A case study from the Netherlands, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-14705, https://doi.org/10.5194/egusphere-egu26-14705, 2026.

Accurate characterisation of planetary surface topography and reflectance at metre and sub-metre scales is critical for geological interpretation, understanding regolith processes, and supporting surface exploration. We present LUMOS (LUminosity-constrained Multi-angular Observation Super-resolution), a physics-based framework for the joint reconstruction of super-resolution digital elevation models (DEMs), spatially varying surface reflectance, and uncertainty estimates from multi-angular orbital imagery. The method overcomes key limitations of classical shape-from-shading approaches, which typically assume Lambertian reflectance and provide no uncertainty quantification.

Figure 1: Reconstructed terrain centred on the Apollo 15 landing site. (a) LOLA elevation map at its native resolution. (b) LUMOS-derived DEM shown in nadir view. (c, d) Oblique views of the LUMOS DEM.

LUMOS formulates surface reconstruction as a Bayesian inverse problem that explicitly couples topography and photometry. Observed radiance is modelled using a non-Lambertian, kernel-driven bidirectional reflectance distribution function (BRDF), adopting the Ross–Thick Li–Sparse (RTLS) formulation to represent isotropic, volumetric, and geometric scattering effects. This enables physically consistent treatment of anisotropic regolith scattering, shadowing, and viewing-geometry dependence. A low-resolution laser altimetry DEM is incorporated as a prior to constrain long-wavelength topography, while fine-scale surface structure is recovered from photometric variations across multiple illumination and viewing angles. The coupled system is solved efficiently using a Sylvester-equation-based formulation, avoiding empirical tuning parameters and allowing uncertainties in image radiance and prior information to propagate into the final products.
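For readers unfamiliar with the RTLS model, the forward reflectance calculation is a linear combination of three kernels; a compact numpy sketch follows (standard kernel formulations with the usual h/b = 2, b/r = 1 shape parameters; simplified relative to the full LUMOS forward model):

import numpy as np

def ross_thick(ts, tv, phi):
    """Volumetric (RossThick) kernel; angles in radians."""
    cos_xi = np.cos(ts) * np.cos(tv) + np.sin(ts) * np.sin(tv) * np.cos(phi)
    xi = np.arccos(np.clip(cos_xi, -1.0, 1.0))
    return ((np.pi / 2 - xi) * np.cos(xi) + np.sin(xi)) / (np.cos(ts) + np.cos(tv)) - np.pi / 4

def li_sparse(ts, tv, phi, hb=2.0, br=1.0):
    """Geometric (LiSparse-Reciprocal) kernel; angles in radians."""
    ts, tv = np.arctan(br * np.tan(ts)), np.arctan(br * np.tan(tv))
    cos_xi = np.cos(ts) * np.cos(tv) + np.sin(ts) * np.sin(tv) * np.cos(phi)
    sec_s, sec_v = 1 / np.cos(ts), 1 / np.cos(tv)
    d2 = np.tan(ts) ** 2 + np.tan(tv) ** 2 - 2 * np.tan(ts) * np.tan(tv) * np.cos(phi)
    cos_t = hb * np.sqrt(d2 + (np.tan(ts) * np.tan(tv) * np.sin(phi)) ** 2) / (sec_s + sec_v)
    t = np.arccos(np.clip(cos_t, -1.0, 1.0))
    overlap = (t - np.sin(t) * np.cos(t)) * (sec_s + sec_v) / np.pi
    return overlap - sec_s - sec_v + 0.5 * (1 + cos_xi) * sec_s * sec_v

def rtls_reflectance(f_iso, f_vol, f_geo, ts, tv, phi):
    """Modelled reflectance for sun zenith ts, view zenith tv, relative azimuth phi."""
    return f_iso + f_vol * ross_thick(ts, tv, phi) + f_geo * li_sparse(ts, tv, phi)

print(rtls_reflectance(0.1, 0.05, 0.02, np.radians(40), np.radians(10), np.radians(90)))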

Figure 2: Slope uncertainty map. Uncertainty increases in shadowed regions and where viewing geometry is limited.

We demonstrate LUMOS using multi-angular LROC NAC observations of the Apollo 15 landing site. The reconstructed DEM achieves a spatial resolution of 0.53 m/pixel, corresponding to the native resolution of the NAC imagery and representing a more than two-order-of-magnitude increase in sampling density relative to the Lunar Orbiter Laser Altimeter (LOLA) prior. Large-area comparisons show that the LUMOS DEM preserves consistency with LOLA-derived long-wavelength trends while resolving fine-scale morphological features, including small craters, subtle relief variations, and local undulations unresolved in altimetric data. Detailed views further illustrate surface continuity and the absence of illumination-correlated artefacts.

Beyond elevation, LUMOS retrieves spatially resolved reflectance parameters and provides pixel-wise uncertainty estimates for both elevation and slope. Derived slope maps reveal metre-scale variations sensitive to reflectance modelling assumptions, with Lambertian-based reconstructions exhibiting systematic biases relative to the RTLS solution. These differences have implications not only for operational assessments, such as landing-site hazard evaluation, but also for scientific interpretation of small-scale morphology, regolith roughness, and slope-controlled geological processes.

The LUMOS framework is constrained primarily by observational resolution rather than algorithmic limitations. While the present results are bounded by the resolution of available NAC data, the methodology directly benefits from higher-resolution, multi-angular observations. As such, LUMOS constitutes a cornerstone of the ESA Máni mission (Phase A), which aims to acquire dense multi-angular imagery at spatial resolutions of approximately 0.17–0.2 m/pixel. Applied to Máni data, LUMOS is expected to further enhance topographic fidelity, reflectance characterisation, and uncertainty-aware surface mapping.

How to cite: Fernandes, I., Mosegaard, K., and Schmidt, F.: Physics-Informed Joint Super-Resolution Topography and Reflectance Inversion From Multi-Angular Planetary Imagery — The LUMOS Framework, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-15157, https://doi.org/10.5194/egusphere-egu26-15157, 2026.

Geologic maps are undergoing a paradigm shift from static illustrations to dynamic, intelligent knowledge platforms. Traditionally, geologic maps have served specialized fields in fixed image formats. However, their closed information systems, weak interactivity, and difficulties in cross-domain integration have limited the full realization of their value. In recent years, advancements in Artificial Intelligence (AI) and Multimodal Large Language Models (MLLMs) have provided a new pathway for the digital reconstruction and intelligent application of geologic maps.

In collaboration with Microsoft Research Asia, our project team has proposed and constructed an open, extensible intelligent platform for geologic map comprehension and service. This platform is based on high-quality digitized geologic map datasets and utilizes MLLMs to achieve semantic parsing, knowledge association, and natural language interaction with geologic maps. The established platform not only supports the accurate identification and extraction of fundamental map elements (such as legends, lithology, and structures) but also enables the following multi-level application scenarios:

  • Intelligent Interaction and Q&A: Users can directly query geologic information using natural language—for example, "What faults are distributed in this area?" or "What is the formation age of a certain rock layer?" The system generates accurate answers by integrating graphic-text information and domain knowledge.
  • Scientific Research and Educational Tools: It provides an interactive, annotatable interface for geologic map learning, supporting classroom teaching, professional training, and interdisciplinary research.

The platform is supported by core technologies including the first-ever multimodal benchmark for geologic map understanding, GeoMap-Bench, and the intelligent agent framework, GeoMap-Agent, which significantly outperforms general-purpose vision-language models on multiple tasks. Geologic maps are no longer merely "base maps" or "reference maps"; they have become an intelligent knowledge base connecting geologic data, professional expertise, and multi-domain applications.

Looking ahead, the geologic map platform will further integrate real-time sensor data, remote sensing information, and socio-economic factors, driving the earth sciences towards a new era characterized by openness, collaboration, and intelligence. It is poised to play a central role in scientific discovery, engineering safety, sustainable resource utilization, and the building of societal resilience.

How to cite: Song, Y. and Huang, Y.: The Geologic Map Intelligent Platform: AI-Enabled Digital Transformation and Building a Multimodal Application Ecosystem, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-16524, https://doi.org/10.5194/egusphere-egu26-16524, 2026.

EGU26-16620 | ECS | Posters on site | ESSI4.7

Geodiversity and Seafloor Substrate Mapping to Support Marine Management in the Åland Islands, Baltic Sea – Results from the Biodiversea LIFE IP Project  

Satu Virtanen, Sami Jokinen, Anu Kaskela, Meri Sahiluoto, Antti Sainio, and Nikolas Sanila

The Biodiversea LIFE IP project (2021–2029) is Finland’s largest coordinated initiative to safeguard the biodiversity of the Baltic Sea and promote the sustainable use of its marine environment. The Geological Survey of Finland GTK conducted marine geological surveys around the Åland Islands to support informed marine management and conservation.

The work combined seismo-acoustic methods, including subbottom profiling, multibeam echosounder, and sidescan sonar, with extensive surface sediment sampling. These surveys produced detailed information on seabed geodiversity, sediment distribution, and substrate types, indicating a highly geologically diverse seafloor around the Åland Islands. The resulting datasets improve our understanding of the physical and geological properties of the seafloor, which form the foundation for biodiversity and habitat development in the area.

We describe the geological setting, the applied survey methods, and the contribution of geoscientific information to multidisciplinary marine conservation planning. The results highlight the importance of geological data for understanding marine ecosystems and for supporting science-based decision-making in marine management.

The Biodiversea LIFE IP project is coordinated by Metsähallitus. In addition to GTK, project partners include the Baltic Sea Action Group (BSAG), Finnish Environment Institute (SYKE), Ministry of the Environment, Natural Resources Institute Finland (Luke), Turku University of Applied Sciences, Åbo Akademi University, and the Åland Provincial Government. The project has received funding from the LIFE Programme of the European Union. The material reflects the views of the authors, and the European Commission or CINEA is not responsible for any use that may be made of the information it contains.

How to cite: Virtanen, S., Jokinen, S., Kaskela, A., Sahiluoto, M., Sainio, A., and Sanila, N.: Geodiversity and Seafloor Substrate Mapping to Support Marine Management in the Åland Islands, Baltic Sea – Results from the Biodiversea LIFE IP Project , EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-16620, https://doi.org/10.5194/egusphere-egu26-16620, 2026.

EGU26-16704 | Orals | ESSI4.7

An international vocabulary for anthropogenic deposits to improve geological mapping and modelling 

Cecile Le Guern, Jeroen Schokker, Urszula Stępień, Jan Walstra, Paul Heckmann, Kristine Asch, and Hans-Georg Krenmayr

Anthropogenic deposits are widespread in various environments. Some consist of displaced natural materials, and others of anthropogenic (human-made) materials, or they contain a mixture of both. Human-made materials include demolition materials (such as concrete), industrial waste and by-products (e.g., slags), mining residues, and domestic waste. Excavated soils and dredged sediments are examples of displaced natural materials. Anthropogenic deposits can be linked to hazards like geotechnical instability and contamination, potentially resulting in health and environmental risks (e.g., to soil, water, biodiversity, stable site foundation) with associated economic, legal, and social impacts. On the other hand, some deposits can represent valuable resources. Former mining or urban deposits, for example, may contain extractable amounts of critical raw materials (CRM). They may also be reused during land development or hold geoheritage value, such as in the case of prehistoric burial constructions. However, our knowledge of anthropogenic deposits is still poor. Improving their representation in geological maps and models is therefore crucial. Against this background, the European GSEU project is developing a set of coordinated vocabularies to standardise the description of anthropogenic deposits.

Existing national and international vocabularies and definitions were collected and compiled into a comprehensive list. In parallel, a conceptual data model was developed as a basis to systematically organise and classify the terms. This made it possible to establish hierarchical lists of terms that structure the vocabularies and provide space for additional information on anthropogenic deposits, such as their purpose and geometry. A coherence and consistency check between the various vocabulary lists was conducted to ensure alignment across all terms. Real-world examples (use cases) of anthropogenic deposits were used to test the effectiveness and relevance of the vocabularies.

A “lithology-based” approach was chosen to describe anthropogenic deposits. The terms for displaced natural materials originate from the lithology vocabulary, which is being compiled in parallel within the GSEU project. For human-made materials an existing classification from materials science is used, with some adaptations and additions. The set of vocabularies includes additional attribute lists linked to the origin of the materials present in the deposit, the original purpose of the deposit, the shape of the deposit, as well as its environment (natural, anthropic). The selected use cases cover various situations (former landfill, redevelopment area, archaeological site, mine tailing, industrial residue, reclaimed land) in several environments (urban, rural, mining, industrial, coastal and fluvial environment). The associated environmental and social issues include sanitary aspects linked to soil pollution, surficial and groundwater quality, geotechnical stability (vulnerability to collapse, landslide, ground subsidence, erosion, etc.), and cultural heritage.

The developed scientific vocabularies dedicated to anthropogenic deposits are designed for use with multiscale spatial geological datasets in both 2D and 3D formats. These can be integrated within geological maps and 3D models to support various applications, such as spatial planning, area development, resource extraction, and risk management. The final hierarchical lists of terms will be delivered for implementation in EGDI, the European platform to share, integrate and access geological data.

How to cite: Le Guern, C., Schokker, J., Stępień, U., Walstra, J., Heckmann, P., Asch, K., and Krenmayr, H.-G.: An international vocabulary for anthropogenic deposits to improve geological mapping and modelling, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-16704, https://doi.org/10.5194/egusphere-egu26-16704, 2026.

EGU26-16761 | Posters on site | ESSI4.7

Harmonized seabed substrate datasets and insights from EMODnet Geology 

Susanna Kihlman, Anu Marii Kaskela, Aarno Tapio Kotilainen, and Joonas Wasiljeff and the EMODnet Geology network

Human activities and increasing pressures on marine and coastal environments have highlighted the need for accessible, reliable, and harmonized marine information. Since 2009, the EMODnet (European Marine Observation and Data Network) Geology project has been collecting and harmonizing geological data from all European sea areas as well as the Caspian and Caribbean Seas. This work, currently carried out in collaboration with 39 partners and subcontractors, has focused on creating cross-boundary, multiscale datasets from scattered and heterogeneous sources for diverse applications.

Seabed substrate is one of the main parameters describing the marine environment. The project addresses seabed substrates and related characteristics, and over the years EMODnet Geology has developed several data products, such as harmonized seabed substrate maps based on sediment grain size, sedimentation rate datasets, and a seabed erosion index derived from the literature. These products have evolved, incorporating additional attributes like seabed surface features (e.g., seagrass meadows, bioclastic bottoms, ferromanganese concretions) and confidence assessments to improve usability and usefulness.

Building on this foundation, the latest phase of the project introduces new data additions to the data catalogue. One of the additions to complement existing sedimentary information is organic carbon data, which is essential for understanding carbon cycling, climate regulation, and ecosystem health. At the same time, we have initiated work on identifying and classifying sedimentary environments within national datasets to better capture dynamic processes and environmental variability, and to support modelling and interpretation of marine systems. Basic work on these new datasets is underway, and we are in the early stages of method development to integrate this new information.

After more than fifteen years, EMODnet Geology has established itself as one of the main providers of publicly available, harmonized in situ seabed data. Continued development, both updating existing products and introducing new datasets, will ensure the relevance of this information for addressing future challenges in marine and coastal management and research.

The EMODnet Geology project is funded by The European Climate, Environment and Infrastructure Executive Agency (CINEA) through contract CINEA/EMFAF/2024-25/3.6/4500124305 for European Marine Observation and Data Network (EMODnet) - Lot2/CINEA/2024/OP/0006 (Geology)

How to cite: Kihlman, S., Kaskela, A. M., Kotilainen, A. T., and Wasiljeff, J. and the EMODnet Geology network: Harmonized seabed substrate datasets and insights from EMODnet Geology, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-16761, https://doi.org/10.5194/egusphere-egu26-16761, 2026.

EGU26-16992 | ECS | Posters on site | ESSI4.7

PyMeshIt: An Open-Source Python modelling engine in PZERO and a standalone software for Conforming Tetrahedral Mesh Generation 

Waqas Hussain, Mauro Cacace, Andrea Bistacchi, and Riccardo Monti

Three-dimensional geological models can be used simply to provide a visual representation of complex subsurface structures; however, when they are used to define the geometry and properties of bodies in downstream numerical simulations (e.g., geothermal, geomechanical, and/or fluid-flow simulations), their application is limited by the difficulty of generating computational meshes that preserve the geological topology. In particular, intersecting faults, unconformities, and stratigraphic contacts present challenges because numerical simulations require watertight models, with consistently defined surface intersections that leave no ambiguity regarding the attribution of a given 3D region to a given closed volume. Generating watertight models and meshes is thus the critical step that quite often hinders practical downstream applications of geological models.

We present PyMeshIt (https://github.com/waqashussain117/PyMeshit), a pure-Python open-source modelling engine that addresses this bottleneck by automating the generation of conforming tetrahedral meshes from complex geological interpretations. PyMeshIt is available both as a standalone application and as an integrated meshing engine within the PZero geological modelling platform (https://github.com/gecos-lab/PZero), supporting a wide range of geomodelling workflows without imposing assumptions on downstream simulations.

PyMeshIt implements an interactive multistage workflow that supports point clouds, triangulated surfaces, well trajectories, and model boundaries. The central focus of the software is the explicit preservation of geological/topological relationships during meshing. Surface-surface and polyline-surface intersections are computed automatically, producing intersection polylines that trace fault cutoffs, unconformity truncations, and formation contacts. Locations where three or more geological features converge are identified as triple points and are retained as topological constraints. These intersections and junctions are used as constraints during surface reconstruction and volumetric meshing to ensure that element faces align with the geological boundaries in the final mesh.
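The role such intersection polylines play can be illustrated with a generic PyVista snippet (a concept sketch only: PyVista is not PyMeshIt, and the two planes merely stand in for a horizon and a dipping fault):

import pyvista as pv

# Two triangulated surfaces standing in for a horizon and a dipping fault.
horizon = pv.Plane(center=(0, 0, 0), direction=(0, 0, 1), i_size=2, j_size=2).triangulate()
fault = pv.Plane(center=(0, 0, 0), direction=(1, 0, 1), i_size=2, j_size=2).triangulate()

# The intersection polyline traces the fault cutoff; the split surfaces now
# share nodes along it, which is exactly the kind of constraint a conforming
# tetrahedral mesher must honour.
cutoff, horizon_split, fault_split = horizon.intersection(fault)
print(cutoff.n_points, "points along the intersection polyline")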

Material regions are assigned through interactive seed-point placement, allowing tetrahedral volumes to be consistently attributed to geological units. The output formats include VTK/VTU for visualisation, STL for CAD applications, and EXODUS II for numerical modelling frameworks. When used within PZero, PyMeshIt directly accesses the geological model entities without intermediate file conversion, preserving pre-triangulated geometries and allowing the possibility of creating geological interpretations within a single framework, thereby ensuring a complete open-source workflow from geological interpretation and modelling to meshing.

How to cite: Hussain, W., Cacace, M., Bistacchi, A., and Monti, R.: PyMeshIt: An Open-Source Python modelling engine in PZERO and a standalone software for Conforming Tetrahedral Mesh Generation, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-16992, https://doi.org/10.5194/egusphere-egu26-16992, 2026.

Geological mapping of the 25 × 25 km Torma 1:50,000 map sheet is challenged by:

  • the crossing of the Ordovician–Silurian carbonates boundary,
  • Devonian siliciclastic rocks overlapping parts of the area,
  • alternating Quaternary cover of primarily glacial origin.

The bedrock geology is further complicated by a north–south oriented facies transition within the Ordovician succession, from relatively shallow carbonate facies towards deeper facies. Drilling-based constraints are limited: historical borehole information is sparse, descriptions are too general and locally conflicting, and available cores are of insufficient quality for reliable stratigraphic control. To improve geological understanding within restricted budgets, we selected towed time-domain electromagnetics (tTEM) as a rapid data-acquisition method for regional-scale mapping.

We report results from over 100 km of tTEM profiling, acquired predominantly with a 3 × 3 m 1-turn transmitter configuration. Data were collected primarily along unpaved roads, smaller roads, and paths, complemented by targeted measurements on selected fields. This mixed acquisition strategy produces strongly variable lateral sampling density and enables an assessment of how survey geometry and data coverage influence interpretational confidence. Road-based acquisition enables rapid spatial coverage but with lower effective lateral resolution compared to field grids, and introduces additional noise and artefacts related to infrastructure. While mapped utilities can be considered during planning, abandoned cables and scattered ferrous objects (e.g., signs, posts, culverts) create intermittent interference that must be identified and mitigated during processing and interpretation.

Preliminary results do not support the presence of a large buried valley previously inferred from multiple older (now lost) drill cores; this is consistent with nearby seismic lines at the reported locations. Across most of the area, tTEM provides the most continuous constraint on Quaternary thickness, and field-based segments resolve internal variability sufficiently to discriminate between different Quaternary units with higher resistivity contrasts, providing a new tool for Quaternary mapping in Estonia as well. Bedrock-related contrasts are detectable in parts of the survey area, but not consistently across all geological situations. Thickness estimates of the uppermost bedrock units correlate well with drill-core control where available, yet indicate substantially higher spatial variability elsewhere than expected from existing conceptual models.

The dataset highlights the areas where drilling remains necessary to resolve key ambiguities, while providing a markedly improved basis for defining regional trends and constructing geological models and updated maps in a complex carbonate–siliciclastic setting.

How to cite: Ani, T. and Kuusk, C.: How much tTEM coverage is enough to trust a geological interpretation? Evidence from mixed road/field-based data acquisition across the Ordovician–Silurian boundary and Devonian cover in Estonia, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-17117, https://doi.org/10.5194/egusphere-egu26-17117, 2026.

EGU26-17659 | ECS | Posters on site | ESSI4.7

Application of multispectral Sentinel-2 images for Geo-environmental terrain classification mapping based on landforms: an example of the Campo de Cartagena, SE Spain 

Indira Rodríguez, Pablo Valenzuela, Eduardo García-Meléndez, Inés Pereira, and Montserrat Ferrer-Julià

Terrain classification through Terrain Mapping Units (TMUs) consists of the definition of homogeneous relief units that integrate different aspects of the natural environment (geology, geomorphology, drainage, land use, vegetation, etc.), providing a solid basis for multidisciplinary studies focused on aspects such as mining, geotechnics, natural-hazard analysis and environmental assessment, among others. This approach may be of particular interest in countries that lack a comprehensive geological and geomorphological mapping infrastructure, providing a basic characterization of their main geographical, geological and environmental features. Currently, the wide variety of available remote sensing products constitutes an advantage when tackling this type of cartography.

The main goal of this study is to evaluate the usefulness of freely available remote sensing products, accessible online on a global scale, for producing TMUs. To achieve this goal, a combined analysis of several remote sensing products was carried out for the Campo de Cartagena (SE Spain), a semi-arid and heavily anthropized area including the Mar Menor lagoon, the Neogene and Quaternary detrital deposits from the Campo de Cartagena plain and the surrounding mountain ranges, formed by Palaeozoic, Permian and Triassic metamorphic rocks.

The remote sensing products used are: (1) a digital elevation model (DEM) with a spatial resolution of 30 m, derived from the Shuttle Radar Topography Mission (SRTM, NASA), and (2) a multispectral Sentinel-2 dataset with spatial resolutions of 10 and 20 m. On this basis, two TMU maps of different spatial resolution were developed and compared to test their capabilities for mapping purposes: (1) based on the 30 m DEM and the Sentinel-2 bands at 20 m spatial resolution, and (2) based on the 30 m DEM and the Sentinel-2 bands at 10 m spatial resolution. Processing the DEM in a Geographic Information System (GIS) resulted in hillshade, slope and flow-accumulation models, which were used to characterise the main topographic features. In addition, the combination of different spectral bands and the application of digital image processing techniques enabled the identification of differences in surface composition. Based on these observations, homogeneous TMUs were delineated according to three main criteria: (1) relief, (2) drainage network and (3) surface composition variability. Accuracy analysis and validation were carried out through fieldwork observations and by comparing the resulting terrain classification map with the existing geological and geomorphological maps at 1:50,000 scale from the Spanish Geological Survey (IGME). This study highlights the potential of freely available remote sensing products, accessible online on a global scale, for mapping TMUs in an area affected by intense agricultural and mining activities.
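As an indication of the kind of DEM processing involved, a generic numpy sketch of slope and hillshade follows (the study used standard GIS tools; aspect and illumination conventions vary between packages):

import numpy as np

def slope_and_hillshade(dem, cellsize=30.0, azimuth=315.0, altitude=45.0):
    """Slope (degrees) and hillshade (0-1) from a DEM array with square cells."""
    dzdy, dzdx = np.gradient(dem, cellsize)
    slope = np.arctan(np.hypot(dzdx, dzdy))
    aspect = np.arctan2(-dzdx, dzdy)
    az, alt = np.radians(azimuth), np.radians(altitude)
    shade = np.sin(alt) * np.cos(slope) + np.cos(alt) * np.sin(slope) * np.cos(az - aspect)
    return np.degrees(slope), np.clip(shade, 0.0, 1.0)

dem = np.outer(np.linspace(0, 300, 64), np.ones(64))   # synthetic 30 m DEM tile
slope_deg, hillshade = slope_and_hillshade(dem)
print(slope_deg.max().round(1))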

Acknowledgements: Research Project PID2023-150229OB-100 (HYPERLANDFORM) financed by MICIU/AEI/10.13039/501100011033 and by FEDER, UE. The participation of Inés Pereira was supported by an FPU (FPU21/04495) contract from the Spanish Ministry of Universities.

How to cite: Rodríguez, I., Valenzuela, P., García-Meléndez, E., Pereira, I., and Ferrer-Julià, M.: Application of multispectral Sentinel-2 images for Geo-environmental terrain classification mapping based on landforms: an example of the Campo de Cartagena, SE Spain, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-17659, https://doi.org/10.5194/egusphere-egu26-17659, 2026.

EGU26-17722 | Posters on site | ESSI4.7

What helps and what hurts tailored AI in geological modelling: beyond the hype, evidence from data-scarce shallow geothermal modelling in Cyprus 

Bartlomiej Ciapala, Evangelos Papaefthymiou, Lazaros Aresti, Dimitris Pasias, Dimitrios Graikos, Georgios A. Florides, and Paul Christodoulides

Artificial intelligence is often expected to revolutionise geological modelling, but in practice its performance is strongly controlled by how geological information is collected, encoded and constrained, and by how well the AI workflow is tailored to the task. In this contribution we analyse what helps and what hurts AI-based geological modelling under data-scarce conditions, using shallow geothermal modelling in Cyprus as a testbed.

Within the WAGEs project on shallow geothermal energy, we compiled borehole profiles from across Cyprus, harmonising heterogeneous lithological descriptions into a simplified but consistent scheme and linking them to tectonic units and basic spatial information. Classical, off-the-shelf neural-network approaches performed poorly on this limited and noisy dataset, highlighting the vulnerability of generic architectures to inconsistent lithological classifications and incomplete metadata.

We therefore developed a tailored, sequence-based machine-learning workflow in which each borehole is encoded as a one-dimensional string combining depth-ordered lithologies, tectonic context, and location. A supervised learning algorithm was trained on existing boreholes and tested on independent control sites. In phase-one experiments, the model reached about 85% accuracy when the two top-ranked predicted lithological profiles were considered over the full borehole depth. This metric was chosen because some rock types are easily misclassified (marl versus chalk) or ambiguously interpreted (decayed rock at the surface may be logged as rock, soil or surface deposit). The algorithm's skill was highest where lithological contrasts were strong, while more gradational successions remained difficult to distinguish. The model also showed a partial ability to infer the presence of faults from lithological patterns, although it was neither designed to localise them nor supplied with relevant information.
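
A minimal sketch of the two ingredients described above, assuming hypothetical field names and unit labels: a depth-ordered string encoding for a borehole, and the top-two accuracy used as the evaluation metric. This is an illustration, not the WAGEs implementation.

def encode_borehole(lithologies, tectonic_unit, easting, northing):
    # lithologies: [(top_depth_m, lithology_code), ...], ordered downwards.
    tokens = [f"{depth}:{code}" for depth, code in sorted(lithologies)]
    return f"{tectonic_unit}|{easting}|{northing}|" + "-".join(tokens)

def top2_accuracy(ranked_profiles, true_profiles):
    # ranked_profiles: per control site, predicted profiles sorted by confidence.
    hits = sum(truth in preds[:2]
               for preds, truth in zip(ranked_profiles, true_profiles))
    return hits / len(true_profiles)

# Example: one borehole in a hypothetical tectonic unit.
s = encode_borehole([(0, "soil"), (3, "marl"), (12, "chalk")],
                    "TroodosUnit", 512345, 3862100)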

From this case study we distil the key factors that help tailored AI-based geological modelling (standardised, information-rich lithological logs; task-specific encoding that reflects geological settings; explicit tectonic context) and those that hurt it (the lack of an identification protocol; inconsistent rock descriptions; loss of detail during digitisation). Our results indicate that robust AI-based geological modelling does not necessarily require massive datasets, as long as the available information is consistent and well structured. However, in data-scarce settings the main ceiling for AI performance is informational rather than algorithmic: more complex models add little once the underlying geological description is noisy or underspecified. In practice, tailored workflows are most powerful as tools for scenario ranking and for identifying where additional boreholes or geophysical surveys would most effectively reduce subsurface uncertainty, rather than as engines for fully automatic geological models. We conclude that the community should treat AI primarily as a tool for rapid, big-picture or illustrative geological modelling and for stress-testing geological knowledge. Its main value lies in exposing gaps in our subsurface descriptions (including quantitative uncertainty estimates), rather than in providing a shortcut that can replace careful geological thinking.

How to cite: Ciapala, B., Papaefthymiou, E., Aresti, L., Pasias, D., Graikos, D., Florides, G. A., and Christodoulides, P.: What helps and what hurts tailored AI in geological modelling: beyond the hype, evidence from data-scarce shallow geothermal modelling in Cyprus, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-17722, https://doi.org/10.5194/egusphere-egu26-17722, 2026.

EGU26-19869 | Orals | ESSI4.7

Filling in the white ribbon – Airborne lidar bathymetry and RGB imaging in combination with ROV video imaging and seabed sampling for seabed nature mapping in the coastal zone (Danish waters) 

Verner Brandbyge Ernstsen, Mikkel Skovgaard Andersen, Lars Øbro Hansen, Isak Ring Larsen, Nina Lei Juul Nielsen, Carlette Neline Blok, and Zyad Al-Hamdani

The shallow water nearshore area is often referred to as the white ribbon due to the low density, or even complete lack, of data in this transition zone between land and sea. Historically, it was challenging to generate detailed 3D maps of this zone with the available technologies. However, technologies that have emerged during the last decade, such as airborne lidar bathymetry (ALB), have enabled full-coverage, high-resolution seabed mapping in such environments (e.g. Andersen et al., 2017).

Seabed mapping in the shallow water coastal zone is paramount in relation to a wide spectrum of societal challenges, e.g. climate change adaptation with coastal protection in relation to storm surges and sea level rise, green energy transition with connection of offshore windfarms to land, nature restoration and protection for preserving or enhancing nature and biodiversity, and safety of critical infrastructure in nearshore areas.

We present examples of and experiences from national seabed mapping projects combining airborne lidar bathymetry and RGB imaging with ROV video imaging and seabed sampling for mapping seabed morphology, substrates and habitats in shallow water nearshore areas in Danish waters.

We demonstrate the potential of applying a combination of platforms (airborne, vessel borne and underwater) and instruments (optical and acoustical) in a multiscale remote sensing approach to acquire composite datasets tailored for seabed nature mapping in shallow water nearshore areas – filling in the white ribbon.

 

References

Andersen MS, Gergely A, Al-Hamdani Z, Steinbacher F, Larsen LR, Ernstsen VB (2017). Processing and performance of topobathymetric lidar data for geomorphometric and morphological classification in a high-energy tidal environment. Hydrology and Earth System Sciences, 21: 43-63, DOI: 10.5194/hess-21-43-2017.

How to cite: Ernstsen, V. B., Andersen, M. S., Hansen, L. Ø., Larsen, I. R., Nielsen, N. L. J., Blok, C. N., and Al-Hamdani, Z.: Filling in the white ribbon – Airborne lidar bathymetry and RGB imaging in combination with ROV video imaging and seabed sampling for seabed nature mapping in the coastal zone (Danish waters), EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-19869, https://doi.org/10.5194/egusphere-egu26-19869, 2026.

EGU26-20682 * | Posters on site | ESSI4.7 | Highlight

Geological and Geophysical Investigation of Grindavík, Iceland, in Response to Volcanic Activity and Fissure Movements at the Sundhnúkar Eruption Fissure 

Ögmundur Erlendsson, Magnús Á. Sigurgeirsson, Gunnlaugur M. Einarsson, Jóhann Ö. Friðsteinsson, Jón Haukur Steingrímsson, Gregory Paul De Pascale, Elisa Johanna, Catherine Rachel Gallagher, Hallgrímur Örn Arngrímsson, Steinunn Hauksdóttir, and Daniel Ben-Yehoshua

A powerful earthquake swarm, related to the accumulation of magma in a shallow reservoir beneath Svartsengi on the Reykjanes Peninsula, SW Iceland, began in October 2023. On 10 November 2023, a large dike intrusion occurred beneath the town of Grindavík, leading to the formation of a graben structure on the west side of town. Subsequently, 11 more dike intrusions have occurred along the Sundhnúkur crater row, with another graben forming on the east side of town. The maximum subsidence measured in the town is 1.5 m, and further fault movements were triggered throughout Grindavík. These events resulted in the opening of numerous fractures and caused damage to critical infrastructure. Following these events, the Icelandic Civil Protection authorities commissioned a detailed geological and geophysical investigation of the area.

A final report, alongside numerous technical memoranda, is now available, presenting the main results. One of the key outcomes of the project is a detailed fracture map of Grindavík. The map identifies seven distinct fracture zones that have been active during the ongoing unrest: Stamphólsgjá, Hópssprunga, Austurhópssprunga, Víðihlíðarsprunga, Bröttuhlíðarsprunga, Stakkavíkursprunga, and Strandhólssprunga (see: https://www.map.is/grindavik/). Stamphólsgjá is the deepest (>30 m) and widest (3 m) fracture. In addition, depths greater than 20 m were measured within fractures of the Hópssprunga and Bröttuhlíðarsprunga zones. It is important to note that Stamphólsgjá and Hópssprunga are several thousand years old, and not all of the observed widening can be attributed to the current events. Historical aerial photographs show that Stamphólsgjá was already significantly open prior to the development of the town. No evidence of Austurhópssprunga, Víðihlíðarsprunga, Bröttuhlíðarsprunga, or Stakkavíkursprunga is visible on older aerial imagery, indicating that these fractures likely formed during the ongoing events. Most fractures are typically 20–60 cm wide and 1–5 m deep, while relatively few locations exhibit fractures wider than 80 cm and deeper than 8 m. It is important to consider that substantial material has collapsed into many fractures, so often only surface depressions and subsidence are visible, indicating the presence of open fractures beneath the surface. The investigation employed various methods, including aerial photo interpretation, LiDAR elevation measurements, ground-penetrating radar (GPR), magnetic surveys, electrical resistivity measurements, and visual inspection.

Excavations carried out in connection with road repairs provided valuable opportunities to examine several meters into the bedrock and assess its composition. These observations revealed that the upper 4–10 m of the bedrock consist of four postglacial lavas, separated by sedimentary layers and soil. No deeper hyaloclastite formations from the last glacial period were observed. The youngest lava exposed at the surface is the Sundhnúkur (Sh) lava (~2200 years old). Previously known fissures in Grindavík are prominent in older lava flows (>8000 years old) but are scarcely visible in the Sh lava.

Importantly, the volcano-tectonic unrest in and around the town is ongoing, and further fracture movements may occur in the future. Existing surface fractures also continue to evolve as unconsolidated material moves within them, underscoring the importance of continued monitoring.

How to cite: Erlendsson, Ö., Sigurgeirsson, M. Á., Einarsson, G. M., Friðsteinsson, J. Ö., Steingrímsson, J. H., Pascale, G. P. D., Johanna, E., Gallagher, C. R., Arngrímsson, H. Ö., Hauksdóttir, S., and Ben-Yehoshua, D.: Geological and Geophysical Investigation of Grindavík, Iceland, in Response to Volcanic Activity and Fissure Movements at the Sundhnúkar Eruption Fissure, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-20682, https://doi.org/10.5194/egusphere-egu26-20682, 2026.

EGU26-20772 | Orals | ESSI4.7

Establishing the Austrian General Geological Legend (EAGLe) 

Esther Hintersberger and Christoph Kettler and the EAGLe-Team

In 2024, Geosphere Austria initiated the project EAGLe (Establishing the Austrian General Geological Legend) to develop a harmonized nationwide geological dataset at a scale of 1:50,000 by the end of 2026. The primary objective is the creation of a hierarchically structured general legend by standardizing and harmonizing the lithostratigraphic terms used on the different map sheets. This work is carried out by regional teams with varying starting conditions: the Quaternary and Neogene teams could rely on existing comprehensive lists, such as the general legend for Quaternary lithogenetic and geomorphological terms and the stratigraphic chart description for the Cenozoic erathem. For the regions with basement rocks at the surface (such as the Tauern Window and the Bohemian Massif), on the other hand, regional teams faced the additional challenge of establishing coherent concepts for the lithostratigraphic and lithodemic terms of their respective regions. In some cases, legend descriptions (particularly from older maps) are either ambiguous or significantly outdated, yet they represent the only available information for certain geological units. Without field surveys, these entries can only be assigned to very general geological units. A comprehensive revision and remapping of all legend descriptions is therefore not feasible at this stage; consequently, the original legend descriptions will be included in the final dataset to ensure transparency.

The database for this compilation consists of over 25,000 legend descriptions from published geological map sheets at the scale of 1:50,000, supplemented by GeoFAST maps at the same scale (maps compiled from selected archival material without additional fieldwork) and by regional maps, partly at a scale of 1:25,000. However, the corresponding vector datasets exhibit considerable heterogeneity in both geological content and data structure. In some cases (particularly for older maps), vector data are entirely absent. The second major objective is therefore to consolidate these different datasets into a unified structure and to digitize older analogue maps to close existing digital gaps. It should be noted that this initial version will not include any geometric adjustments (e.g., correction of “sheet boundary faults”).
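
A schematic sketch of the harmonisation data structure, with invented example rows and unit assignments: each original legend description is kept verbatim (for the transparency noted above) and mapped onto a standardised legend unit via a curated lookup table.

import pandas as pd

# Invented example rows; the real input is >25,000 legend descriptions.
legend = pd.DataFrame({
    "map_sheet": ["ÖK 55", "ÖK 55", "ÖK 16"],
    "original_description": ["Schlier, tonmergelig", "Schlier",
                             "Grobkörniger Granit"],
})

# Hand-curated mapping from legend wording to a harmonised unit (illustrative).
lookup = {
    "Schlier, tonmergelig": "Innviertel Group",
    "Schlier": "Innviertel Group",
    "Grobkörniger Granit": "Weinsberg Granite",
}
legend["harmonised_unit"] = legend["original_description"].map(lookup)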

The first version of the integrated dataset, incorporating the preliminary general legend as far as possible, will be published on the Tethys Data Repository (www.tethys.at) by the end of 2026 and will be made publicly accessible via the Geosphere Austria web service (www.maps.geosphere.at). An additional metadata layer will provide information on the quality of the underlying published sources.

How to cite: Hintersberger, E. and Kettler, C. and the EAGLe-Team: Establishing the Austrian General Geological Legend (EAGLe), EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-20772, https://doi.org/10.5194/egusphere-egu26-20772, 2026.

EGU26-21060 | ECS | Posters on site | ESSI4.7

A New High-Detail, Color Vision Deficiency-Friendly Geological Map of the Orientale Basin (Moon) 

Yelena Caddeo, Giacomo Nodjoumi, Piero D'Incecco, and Gaetano Di Achille

The Orientale Basin, centered at ~19°S, ~93°W, is one of the most distinctive features on the surface of the Moon. Composed of three concentric rings, the largest of which is between 930 and 950 km in diameter, this multi-ring basin is one of the youngest large impact basins on the Moon (Orientale is estimated to date back ~3.81 Ga) and one of the best-preserved large basins in the entire Solar System. Its central depression hosts a relatively thin infilling of dark, smooth material interpreted as mare basalt, whilst outside the outermost ring an ejecta blanket drapes the surrounding topography, in places reaching over 1,400 km from the center of the basin. Over the years, the importance of the Orientale Basin has led to the creation of several geological maps at various scales, none of which, however, exceeds a scale of 1:200,000. Additionally, these maps have never attempted to combine the two main methodological approaches adopted internationally to date for global-scale mapping of the Moon. Our work aims to bridge this gap by presenting a new medium-to-large-scale (1:118,000) geological map of both the inner and outer facies, combining a traditional planetary geological scheme with a more morphometric criterion.

The map was created with the latest long-term stable release of QGIS (v. 3.40), mainly using the 59 m/px Lunar Orbiter Laser Altimeter (LOLA)-Kaguya shaded relief and the 59 m/px LOLA-Kaguya DEM. These two datasets, which only cover latitudes within ±60°, were used to distinguish the different units and subunits based on their general morphology, textures, and locations, as well as to identify the structures. The 100 m/px grayscale mosaic of the Lunar Reconnaissance Orbiter Wide Angle Camera (LROC-WAC) and the 118 m/px LOLA elevation model were additionally used to make up for the missing portion of the LOLA-Kaguya datasets. The Clementine UVVIS colored mosaic (200 m/px) and the Kaguya mineral abundance mosaics (wt% of Ol, Cpx, Opx, Pl, FeO) also made it possible to add a layer of information on compositional differences between apparently uniform features and terrains.

We identified over 20 units and sub-units, which we grouped according to the terrain or morphological feature they relate to (e.g. crater, mare, …), and over 10 classes of structures. Our final product represents the highest-resolution map available for the Orientale Basin and, when compared with existing medium-scale maps, depicts its complexity in more detail and with greater accuracy. Additionally, we used a color vision deficiency-friendly color scheme to make the map more accessible to the part of the population with limited sensitivity to colors.
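
To illustrate the accessibility point, the snippet below assigns a commonly used CVD-safe palette (the Okabe-Ito colours) to hypothetical map units; the actual scheme chosen for the Orientale map is not reproduced here.

# Okabe-Ito colours: a widely used colour-vision-deficiency-safe palette.
OKABE_ITO = ["#E69F00", "#56B4E9", "#009E73", "#F0E442",
             "#0072B2", "#D55E00", "#CC79A7", "#999999"]

units = ["mare infill", "inner ring massifs", "ejecta facies", "crater materials"]
unit_colours = {unit: OKABE_ITO[i % len(OKABE_ITO)]
                for i, unit in enumerate(units)}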

How to cite: Caddeo, Y., Nodjoumi, G., D'Incecco, P., and Di Achille, G.: A New High-Detail, Color Vision Deficiency-Friendly Geological Map of the Orientale Basin (Moon), EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-21060, https://doi.org/10.5194/egusphere-egu26-21060, 2026.

This work seeks to encourage reflection and discussion on the ability and suitability of traditional classified geological maps to represent the full complexities of geology in the wild, and to consider why this is important to think about in order to serve 21st century geological mapping purposes.

The key components of traditional classified geological maps are boundary lines (in 2D), and boundary surfaces (in 3D); both of which must be ‘closed’ to form polygons or volumes representing the various classes of the map. These lines, polygons, surfaces, and volumes carry geological meaning, but what exactly?

The boundary lines that we traditionally construct geological maps from represent changes in geology, such that the geological properties on one side of the line should be different from the geological properties on the other. But, at any point along a drawn boundary line, which geological properties are changing, and by how much, and how sharply is this change occurring?

The line-based construction of the traditional classified geological map gives a restrictive view of geology. A line gives an on/off binary indication of a change in geological properties. Are we to believe that the change in geological properties is equal at all points along the perimeter of any geological polygon? Logically, the magnitude of change in geological properties (perhaps assume the sum of magnitudes of change for all properties, but it could also be for an individual property) must have a maximum somewhere along the perimeter of the polygon – perhaps this is the point that is most deserving of being represented by the line, but does the entire perimeter deserve to be represented by that line?

The use of a line to indicate a boundary also implies infinite sharpness; that the change in geological properties is instantaneous on crossing the line. Whilst this may be appropriate for faults and unconformities, lines leave us unable to fairly represent the many gradational processes that are inherent to the geological system, examples of which include partial melting, fractional crystallisation, gradational sediment deposition and diagenesis.

So where do these limitations of traditional line-and-polygon based geological mapping leave us? Representing geology in its true complexity requires mapping the individual geological properties themselves through space rather than only delineating where they significantly and collectively change. If we map the geological properties as a collection of scalar fields (as in implicit geological modelling), then all changes – big and small – for all properties are revealed in the magnitude of their gradients. Correspondingly, it appears that traditional hand-drawn geological maps attempt to approximate the sum of the magnitude of the gradients of commonly considered geological properties (age, composition, texture), albeit with a thresholded presentation owing to the line-based approach (the line doesn’t have an intensity, it either is or is not) and some necessary inconsistencies to enable polygon closure. When we consider these points, going beyond the traditional classified geological map seems crucial for progressing the completeness of our geological knowledge in the 21st century.
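
The scalar-field idea sketched above can be made concrete in a few lines. The two synthetic property fields below stand in for mapped geological properties; the summed gradient magnitude marks where a traditional map would draw its lines, while keeping gradational changes visible.

import numpy as np

ny, nx = 200, 200
y, x = np.mgrid[0:ny, 0:nx].astype(float)

age = np.tanh((x - 100.0) / 2.0)        # sharp change: fault/unconformity-like
composition = y / ny                    # gradual change: gradation-like

total_change = np.zeros((ny, nx))
for field in (age, composition):
    gy, gx = np.gradient(field)
    total_change += np.hypot(gx, gy)    # gradient magnitude per property

# A traditional boundary line corresponds to thresholding `total_change`;
# the unthresholded field retains both sharp and gradational changes.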

How to cite: Kirkwood, C.: Geological boundary dispute: reflecting on the ability of the traditional classified geological map to fully represent geology, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-21401, https://doi.org/10.5194/egusphere-egu26-21401, 2026.

EGU26-21538 | Orals | ESSI4.7

Technical studies for offshore energy potential, geological and environmental mapping towards support of windfarm developers' decisions 

Pedro Brito, Fátima Abrantes, Catarina Aires, Jaime Almeida, Luís Batista, Rúben Borges, Pedro Costa, Teresa Drago, Marta Neres, Vítor Magalhães, João Noiva, Dulce Oliveira, Ângela Pereira, Carlos Ribeiro, Marcos Rosa, Emília Salgueiro, Alexandra Silva, Liliana Trindade, Vasco Valadares, and Pedro Terrinha and the PRR-RP-C21-i07.01 Team

Within the framework of Portuguese policy for the energy transition and economy decarbonisation, the Portuguese Institute for the Sea and the Atmosphere (IPMA) is carrying out project RP-C21-i07.01 – Technical studies for offshore energy potential. This project, funded with 42 M€ by the European Recovery and Resilience Plan, through the component C21-REPOWEREU of the Climate Transition dimension, aims to support Portugal’s ambitions regarding energy independence and ecological transition, in the context of new geopolitical and energy market challenges.

Led by the Marine Geology and Geophysics Laboratory (SEISLAB) team at IPMA, the project is developing studies to provide detailed data on the geological, geophysical and geotechnical properties of the seafloor, as well as to define an environmental baseline. The main objective is to support offshore wind farm developers in engineering and financial planning, thereby providing the basis for launching auctions in offshore areas designated for windfarm development in the Portuguese Allocation Plan for Offshore Renewable Energy (PAER).

This project started in early 2024, has a duration of 2.5 years and focuses on surveys in the PAER areas of Leixões and Figueira da Foz, totalling approximately 2,000 km², located offshore of the western Portuguese mainland coast at water depths ranging from 120 m to 530 m.

Hydrographic and geophysical survey methodologies included multibeam echosounder (MBES), side scan sonar (SSS), magnetometer (MAG), two sub-bottom profilers (SBP) and multichannel ultra-high-resolution seismic (UHRS) reflection data. Geotechnical methodologies included cone penetration tests (CPT) and analyses of the sedimentological and physical properties of sediments collected with grabs and vibrocorers (VC).

Preliminary work conducted in 2024 included desktop studies and exploratory surveys, with the acquisition of approximately 2,000 km of geophysical data (MBES, SBP, UHRS). Survey activities carried out in 2025 involved the acquisition of circa 15,000 km of geophysical data (MBES, SSS, MAG, SBP, UHRS), 122 grab samples, 71 VCs and 43 CPTs.

Seafloor surface characterisation relied on cartographic products derived from the MBES and SSS mosaic datasets, as well as on the identification of outcropping units from the seismo-stratigraphic model calibrated with the geotechnical data. Seafloor features, including landforms and contacts, were interpreted from the MBES and SSS data and validated against magnetic anomalies. These included anthropogenic features such as shipwrecks, trawl marks and lost objects (e.g. fishing gear), and geological features such as sorted bedforms, boulders, sinkholes and outcrops.

Sub-seafloor seismic data reveal a complex geological framework associated with the rifted margin and orogenic units. The upper units are dominated by unconsolidated sediments and polyphase channel-complex events associated with sea level variations, while the lower units frequently display mass-transport deposits extending for tens of kilometres, together with tectonic deformation and faulting.

Environmental analyses are based on water and sediment analytical work and on the characterisation of species communities, aiming to establish the biodiversity baseline and assess the environmental condition. Surveys were conducted in compliance with the Joint Nature Conservation Committee guidelines.

The thematic cartography resulting from these pioneering and unprecedented studies in Portugal constitutes a key asset for the development of the floating offshore wind industry, supporting the ongoing Portuguese energy transition.

How to cite: Brito, P., Abrantes, F., Aires, C., Almeida, J., Batista, L., Borges, R., Costa, P., Drago, T., Neres, M., Magalhães, V., Noiva, J., Oliveira, D., Pereira, Â., Ribeiro, C., Rosa, M., Salgueiro, E., Silva, A., Trindade, L., Valadares, V., and Terrinha, P. and the PRR-RP-C21-i07.01 Team: Technical studies for offshore energy potential, geological and environmental mapping towards support of windfarm developers' decisions, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-21538, https://doi.org/10.5194/egusphere-egu26-21538, 2026.

At the Federal Institute for Geosciences and Natural Resources we develop a wide variety of 3D models of the subsurface. These models range from basin-wide structural models to small-scale models of an artificial fracture.

In many cases it is important to present these 3D models to stakeholders or the general public. One big challenge lies in the fact that many of the viewers are not geology professionals. These complex 3D models therefore have to be presented in a way that non-professionals can easily access and understand.

Visualizing data and models in real 3D is not only very helpful in communicating our models to the general public; it can also be very helpful during the creation of the 3D model itself. Especially in very complex models, parts of the model may obstruct the view of other parts. Seeing the model in real 3D gives the modeler a better and more intuitive impression of complex structures in the subsurface.

Experience has shown that there is no single best way of visualizing 3D data. On the contrary, the 3D visualization has to be chosen and adapted not only for every model, but also for every target audience.

We present several methods of 3D visualization, ranging from 3D projectors and virtual reality, through gamification (transferring 3D models into computer games), to the use of 3D printers. For each method we present an application and evaluate its main advantages and disadvantages.

How to cite: Steuer, S.: Look at it! – Visualizing 3D geological data in real 3D, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-21613, https://doi.org/10.5194/egusphere-egu26-21613, 2026.

EGU26-4 | ECS | Posters on site | HS6.5

Advanced phycocyanin detection in a South American lake using Landsat imagery and remote sensing 

Lien Rodríguez-López, David Bustos Usta, Lisandra Bravo Alvarez, Iongel Duran Llacer, Luc Bourrel, Frederic Frappart, and Roberto Urrutia

In this study, multispectral images were used to detect toxic blooms in Lake Villarrica, Chile, using a time series of water quality data from 1989 to 2024 and spectral information extracted from Landsat 8 and 9 satellite imagery. To explore the predictive capacity of these variables, we constructed 255 multiple linear regression models using different combinations of spectral bands and indices as independent variables, with phycocyanin concentration as the dependent variable. The most effective model, selected through a stepwise regression procedure, incorporated seven statistically significant predictors (p < 0.05) and took the following form: FCA = N/G + NDVI + B + GNDVI + EVI + SABI + CCI. This model achieved a strong fit to the validation data, with an R² of 0.85 and an RMSE of 0.10 μg/L, indicating high explanatory power and relatively low error in phycocyanin estimation. When applied to the complete weekly time series of satellite observations, the model successfully captured both seasonal dynamics and interannual variability in phycocyanin concentrations (R² = 0.92; RMSE = 0.05 μg/L). These results demonstrate the model's robustness and practical utility for long-term monitoring of harmful algal blooms in Lake Villarrica.
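
As an illustration of the reported model form, the sketch below fits an ordinary least-squares regression of phycocyanin on the seven named predictors; the data frame is synthetic, standing in for the per-scene spectral features, and the column names are placeholders.

import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(0)
predictors = ["N_over_G", "NDVI", "B", "GNDVI", "EVI", "SABI", "CCI"]

# Synthetic stand-in for the satellite-derived feature table.
df = pd.DataFrame(rng.normal(size=(60, len(predictors))), columns=predictors)
df["FCA"] = df[predictors].sum(axis=1) * 0.1 + rng.normal(scale=0.1, size=60)

X = sm.add_constant(df[predictors])
fit = sm.OLS(df["FCA"], X).fit()
print(fit.rsquared)       # fit quality
print(fit.pvalues)        # basis for a stepwise significance screen (p < 0.05)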

How to cite: Rodríguez-López, L., Bustos Usta, D., Bravo Alvarez, L., Duran Llacer, I., Bourrel, L., Frappart, F., and Urrutia, R.: Advanced phycocyanin detection in a South American lake using Landsat imagery and remote sensing, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-4, https://doi.org/10.5194/egusphere-egu26-4, 2026.

EGU26-125 | ECS | Orals | HS6.5

Flood Dynamics and Frequency Mapping in the Lower Ganges Floodplain in India Using Multi-Temporal Sentinel-1 SAR Observations (2016–2024) 

Mohammad Sajid, Haris Hasan Khan, Arina Khan, and Abdul Ahad Ansari

The Ganges floodplains are among the most flood-prone regions in India, where recurrent inundations cause significant socio-economic and ecological impacts. Understanding the spatial distribution, frequency, and dynamics of flooding is essential for effective floodplain management and enhancing climate resilience. This study examines the flood frequency and spatial extent across a section of the Ganga River floodplains in Bihar, utilising multi-temporal Sentinel-1 Synthetic Aperture Radar (SAR) data spanning the period from 2016 to 2024. Flooded areas were delineated through an optimal threshold-based classification of VH-polarised backscatter images, with threshold values ranging from -19.5 dB to -22.3 dB. Annual flood extents were mapped, and an inundation frequency composite was generated to identify zones experiencing recurrent flooding. The spatial analysis revealed substantial variability in flood occurrence, with extensive inundation observed in low-lying regions. Several areas were inundated in more than 60% of the study years, indicating chronic flood exposure. The decadal analysis revealed that August and September were the peak months for flooding, with some areas remaining inundated for more than one month, which had an adverse impact on both human settlements and agricultural lands. Validation using optical satellite imagery from Sentinel-2 confirmed a 98% accuracy in the SAR-derived flood extent, reinforcing the reliability of the classification method. The temporal flood frequency analysis provides crucial insights into long-term flood dynamics and helps identify hydrologically sensitive zones. Overall, this study highlights the effectiveness of SAR-based monitoring in understanding floodplain behaviour under changing climatic and hydrological conditions, and supports improved flood hazard mapping, hydrodynamic model calibration, and sustainable flood risk management in the Ganges Basin and other monsoon-affected regions.
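
A minimal sketch of the thresholding step described above, with synthetic arrays in place of calibrated Sentinel-1 VH composites; the -20 dB default is illustrative of the reported -19.5 to -22.3 dB range.

import numpy as np

def flood_mask(vh_db, threshold_db=-20.0):
    # Smooth open water scatters specularly, so VH backscatter drops below
    # a scene-optimal threshold; pixels under it are classified as water.
    return vh_db < threshold_db

# One synthetic VH composite per year, standing in for the 2016-2024 scenes.
rng = np.random.default_rng(0)
annual_scenes = rng.normal(-17.0, 4.0, size=(9, 100, 100))

masks = np.array([flood_mask(scene) for scene in annual_scenes])
frequency = masks.mean(axis=0)     # fraction of years each pixel was flooded
chronic = frequency > 0.6          # zones inundated in >60% of study years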

Keywords: Flood Inundation, Multi-Temporal, Time-Series, Flood Frequency, Sentinel-1 SAR, Ganges River

How to cite: Sajid, M., Hasan Khan, H., Khan, A., and Ansari, A. A.: Flood Dynamics and Frequency Mapping in the Lower Ganges Floodplain in India Using Multi-Temporal Sentinel-1 SAR Observations (2016–2024), EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-125, https://doi.org/10.5194/egusphere-egu26-125, 2026.

Wetlands are highly sensitive hydrological ecosystems that are essential for groundwater recharge, flood control, and biodiversity. Climate variability, altered river regimes, and unsustainable anthropogenic pressures are all posing new challenges to their stability. The current work evaluates the two-decade hydro-climatic dynamics of the Haiderpur Wetland (Ganga River, India) by merging optical (Landsat), radar (Sentinel-1), and gridded climate (ERA5, CHIRPS) datasets with GRACE-based groundwater anomalies. Time-series Landsat (NDVI, NDWI, LST) and Sentinel-1 (SAR) data were processed on Google Earth Engine (GEE) to monitor all-weather surface inundation and vegetation structure. To disentangle climatic and anthropogenic drivers, these remote sensing products are statistically correlated against ERA5-Land (evapotranspiration) and CHIRPS (precipitation) data, alongside GRACE groundwater anomalies. The findings demonstrated a considerable downward trend in pre-monsoon NDWI and wetland water distribution, accompanied by a significant increase in LST and an unexpected increase in NDVI. All-weather Sentinel-1 data confirmed the drying trend. 'Greening' (as indicated by NDVI) in a drying environment, on the other hand, suggests a structural shift from native wetland vegetation to more drought-tolerant or invasive terrestrial plants. The study assesses the capability of a multifaceted (optical-radar-climate) GEE strategy to quantify the individual contributions of climatic and anthropogenic factors while also monitoring wetland development. Furthermore, these findings quantify the hydro-ecological vulnerability of major Ramsar wetlands and emphasize the vital need for coordinated water management to sustain ecosystems in the Ganga River Basin, with far-reaching implications for global wetland conservation.
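
For readers unfamiliar with the GEE workflow, a hedged sketch of the Landsat NDWI time-series step is given below (Python API); the location, date range and the omission of surface-reflectance scaling are simplifications, not the study's exact configuration.

import ee
ee.Initialize()

aoi = ee.Geometry.Point([78.0, 29.4]).buffer(5000)   # placeholder, not the AOI

def add_ndwi(img):
    # NDWI from Landsat 8 green (SR_B3) and NIR (SR_B5) bands; computed on
    # unscaled SR values here for brevity.
    return img.addBands(img.normalizedDifference(["SR_B3", "SR_B5"]).rename("NDWI"))

ndwi_series = (ee.ImageCollection("LANDSAT/LC08/C02/T1_L2")
               .filterBounds(aoi)
               .filterDate("2013-04-01", "2024-12-31")
               .map(add_ndwi)
               .select("NDWI"))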

Keywords: Hydrology, GRACE, Climate Change, SAR, NDVI, NDWI, LST

How to cite: Ansari, A. A., Hasan Khan, H., Khan, A., and Sajid, M.: Hydro-Ecological Vulnerability of  Ganga River Wetland (India): A Multi-Sensor Remote Sensing and GRACE-based Assessment of the Haiderpur Ramsar Site, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-147, https://doi.org/10.5194/egusphere-egu26-147, 2026.

Floods are among the costliest and most frequently occurring natural disasters. One of the key factors in preventing and reducing losses is the provision of reliable flood maps. However, the uncertainty associated with either the flood inundation model or the data, specifically the Digital Elevation Model (DEM), may adversely affect the reliability of flood stage and inundation maps. A systematic understanding of this uncertainty is therefore necessary. In this study, an attempt is made to assess whether the results are more susceptible to uncertainties in the model or in the data itself. To do so, SCIFRIM (Slope-corrected, Calibration-free, Iterative Flood Routing and Inundation Model) is employed with a suite of DEM datasets to reconstruct the October 2024 Valencia flood event. The modelled flood extents were validated against those derived from multi-sensor remote sensing data. The Critical Success Index (CSI) was calculated to assess the agreement between observed and modelled flood extents, yielding values of 0.49 and 0.59 for October 30th and 31st, respectively, when combining SCIFRIM with a lidar DEM. Additionally, a multi-model comparison was performed between SCIFRIM and CaMa-Flood (Catchment-based Macro-scale Floodplain), HEC-RAS (Hydrologic Engineering Center's River Analysis System), and TUFLOW (Two-dimensional Unsteady FLOW), demonstrating its competitiveness in terms of outputs (flood extent and stage) and model runtime. The findings demonstrate that the proposed modeling framework offers a reliable approach for flood assessment, with great potential to support rapid assessment and decision-making in data-scarce regions.
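
The Critical Success Index used above can be written directly from binary flood masks; the sketch below assumes two boolean arrays of equal shape.

import numpy as np

def critical_success_index(observed, modelled):
    obs = np.asarray(observed, dtype=bool)
    mod = np.asarray(modelled, dtype=bool)
    hits = np.sum(obs & mod)               # flooded in both
    false_alarms = np.sum(~obs & mod)      # modelled only
    misses = np.sum(obs & ~mod)            # observed only
    return hits / (hits + false_alarms + misses)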

How to cite: Tripathi, G., Sarkar, E., and Biswal, B.: Evaluating Slope-corrected, Calibration-free, Iterative Flood Routing and Inundation Model (SCIFRIM)-based Flood Inundation against multi-satellite observation, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-436, https://doi.org/10.5194/egusphere-egu26-436, 2026.

Floods are highly dynamic hazards whose spatial extent can change rapidly within hours. Timely and accurate monitoring is essential for early warning, emergency response, and post-disaster assessment. A major challenge in current Earth Observation (EO) based approaches is the difficulty of capturing the complete evolution of a flood event, including its maximum flood extent. This information is often missing due to temporal gaps in Synthetic Aperture Radar (SAR) acquisitions and cloud cover in optical imagery. Missing the peak extent limits the accuracy of impact assessments and poses challenges for applications such as parametric insurance, which depend on reliable measurements of flood magnitude. Although daily flood products exist, they are often based on large-scale multi-spectral sensors and struggle during persistent cloud cover, as well as with resolution for smaller events, creating an urgent need for a more reliable method of daily flood estimation from higher-resolution SAR datasets. To address these challenges, we propose a novel deep learning framework that fuses coarse, dynamic EO-based hydrometeorological data with static geospatial datasets to produce high-resolution daily flood extent maps. Our approach integrates static flood conditioning inputs, including elevation, Height Above Nearest Drainage, Urban Development Area, flow direction, Normalized Difference Vegetation Index, Normalized Difference Built-up Index, soil clay and sand content, and pre-flood SAR and multispectral imagery, with dynamic hydrometeorological variables such as daily precipitation and soil moisture. The model adopts a multi-stage vision transformer architecture: encoders extract multi-level latent representations from all inputs, which are then fused using cosine similarity, normalization, and temporal attention mechanisms. A decoder reconstructs the high-resolution flood extent, followed by a Gaussian filter to reduce high-frequency noise. The framework is fully supervised using the globally available KuroSiwo flood mask dataset, ensuring transferability across diverse geographic regions and climate zones. In addition, this research provides a complete data preparation workflow that converts flood mask shapefiles into standardized image patch datasets directly suitable for deep learning training, including a modular input selection interface that removes dependence on inputs available only in specific datasets, enabling straightforward implementation and practical applicability. The model is trained and evaluated across three distinct climate zones on multiple continents, demonstrating a robust capability to overcome the temporal limitations of SAR data and cloud-induced gaps in optical observations. Held-out region tests, with strict geographic separation to minimize data leakage induced by spatial autocorrelation, further ensure unbiased evaluation and true transferability. Preliminary tests across multiple continents yield stable performance, with cross-site metric variations remaining within approximately 5-7 percent. This study introduces the first deep learning framework for daily fine-scale flood extent mapping using purely EO data that are globally accessible, providing a scalable and transferable solution for real-time flood monitoring, disaster management, and potential applications in parametric insurance by improving flood mapping cadence and reliably estimating maximum flood extents.
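
A toy PyTorch sketch of the fusion idea described above: cosine-similarity gating of dynamic features by static ones, a 1x1-convolution stand-in for the decoder, and a smoothing stand-in for the Gaussian filter. It illustrates the mechanism only and is not the DeepFuse2.0 architecture.

import torch
import torch.nn.functional as F

static_feat = torch.randn(1, 64, 32, 32)    # e.g. terrain/land-cover encoder
dynamic_feat = torch.randn(1, 64, 32, 32)   # e.g. rain/soil-moisture encoder

# Per-pixel cosine similarity over channels gates the dynamic contribution.
sim = F.cosine_similarity(static_feat, dynamic_feat, dim=1).unsqueeze(1)
fused = static_feat + sim * dynamic_feat

decoder = torch.nn.Conv2d(64, 1, kernel_size=1)      # stand-in "decoder"
smooth = torch.nn.AvgPool2d(3, stride=1, padding=1)  # stand-in for Gaussian blur
flood_prob = torch.sigmoid(smooth(decoder(fused)))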

Keywords: spatio-temporal fusion, vision transformer, high-resolution flood mapping

How to cite: Surojaya, A., Kumar, R., and Dasgupta, A.: DeepFuse2.0: Novel Deep Learning-based Fusion of Satellite-based Hydroclimatic Data and Flood Conditioning Factors for Daily Flood Extent Mapping, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-1047, https://doi.org/10.5194/egusphere-egu26-1047, 2026.

EGU26-1092 | ECS | Posters on site | HS6.5

Cross-Biome Transferability of SAR-based Flood Mapping with Random Forests 

Paul Christian Hosch and Antara Dasgupta

Fully automated, globally applicable flood-mapping systems must earn user trust, which in turn requires systematic testing across diverse environmental conditions to establish performance stability and a clear understanding of model transferability. While some recent studies have evaluated the cross-site performance of flood mapping algorithms, the cross-biome transferability of Random Forest (RF) models for SAR-based flood delineation has not yet been thoroughly evaluated. In this study, we assess how well RF classifiers trained for binary flood detection generalize across biomes using primarily Synthetic Aperture Radar (SAR) data. Our feature stack comprises 14 variables, including 9 SAR-derived features (Sentinel-1 VV and VH backscatter and associated temporal-change metrics), which provide information on flood-induced land surface changes, and 4 contextual predictors such as land cover and topographic indices, which influence radar backscatter and help mitigate uncertainties. Experiments were conducted across 18 flood events distributed equally amongst 6 distinct biomes: (1) Deserts and Xeric Shrublands, (2) Tropical and Subtropical Moist Broadleaf Forests, (3) Temperate Broadleaf and Mixed Forests, (4) Temperate Coniferous Forests, (5) Mediterranean Forests, Woodlands and Scrub, and (6) Temperate Grasslands, Savannas and Shrublands. Model transferability is evaluated using a two-level nested cross-validation approach. First, intra-biome performance is established through an inner 3-fold Leave-One-Group-Out Cross-Validation (LOGO-CV), in which models are trained on all but one site within a biome and evaluated on the held-out site iteratively. Second, inter-biome transferability is quantified using an outer 6-fold LOGO-CV, treating each biome as a distinct group. In this setup, models are trained on all biomes except one and evaluated on all sites of the held-out biome. Classification performance is assessed using Overall Accuracy (OA), F1-score, Precision, Recall, and Intersection over Union (IoU), with all experiments repeated across 10 independent iterations to capture model structural and sampling variability.
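
The outer leave-one-biome-out loop can be expressed compactly with scikit-learn; the feature matrix, labels and biome codes below are synthetic placeholders for the 14-feature stack and 6 biome groups.

import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import f1_score
from sklearn.model_selection import LeaveOneGroupOut

rng = np.random.default_rng(0)
X = rng.normal(size=(600, 14))          # 14 features, as in the study
y = rng.integers(0, 2, size=600)        # flood / non-flood labels
biome = rng.integers(0, 6, size=600)    # biome membership per sample

for train, test in LeaveOneGroupOut().split(X, y, groups=biome):
    rf = RandomForestClassifier(n_estimators=200, random_state=0)
    rf.fit(X[train], y[train])
    print(f"held-out biome {biome[test][0]}: "
          f"F1 = {f1_score(y[test], rf.predict(X[test])):.2f}")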

Preliminary results on select biomes show substantial variation in inter-biome transferability. Notably, in some cases, models transferred between biomes outperform those trained within the same biome. These findings highlight the need for comprehensive biome-level transferability assessments to better understand the capabilities and limitations of RF-based flood mapping under globally diverse conditions, ultimately supporting more transparent and trustworthy flood-mapping products for end users.

How to cite: Hosch, P. C. and Dasgupta, A.: Cross-Biome Transferability of SAR-based Flood Mapping with Random Forests, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-1092, https://doi.org/10.5194/egusphere-egu26-1092, 2026.

EGU26-1266 | ECS | Posters on site | HS6.5

Cross-Biome Feature Importance Stability Analysis for SAR-based Flood Mapping with Random Forests 

Parisa Havakhor, Paul Hosch, and Antara Dasgupta

Flood mapping using machine learning methods such as Random Forests (RF) requires informed feature engineering and selection. Although feature-importance rankings vary substantially across biomes and land covers, the stability of these rankings has not been evaluated specifically for RF-based flood delineation. In this study, we investigate the consistency of RF feature-importance rankings in a binary flood-classification task based primarily on Synthetic Aperture Radar (SAR) imagery. The feature stack comprises 14 variables, including 9 SAR-based features (Sentinel-1 VV and VH polarizations and their temporal-change metrics), which inform flood extent identification, and 4 contextual features such as land cover and topographic indices, which provide information on backscatter uncertainties. The classification task was conducted across 18 flood events spanning six distinct biomes: (1) Deserts and Xeric Shrublands, (2) Tropical and Subtropical Moist Broadleaf Forests, (3) Temperate Broadleaf and Mixed Forests, (4) Temperate Coniferous Forests, (5) Mediterranean Forests, Woodlands and Scrub, and (6) Temperate Grasslands, Savannas and Shrublands. Three feature-attribution methods were evaluated: (1) Shapley Additive exPlanations (SHAP), which provides a game-theoretic framework for feature attribution and is widely recognized for its consistency and interpretability; (2) Mean Decrease in Impurity (MDI), computed during tree growth, which is the most commonly used importance metric for RF models; and (3) permutation feature importance (MDA), a model-agnostic approach that assesses importance by measuring the reduction in model accuracy when feature values are randomly shuffled. Both feature cardinality and feature correlation, which bias the feature rankings of these algorithms in different ways, were considered during interpretation. All experiments were repeated across 10 independent iterations to account for random variability. We first examined feature-importance rankings independently across the three sub-sample studies within each biome to establish baseline intra-biome variability, followed by quantification of inter-biome variability to assess whether feature-importance patterns transfer across different environmental conditions. Preliminary results across select biomes indicate stable rankings for SAR-based features, with VV and VH event polarizations dominating the decision boundary, while contextual descriptors, particularly terrain indices such as Height Above the Nearest Drainage, exhibit greater variability both within and between biomes. Understanding the transferability of feature-importance patterns and feature stacks across biomes is critical for developing an RF-based flood-mapping pipeline that operates reliably under diverse environmental conditions worldwide and ultimately builds user trust in the resulting products.
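
Two of the three attribution routes (MDI and permutation importance) come directly from scikit-learn, as sketched below with synthetic data; SHAP values would be computed analogously with the separate shap package.

import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance

rng = np.random.default_rng(1)
X = rng.normal(size=(500, 14))
y = (X[:, 0] + X[:, 1] > 0).astype(int)   # synthetic flood labels

rf = RandomForestClassifier(n_estimators=200, random_state=1).fit(X, y)
mdi = rf.feature_importances_             # Mean Decrease in Impurity
perm = permutation_importance(rf, X, y, n_repeats=10, random_state=1)
mda = perm.importances_mean               # permutation (accuracy-drop) importance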

How to cite: Havakhor, P., Hosch, P., and Dasgupta, A.: Cross-Biome Feature Importance Stability Analysis for SAR-based Flood Mapping with Random Forests, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-1266, https://doi.org/10.5194/egusphere-egu26-1266, 2026.

EGU26-1859 | ECS | Posters on site | HS6.5

Detecting Waterlogging in Agricultural Fields in Denmark using High-Resolution PlanetScope Time Series 

Jasper Kleinsmann, Julian Koch, Stéphanie Horion, Gyula Mate Kovacs, and Simon Stisen

Waterlogging in agricultural fields, the temporary inundation of areas driven by extreme rainfall, rising groundwater or poor drainage, has been identified as a major issue by Danish farmers. During the inundation period, plants are deprived of oxygen, which impairs root development and leads to decreased yields and grain quality. Additionally, these waterlogged areas are a large source of greenhouse gas (GHG) emissions. The issue is expected to be exacerbated under current climate projections through wetter winters and rising groundwater levels in Denmark. Hence, an increased understanding of the spatio-temporal dynamics of waterlogging is required to future-proof management strategies. The research goals are three-fold: (1) to optimise the detection of waterlogging, (2) to reveal inter- and intra-annual patterns across Denmark, and (3) to investigate drivers of waterlogging such as climate, topography and bio-physical conditions. We aim to detect waterlogged areas through a deep learning semantic segmentation approach utilising multi-temporal PlanetScope imagery and nationwide high-resolution elevation data. This approach requires a manually delineated reference dataset to train, validate and test the model, which needs to be well balanced spatially (e.g. covering various soil types) and temporally (e.g. including various illumination conditions). Additionally, we will experiment with various model architectures, backbones and covariate combinations to optimise segmentation performance. Initial tests using a UNET architecture, building upon a published reference dataset by Elberling et al. (2023), show promising results and lay the foundation for the upcoming model development and extension of the existing reference data.

 

Elberling, B. B., Kovacs, G. M., Hansen, H. F. E., Fensholt, R., Ambus, P., Tong, X., ... & Oehmcke, S. (2023). High nitrous oxide emissions from temporary flooded depressions within croplands. Communications Earth & Environment, 4(1), 463.

 

How to cite: Kleinsmann, J., Koch, J., Horion, S., Kovacs, G. M., and Stisen, S.: Detecting Waterlogging in Agricultural Fields in Denmark using High-Resolution PlanetScope Time Series, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-1859, https://doi.org/10.5194/egusphere-egu26-1859, 2026.

EGU26-2995 | ECS | Orals | HS6.5

SaferSat: The Saferplaces’s  Operational Sentinel-1 Toolbox for Multi-Temporal Flood Extent Mapping, Water-Depth Estimation and Impact Assessment  

Saeid DaliriSusefi, Paolo Mazzoli, Valerio Luzzi, Francesca Renzi, Tommaso Redaelli, Marco Renzi, and Stefano Bagli

Operational flood intelligence for emergency response and insurance, providing a rapid overview of impacted land, population, and economic damages, requires mapping solutions that remain reliable under adverse observational conditions and across diverse landscapes. Although Sentinel-1 SAR provides consistent global, all-weather and day-and-night coverage, automated flood extraction is challenged by speckle noise, land-cover heterogeneity, and confusion between floodwater and permanent low-backscatter surfaces. These limitations highlight the need for approaches that exploit temporal backscatter changes while maintaining global robustness and computational efficiency.

We present SaferSat, a fully automated Sentinel-1 toolbox for flood-extent mapping, water-depth estimation, and impact assessment. SaferSat is part of SaferPlaces (saferplaces.co), a global Digital Twin platform for flood risk intelligence supporting emergency response and insurance applications. Central to the framework is Pr-RWU-Net (Progressive Residual Wave U-Net), a lightweight deep-learning model with 2.6 million trainable parameters, designed to detect flood-induced backscatter changes using VV-polarized SAR imagery. The model uses a three-channel input: pre-event VV, post-event VV, and their radiometric difference, enhancing inundation sensitivity while mitigating VH instability for global deployment.
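
The three-channel input construction is simple to state explicitly; below, synthetic arrays stand in for co-registered pre- and post-event VV tiles in dB, and the channel order is an assumption for illustration.

import numpy as np

rng = np.random.default_rng(2)
pre_vv = rng.normal(-12.0, 3.0, size=(256, 256))          # pre-event VV (dB)
post_vv = pre_vv + rng.normal(0.0, 1.0, size=(256, 256))  # post-event VV (dB)

# Assumed channel order: pre-event, post-event, radiometric difference.
x = np.stack([pre_vv, post_vv, post_vv - pre_vv], axis=0)  # shape (3, H, W)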

SaferSat provides end-to-end processing: automated data retrieval, multi-date flood inference, and Maximum Flood Extent generation. To reduce SAR ambiguities, it generates auxiliary layers: a vegetation mask for SAR "blind spots" and a low-backscatter anomaly mask for permanent dark features. Flood extent layers are integrated with the FLEXTH model and GLO-30 or local high-resolution LiDAR DTMs for water-depth reconstruction. The system also analyzes acquisition patterns to predict short-term revisit opportunities. Impact assessment intersects flood extents with JRC GHS-POP and ESA WorldCover datasets.

The Pr-RWU-Net model was trained on the S1GFloods dataset, containing 5,360 paired pre- and post-event Sentinel-1 GRD images across 42 flood events from 2016–2022. Binary flood masks were generated via semi-automated thresholding and expert quality control. Evaluation on the test split achieved an IoU of 90.0%, F1-score 94.6%, Recall 95.6%, Precision 93.8%, and overall accuracy 96.6%.

Operational applicability was demonstrated on three 2025 flood events: Romania, Pakistan, and France. SaferSat flood extents closely matched manually derived SAR flood references (IoU 89–92%) and CEMS products (IoU 85–88%). Water-depth estimation against a reference hydrodynamic model yielded an MAE of 34–40 cm and a correlation R of 0.78–0.82. For a 260 km² flood in Romania, the full processing chain completed in ~3 minutes on a standard CPU, demonstrating suitability for rapid, large-scale deployment.

SaferSat is available globally through SaferPlaces, supporting emergency response and insurance applications. Future developments aim to enhance SaferSat globally via the integration of commercial satellite data, to reduce revisit time, and of rapid hydrodynamic modeling, to address radar limitations.

How to cite: DaliriSusefi, S., Mazzoli, P., Luzzi, V., Renzi, F., Redaelli, T., Renzi, M., and Bagli, S.: SaferSat: The Saferplaces’s  Operational Sentinel-1 Toolbox for Multi-Temporal Flood Extent Mapping, Water-Depth Estimation and Impact Assessment , EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-2995, https://doi.org/10.5194/egusphere-egu26-2995, 2026.

EGU26-3018 | Posters on site | HS6.5

Advancing Flood Forecasting in Large River Basins Using Multi-Mission Satellite Data: the EO4FLOOD project 

Angelica Tarpanelli and the EO4FLOOD Team

Floods are among the most destructive natural hazards worldwide, causing severe impacts on human health, ecosystems, cultural heritage and economies. Over the past decades, both developed and developing regions have experienced increasing flood-related losses, a trend that is expected to intensify under climate change due to shifts in precipitation patterns and the frequency of extreme events. In many large river basins, particularly in data-scarce regions, flood forecasting remains highly uncertain because of limited in situ observations and complex hydrological and hydraulic dynamics.

EO4FLOOD is an ESA-funded project aimed at demonstrating the added value of advanced Earth Observation (EO) data for improving flood forecasting at regional to continental scales. The project focuses on the integration of multi-mission satellite observations with hydrological and hydrodynamic modelling frameworks to support flood prediction up to seven days in advance, with an explicit treatment of uncertainty.

A key outcome of EO4FLOOD is the development of a comprehensive and openly available EO-based dataset designed to support flood modelling and forecasting studies. The dataset covers nine large and hydrologically complex river basins worldwide, selected to represent a wide range of climatic, physiographic and anthropogenic conditions, and characterized by limited or heterogeneous availability of ground-based observations. It integrates high-resolution satellite products from ESA and non-ESA missions, including precipitation, soil moisture, snow variables, flood extent, water levels and satellite-derived river discharge.

Within EO4FLOOD, these EO datasets are combined with hydrological and hydraulic models, enhanced by machine learning techniques, to improve flood prediction skill and to better quantify predictive uncertainty in data-scarce environments. The project also investigates the role of human interventions, such as reservoirs and land-use changes, in modulating flood dynamics across the selected basins. By making this multi-variable EO dataset publicly available, EO4FLOOD aims to support the broader hydrological community in testing, benchmarking and developing flood modelling and forecasting approaches in challenging large-basin settings. The project provides a unique opportunity to explore the potential and limitations of EO-driven flood forecasting and contributes to advancing the use of satellite observations for global flood risk assessment and management.

How to cite: Tarpanelli, A. and the EO4FLOOD Team: Advancing Flood Forecasting in Large River Basins Using Multi-Mission Satellite Data: the EO4FLOOD project, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-3018, https://doi.org/10.5194/egusphere-egu26-3018, 2026.

Water security in the Chi River Basin is critical for the agricultural economy and ecosystem stability of Yasothon Province, Thailand. However, effective spatiotemporal monitoring of water surface dynamics is frequently hindered by persistent cloud cover during the monsoon season, limiting the utility of traditional optical remote sensing. This study addresses this challenge by developing a robust Multi-Sensor Deep Learning Fusion system that integrates Synthetic Aperture Radar (SAR) and optical satellite imagery to ensure continuous observation capabilities.

We employ a U-Net convolutional neural network architecture, selected for its high boundary precision and efficiency with limited training datasets. The model is trained on a fused six-channel input configuration, combining Sentinel-1 SAR data (weather-independent) with Sentinel-2 optical bands (RGB), augmented by the Normalized Difference Water Index (NDWI) and Normalized Difference Vegetation Index (NDVI). This multi-modal approach enhances feature extraction, allowing for the accurate differentiation of open water from floating vegetation and flooded agricultural lands in complex transition zones.
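
For illustration, a minimal sketch of how such a six-channel stack could be assembled; the channel ordering, the use of a single VV SAR channel, and all names here are editorial assumptions, not the author's exact configuration:

```python
import numpy as np

def build_six_channel_input(vv, red, green, blue, nir):
    """Assemble a six-channel U-Net input: S1 VV, S2 RGB, NDWI, NDVI.

    All inputs are 2-D float arrays co-registered on a common grid; the
    channel composition is an assumed reading of the abstract.
    """
    eps = 1e-6                                   # guard against division by zero
    ndwi = (green - nir) / (green + nir + eps)   # McFeeters NDWI: water appears bright
    ndvi = (nir - red) / (nir + red + eps)       # vegetation index
    return np.stack([vv, red, green, blue, ndwi, ndvi], axis=0)
```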

The study analyzes the hydrological cycle of 2022, capturing distinct drought, flood, and post-flood conditions. To ensure hydrological validity, the model’s segmentation outputs are not merely visually assessed but are quantitatively validated against ground-truth water level data from the E.20A gauge station in Kham Khuean Kaeo District. By establishing a precise Stage-Area Relationship, this research demonstrates a scalable, cost-effective framework for flood risk assessment and water capital estimation, offering a resilient solution for river basin management in cloud-prone tropical regions.

How to cite: Pruekthikanee, P.: Multi-Sensor Deep Learning Fusion for Spatiotemporal Water Surface Monitoring in the Yasothon Province's Chi River Basin, Thailand, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-4154, https://doi.org/10.5194/egusphere-egu26-4154, 2026.

EGU26-5752 | ECS | Orals | HS6.5

Satellite-Enhanced Flood Modelling for the Niger River Basin using a Synergy of Hydrological Modelling and Earth Observation Data 

Shima Azimi, Alexandra Murray, Connor Chewning, Cecile Kittel, Henrik Madsen, Fan Yang, Maike Schumacher, and Ehsan Forootan

Accurate water cycle representation in data-scarce and flood-prone regions like the Niger River Basin demands stronger integration between remote sensing and hydrological modelling. Spanning ten water-stressed nations, this basin faces critical challenges under climate change, requiring robust water-budget assessments to guide resilience strategies. We employ DHI’s Global Hydrological Model (DHI-GHM) to simulate key hydrological components of the regional water cycle. Model outputs for surface and root-zone soil moisture (SSM and R-ZSM) and terrestrial water storage (TWS) are systematically compared against satellite observations (GRACE/GRACE-FO and multiple soil moisture products) to identify discrepancies and enhance the understanding of regional hydrological behavior. A near real-time SSM data assimilation scheme is implemented to enhance spatiotemporal accuracy of surface and top-soil interactions, particularly beneficial in the flood-sensitive Inner Niger Delta. Post-assimilation hydrological outputs are coupled with the CaMa-Flood surface hydraulic model to simulate inundation dynamics, enabling improved flood prediction and supporting risk management. Finally, we pursue two-way coupling of hydrological and hydrodynamic models by integrating river flow–storage feedbacks to advance flood forecasting and sustainable water-resources planning. 

How to cite: Azimi, S., Murray, A., Chewning, C., Kittel, C., Madsen, H., Yang, F., Schumacher, M., and Forootan, E.: Satellite-Enhanced Flood Modelling for the Niger River Basin using a Synergy of Hydrological Modelling and Earth Observation Data, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-5752, https://doi.org/10.5194/egusphere-egu26-5752, 2026.

EGU26-5862 | ECS | Orals | HS6.5

Refining global wetland characterization using an unsupervised, wetness-based dynamic framework 

Yang Li, Nandin-Erdene Tsendbazar, Kirsten de Beurs, Lassi Päkkilä, and Lammert Kooistra

Existing global wetland datasets and monitoring approaches emphasize persistent inundation, while intermittent inundation and waterlogged states—especially where vegetation is present—are underrepresented or of lower accuracy. This leads to inaccurate estimates of greenhouse gas emissions from carbon-rich systems (e.g., peatlands). Meanwhile, the predominance of annual mapping limits the capture of intra-annual variability, further reinforcing these inaccuracies and obscuring sub-seasonal disturbances from human activities (e.g., shifts in rice-cropping intensity). This study presents an unsupervised, wetness-driven framework for improving global wetland monitoring that leverages Earth observation data streams. For framework development, the OPtical TRApezoid Model is applied to Harmonized Landsat-Sentinel imagery to retrieve surface wetness, followed by wetland delineation using a scene-adaptive grid-based thresholding algorithm. This framework is applied to 824 globally distributed 0.1° grid cells encompassing 9,781 land-cover-labeled sites and 134 sites with daily wet–dry labels across 28 Ramsar wetlands, and validated for spatial delineation, thematic, and temporal accuracy. Comparative analysis employs Dynamic World, the first global 30 m wetland map with a fine classification system (GWL_FCS30), and the modified Dynamic Surface Water Extent algorithm (DSWE). Our framework achieved moderate spatial delineation accuracy with an F1 of 0.64 (recall 0.75, precision 0.56), comparable in F1 to Dynamic World and with higher recall than DSWE and GWL_FCS30. It delivered the highest temporal accuracy (F1 0.72; precision 0.81; recall 0.64) and improved thematic accuracy for vegetated wetlands, reducing omission with modest commission. The proposed wetland monitoring framework enables more accurate, targeted policy interventions.
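
To make the wetness-retrieval step concrete, a minimal sketch of the OPTRAM computation; the dry/wet-edge parameters must be fitted per scene from the NDVI–STR scatter, and the function signature is an editorial assumption, not the authors' implementation:

```python
import numpy as np

def optram_wetness(swir, ndvi, i_d, s_d, i_w, s_w):
    """Normalized surface wetness from the OPtical TRApezoid Model (OPTRAM).

    STR is the shortwave-infrared transformed reflectance; (i_d, s_d) and
    (i_w, s_w) are intercept/slope of the dry and wet trapezoid edges,
    which must be fitted per scene (values are caller-supplied here).
    """
    str_ = (1.0 - swir) ** 2 / (2.0 * swir)       # transformed reflectance
    str_dry = i_d + s_d * ndvi                    # dry edge of the trapezoid
    str_wet = i_w + s_w * ndvi                    # wet edge of the trapezoid
    w = (str_ - str_dry) / (str_wet - str_dry)    # 0 = dry, 1 = saturated
    return np.clip(w, 0.0, 1.0)
```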

How to cite: Li, Y., Tsendbazar, N.-E., de Beurs, K., Päkkilä, L., and Kooistra, L.: Refining global wetland characterization using an unsupervised, wetness-based dynamic framework, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-5862, https://doi.org/10.5194/egusphere-egu26-5862, 2026.

EGU26-6114 | ECS | Orals | HS6.5

Evidential Deep Learning for Uncertainty-Aware Global Flood Extent Segmentation 

Chi-ju Chen and Li-Pen Wang

Flood extent mapping from satellite imagery plays a critical role in disaster response and flood risk management, particularly as flood events become more frequent and severe under a changing climate. At its core, the task involves classifying each pixel in an optical satellite image as flooded or non-flooded. Recent deep learning-based segmentation models have demonstrated strong performance at the global scale. However, despite their accuracy, most existing approaches provide deterministic predictions and offer limited information on the reliability of individual pixel-level outputs. This lack of uncertainty information constrains their operational applicability, especially in high-risk scenarios where models may exhibit overconfident but incorrect predictions.

To address this limitation, we extend a global flood extent segmentation framework by explicitly incorporating uncertainty quantification. Specifically, an Evidential Deep Learning (EDL) approach is integrated into a UNet++ architecture within the ml4floods framework, enabling simultaneous prediction of flood extent and associated pixel-wise uncertainty. Within the EDL formulation, network outputs are interpreted as evidence and parameterised using a Beta distribution, providing a principled estimate of predictive uncertainty. Furthermore, total uncertainty is decomposed into aleatoric and epistemic components, allowing clearer interpretation of whether uncertainty arises from data ambiguity or from limited model knowledge.
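
For concreteness, a minimal sketch of the standard binary EDL quantities implied by this formulation; variable names and the exact uncertainty decomposition are assumptions, not the paper's code:

```python
import torch

def edl_binary_outputs(evidence_logits):
    """Evidential quantities for binary flood segmentation, per pixel.

    evidence_logits: raw network outputs of shape (B, 2, H, W). Two
    non-negative evidence values parameterise a Beta distribution via
    alpha = evidence + 1. A sketch of the usual EDL recipe only.
    """
    evidence = torch.nn.functional.softplus(evidence_logits)   # >= 0
    alpha = evidence + 1.0                       # Beta (Dirichlet, K = 2) parameters
    s = alpha.sum(dim=1, keepdim=True)           # Dirichlet strength
    prob = alpha / s                             # expected class probabilities
    vacuity = 2.0 / s                            # epistemic term: K / S with K = 2
    var = alpha * (s - alpha) / (s**2 * (s + 1.0))  # predictive variance per class
    return prob, vacuity, var
```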

The proposed approach is evaluated using the extended WorldFloods global flood dataset. Preliminary results indicate that the EDL-enhanced model maintains promising segmentation performance while producing informative uncertainty maps. Elevated uncertainty is consistently observed in misclassified regions and along land-water boundaries, where optical signals are inherently ambiguous. These results demonstrate that uncertainty estimates offer valuable insight into model reliability and support operational decision-making by highlighting areas that require closer inspection. In practice, uncertainty-guided triage can help prioritise expert review and resource allocation, focusing attention on regions where decision risk is highest.

How to cite: Chen, C. and Wang, L.-P.: Evidential Deep Learning for Uncertainty-Aware Global Flood Extent Segmentation, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-6114, https://doi.org/10.5194/egusphere-egu26-6114, 2026.

EGU26-6180 | ECS | Orals | HS6.5

The capabilities of virtual gauging stations in satellite monitoring of water bodies

Ildar Mukhamedjanov and Gulomjon Umirzakov

Remote sensing technologies provide effective tools for monitoring and assessing the state of inland water bodies, enabling extraction of various hydrological parameters from satellite observations. Central Asian and some African countries are currently implementing practical programs aimed at mitigating water scarcity and improving the management of transboundary water resources. Rivers and their tributaries flowing across national boundaries require continuous monitoring to support early warning of droughts and floods at the basin scale.

Conventional ground-based hydrological stations are traditionally used to measure water level, estimate daily river discharge, and support hydrological forecasting. However, limitations related to accessibility, data-sharing restrictions, and the high cost of installation and maintenance often constrain their spatial coverage and long-term operation. A virtual gauging station (VGS) represents a complementary remote-sensing approach, providing time series derived from long-term satellite image archives. A VGS is defined as a free-form polygon on the map within whose borders data are analyzed and observations collected as required. Currently, VGS applications primarily rely on optical satellite imagery from the Sentinel-2 and Landsat-4, -5, -7, -8 and -9 missions to estimate water surface area (WSA) using spectral water indices (MNDWI, AWEI or AWEIsh). Variations in WSA serve as a proxy for surface water availability and river dynamics.

In addition, VGS can be used to enrich satellite altimetry-based water level (H) time series. For this purpose, the VGS polygon is calibrated using reference altimetric observations obtained from open-access data sources (e.g., SDSS, DAHITI, Hydroweb). Calibration involves estimating the parameters of a regression model describing the functional relationship between water level and water surface area. The resulting values can finally be integrated into hydrological models to support short-term river discharge forecasting. Thus, VGS provides continuous hydrological information independent of ground-based measurements, while optional validation against in-situ observations allows for the assessment of model uncertainty. Based on the experimental analysis, optimal placement of VGS polygons is recommended in dynamically active river sections, accounting for annual riverbed displacement, as well as in river reaches located near satellite altimeter ground tracks to improve calibration accuracy.
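
A minimal sketch of this calibration step, assuming the simple linear stage-area model the abstract describes; the function and variable names are illustrative:

```python
import numpy as np

def calibrate_vgs(wsa, h_alt):
    """Fit a linear stage-area relation H = a * WSA + b for one VGS.

    wsa: water surface area series from a spectral water index (km^2);
    h_alt: coincident altimetry water levels (m), e.g. from DAHITI or
    Hydroweb. A least-squares sketch of the calibration only.
    """
    wsa = np.asarray(wsa, dtype=float)
    h_alt = np.asarray(h_alt, dtype=float)
    a, b = np.polyfit(wsa, h_alt, deg=1)       # regression parameters

    def predict(wsa_new):
        return a * np.asarray(wsa_new, dtype=float) + b

    rmse = np.sqrt(np.mean((predict(wsa) - h_alt) ** 2))  # calibration error
    return predict, rmse
```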

The experiments demonstrated that the correlation between ground-truth and forecasted water level values is above 0.85 and the mean absolute error is below 0.3 m. These results were obtained using linear regression, suggesting that the application of more complex forecasting models could further improve performance.

How to cite: Mukhamedjanov, I. and Umirzakov, G.:  The capabilities of virtual gauging stations in satellite monitoring of water bodies, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-6180, https://doi.org/10.5194/egusphere-egu26-6180, 2026.

EGU26-6408 | ECS | Posters on site | HS6.5

Multisensor Ensemble Mapping of Sub-hectare Ephemeral Surface Water in Kenyan ASALs 

James Muthoka, Pedram Rowhani, Chloe Hopling, Omid Memarian Sorkhabi, and Martin Todd

Ephemeral pans and seasonal ponds in arid and semi-arid lands supply critical water for pastoral and ecological systems, yet are not routinely monitored due to their small size, highly dynamic nature, and spectral confusion with vegetation and shadows. We present and evaluate a multisensor mapping approach to detect sub-0.5 ha surface water bodies and quantify their linkage to rainfall variability to inform decision making.

Our approach fuses Sentinel-1 SAR, Sentinel-2 optical indices, and DEM-derived covariates within an ensemble classifier (voting of Random Forest, Gradient Boosting, and Decision Tree models). Predictive uncertainty is mapped using ensemble agreement and class probabilities, and we compare SAR-only, optical-only, terrain-only, and fused configurations. Additionally, rainfall and ephemeral surface water dynamics are modelled using generalised additive models with CHIRPS and local rain-gauge observations to test for lagged relationships in monthly water-area anomalies.
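
A minimal sketch of such a soft-voting ensemble; hyperparameters and the toy data are illustrative assumptions, not the study's settings:

```python
import numpy as np
from sklearn.ensemble import (GradientBoostingClassifier, RandomForestClassifier,
                              VotingClassifier)
from sklearn.tree import DecisionTreeClassifier

# Soft-voting ensemble over fused S1 + S2 + terrain features.
ensemble = VotingClassifier(
    estimators=[
        ("rf", RandomForestClassifier(n_estimators=200, random_state=0)),
        ("gb", GradientBoostingClassifier(random_state=0)),
        ("dt", DecisionTreeClassifier(max_depth=10, random_state=0)),
    ],
    voting="soft",  # keep class probabilities for the uncertainty mapping
)

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 8))          # stand-in for per-pixel feature vectors
y = rng.integers(0, 2, size=200)       # stand-in water / non-water labels
ensemble.fit(X, y)
p_water = ensemble.predict_proba(X)[:, 1]  # basis for the per-pixel uncertainty map
```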

Results show the fused model achieves an overall accuracy of 85%, outperforming the Sentinel-1-only and Sentinel-2-only configurations (78% and 72%, respectively). Generalised additive models explain 62% of the variance in monthly water-area anomalies, with a strong response at 1-3 month lags. These results show that multisensor fusion with quantified uncertainty improves detection of ephemeral surface water and enables estimation of rainfall thresholds and lagged dynamics relevant to pastoral water planning and targeted anticipatory action interventions.

How to cite: Muthoka, J., Rowhani, P., Hopling, C., Memarian Sorkhabi, O., and Todd, M.: Multisensor Ensemble Mapping of Sub-hectare Ephemeral Surface Water in Kenyan ASALs, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-6408, https://doi.org/10.5194/egusphere-egu26-6408, 2026.

EGU26-6586 | ECS | Posters on site | HS6.5

Do Geospatial Foundation Models Improve SAR-Based Flood Mapping?  

Antara Dasgupta and Moetez Zouaidi

Accurate and timely flood delineation is a cornerstone of disaster response and hydrological risk management. Synthetic Aperture Radar (SAR) is uniquely suited to this task because it operates independently of cloud cover and illumination, yet its interpretation remains challenging due to speckle, terrain effects, vegetation scattering, and ambiguities between flooded and permanent water as well as shadows and smooth surfaces such as tarmac. While deep learning has substantially advanced SAR-based flood segmentation, most existing models are trained from scratch and often struggle to generalize across regions and flood regimes. Recently, geospatial foundation models (GFMs) pretrained on massive satellite archives have shown promise, but their benefits for SAR-based flood mapping remain insufficiently quantified.

This paper presents a controlled, large-scale global evaluation and benchmarking of a vision-transformer-based GFM (NASA-IBM Prithvi) against two task-specific segmentation architectures, the SegFormer (hierarchical transformer) and the commonly used U-Net (convolutional neural network), including lightweight variants, for post-event SAR-based flood mapping. All models were trained and evaluated under a standardized pipeline that explicitly addresses extreme class imbalance via stratified negative sampling and weighted loss functions. Training and validation used the expert-annotated Kuro Siwo dataset (43 flood events, 67,490 Sentinel-1 VV/VH tiles), while generalization is assessed on both the in-distribution Kuro Siwo test set and the out-of-distribution, hand-labelled Sen1Floods11 benchmark dataset.

Results show that stratified negative sampling (controlling how many background-only tiles are shown to the model in each training epoch) increases precision by approximately 6% and mean Intersection-over-Union (mIoU) by about 7% relative to no sampling, while stabilizing training loss dynamics. On the in-distribution data, all architectures reach similar performance (mIoU ≈ 0.82), indicating that well-designed task-specific models remain competitive with GFMs. However, under out-of-distribution conditions, the foundation model Prithvi (mIoU 0.768) closely matches the performance of the SegFormer (mIoU 0.772) and clearly outperforms the U-Net (mIoU 0.712), highlighting the robustness of transformer-based representations when transferring across datasets. Pretraining on optical imagery yields only modest gains for SAR (+3.4% mIoU), suggesting that architectural inductive biases and data handling matter more than cross-modal pretraining. Notably, lightweight GFM variants achieve comparable accuracy with up to 94% fewer parameters, demonstrating strong potential for operational deployment. Scene-level analysis reveals that CNNs suppress scattered false alarms due to neighborhood contextualization but miss large, continuous floods, while transformers preserve spatial coherence yet overpredict along complex boundaries and scattered surface water ponding, especially near permanent water bodies. These findings demonstrate that SAR-based flood mapping accuracy requires a combination of appropriate model architectures and class-imbalance-aware training rather than foundation-scale pretraining alone; for spatial and statistical transfer to out-of-distribution datasets, however, GFMs offer substantial advantages and provide above-average performance for unseen cases, even without localized fine-tuning.
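
A minimal sketch of the stratified negative sampling idea described above; the ratio knob and all names are assumptions, not the paper's exact scheme:

```python
import random

def stratified_epoch(positive_tiles, negative_tiles, neg_ratio=1.0, seed=0):
    """Cap the number of background-only tiles shown per training epoch.

    positive_tiles: tiles containing flood pixels; negative_tiles:
    background-only tiles. neg_ratio controls how many negatives are
    drawn per positive (an assumed knob, not the study's value).
    """
    rng = random.Random(seed)
    k = min(len(negative_tiles), int(neg_ratio * len(positive_tiles)))
    epoch = positive_tiles + rng.sample(negative_tiles, k)  # resampled each epoch
    rng.shuffle(epoch)
    return epoch
```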

How to cite: Dasgupta, A. and Zouaidi, M.: Do Geospatial Foundation Models Improve SAR-Based Flood Mapping? , EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-6586, https://doi.org/10.5194/egusphere-egu26-6586, 2026.

EGU26-6617 | ECS | Posters on site | HS6.5

SARFlood: A Web-Based, Cloud-Native Platform for Automated and Optimized ML-based SAR Flood Mapping    

Patrick Wilhelm, Paul Hosch, and Antara Dasgupta

Synthetic Aperture Radar (SAR) imagery offers weather-independent observation capabilities critical for monitoring flood events. However, SAR-based flood detection workflows typically require specialized software, local computational resources, and expert knowledge in remote sensing. This work presents SARFlood, a web-accessible application that automates the complete SAR flood detection pipeline using the OpenEO platform. SARFlood is built on a Flask backend architecture designed for accessibility and reproducibility. Users interact with the system through a web interface that guides them through case study creation, including Area of Interest (AOI) definition via shapefile upload, event date specification, and optional ground truth data integration. The application implements OpenEO OAuth 2.0 authentication using the device code flow, enabling secure access to the Copernicus Data Space Ecosystem (CDSE) backend without requiring users to manage API credentials locally. Session-based project management allows users to track processing progress in real time through a status reporting system that monitors each pipeline stage.

Data acquisition is performed server-side via OpenEO, while feature engineering processors execute locally. The data acquisition module fetches multiple data sources through a unified OpenEO interface: pre-event and post-event Sentinel-1 VV and VH imagery, Digital Elevation Models (DEMs) with automatic source fallback (FABDEM, Copernicus 30 m/90 m), and ESA WorldCover land cover classification. The OpenStreetMap water body features and the FathomDEM are acquired via their own APIs/websites. A caching system prevents redundant API calls for previously acquired datasets, significantly reducing processing time for iterative analyses, while respecting licensing: only logged-in users holding the corresponding license can access the cached files.

The processing pipeline computes a comprehensive feature stack for flood detection. SAR derivatives include intensity bands, VV/VH polarization ratios, and change detection metrics computed in decibel space to enhance flood signal discrimination. Topographic features encompass slope and Height Above Nearest Drainage (HAND) derived from the DEM, as key indicators of flood susceptibility. Flow-direction calculations use an expanded bounding box to define an extended HAND computation domain that avoids edge artifacts; the result is cropped back to the original AOI during band compilation, ensuring computationally efficient and accurate flow routing. Additionally, stream burning is implemented to improve drainage network delineation. Further contextual features include Euclidean distance to water and rasterized land cover classification. Users can currently upload ground truth shapefiles (e.g., Copernicus EMS), which are automatically rasterized and compiled into the output stack, enabling supervised classification workflows.

SARFlood includes integrated sampling and training modules. Multiple strategies such as Simple Random, Stratified, Generalized Random Tessellation Stratified, and Systematic Grid sampling are supported. The training module implements Random Forest classification with Leave-One-Group-Out Cross-Validation across multiple case studies, hyperparameter optimization via Bayesian search, and feature importance assessment through Mean Decrease Impurity, permutation importance, and SHAP values. The platform-, data- and model-agnostic design principles used in developing SARFlood support open science and FAIR practices in the geoscience community. By combining web accessibility with robust feature engineering and machine learning integration, SARFlood provides researchers with a reproducible platform for generating uncertainty-aware flood labels, lowering barriers to use.
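
A minimal sketch of the validation scheme named here, with synthetic stand-in data; hyperparameters and feature counts are assumptions:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import LeaveOneGroupOut, cross_val_score

# Random Forest with Leave-One-Group-Out CV, one group per flood case study.
rng = np.random.default_rng(0)
X = rng.normal(size=(600, 12))           # 12 features: SAR, HAND, slope, ...
y = rng.integers(0, 2, size=600)         # flood / non-flood labels
groups = rng.integers(0, 5, size=600)    # case-study id per sample

clf = RandomForestClassifier(n_estimators=300, random_state=0)
scores = cross_val_score(clf, X, y, groups=groups,
                         cv=LeaveOneGroupOut(), scoring="f1")
print(scores.mean())                     # cross-case-study F1
```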

How to cite: Wilhelm, P., Hosch, P., and Dasgupta, A.: SARFlood: A Web-Based, Cloud-Native Platform for Automated and Optimized ML-based SAR Flood Mapping   , EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-6617, https://doi.org/10.5194/egusphere-egu26-6617, 2026.

EGU26-7132 | ECS | Orals | HS6.5

Monitoring Freshwater Bodies over the Past 40 Years Using Synthetic Monthly Sentinel-2 MSI Imagery  

Federica Vanzani, Patrice Carbonneau, Simone Bizzi, Martina Cecchetto, and Elisa Bozzolan

In the last decade, rapid advancements in remote sensing have opened new frontiers in our ability to monitor freshwater body dynamics at the global scale. Most works have taken advantage of the long time series of the Landsat constellations (30 m resolution), relying on spectral indices to identify water. Recently, much progress has also been made in the development and use of deep learning models capable of explicit semantic classification of river water, lake water and sediment bars, based on Sentinel-2 (S2) MSI imagery (10 m resolution). In this work, we present an approach that seeks to extend these existing, trained, fluvial landscape classification models to Landsat data in order to observe long-term water and morphological shifts in rivers and lakes. Rather than explicitly re-training the models with Landsat data and labour-intensive manual label data, we apply a domain transfer approach to generate synthetic S2 MSI imagery from Landsat inputs. This approach has the advantage that the training of deep learning domain transfer models only requires synchronous Landsat and Sentinel data and thus obviates the need for manual labels.

The results show that, when using these synthetic images, river water, lake water and sediment bars are classified with F1 scores of 0.80, 0.94 and 0.65, respectively, which represents a decrease of ca. 10% for river water and 20% for sediment with respect to real S2 imagery. By adopting this integrated approach, we are therefore able to monitor, for the first time, lake water, river water and sediment bars at 10 m resolution, over a 40-year period, integrating both synthetic S2 and real S2 acquisitions through a single fluvial landscape segmentation model. Classifications obtained from median monthly images can then be aggregated at the yearly or multi-yearly scale to delineate river or lake water fluctuations, and active channel (river water plus sediment bars) trajectories, from specific freshwater bodies to the global scale.

How to cite: Vanzani, F., Carbonneau, P., Bizzi, S., Cecchetto, M., and Bozzolan, E.: Monitoring Freshwater Bodies over the Past 40 Years Using Synthetic Monthly Sentinel-2 MSI Imagery , EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-7132, https://doi.org/10.5194/egusphere-egu26-7132, 2026.

EGU26-7320 | ECS | Posters on site | HS6.5

Evaluating multimodal optical and SAR learning strategies for flood and surface water delineation 

Jiayin Xiao, Zixi Li, and Fuqiang Tian

Flood and surface water mapping from satellite observations remains challenging due to the complementary yet heterogeneous characteristics of optical and synthetic aperture radar (SAR) data. While deep learning has achieved promising results, existing studies are often evaluated on isolated datasets or focus on a single modality, limiting their comparability and operational relevance. In this study, we conduct a large-scale and systematic evaluation of optical, SAR, and combined optical–SAR learning strategies for flood and surface water mapping across multiple public satellite benchmarks. Using a common training and evaluation protocol, we compare lightweight convolutional networks and large pretrained vision models under single-modality and multimodal settings. The analysis reveals that attention-based multimodal fusion consistently improves water delineation accuracy on most datasets, while model capacity and preprocessing choices play a critical role in balancing missed detections and false alarms. On global-scale benchmarks, moderately sized backbones coupled with dedicated fusion mechanisms achieve robust performance without relying on extremely large models. These findings provide practical guidance for selecting architectures and fusion strategies in operational flood mapping and establish a reproducible benchmark for future optical and SAR studies.

How to cite: Xiao, J., Li, Z., and Tian, F.: Evaluating multimodal optical and SAR learning strategies for flood and surface water delineation, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-7320, https://doi.org/10.5194/egusphere-egu26-7320, 2026.

EGU26-7998 | Orals | HS6.5

Ten years of floods across Europe mapped from space with reconstructed water depths  

Andrea Betterle and Peter Salamon

Floods are among the most deadly and destructive natural disasters. Improving our understanding of large-scale flood dynamics is crucial to mitigating their dramatic consequences. Unfortunately, systematic observation-based datasets—especially featuring flood depths—have been lacking.

This contribution presents advancements in developing an unprecedented catalogue of satellite-derived flood maps across Europe from 2015 onwards. Results are based on the systematic identification of floods in the entire Sentinel-1 archive at 20 m spatial resolution as provided by the Global Flood Monitoring component of the Copernicus Emergency Management Service. Using a novel algorithm that accounts for terrain topography, flood maps are enhanced and provided with water depth estimates—critically important information for flood impact assessments.
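
As a rough illustration of the underlying idea, a deliberately simple sketch that assigns each flooded region a flat water surface from its boundary elevations; the authors' topography-aware algorithm is more elaborate than this:

```python
import numpy as np
from scipy import ndimage

def depth_from_extent(dem, flood_mask):
    """First-order water depths from a flood mask and a terrain model.

    Each connected flooded region gets a flat water surface equal to the
    median DEM elevation along its boundary; depth = surface - DEM.
    dem: 2-D float array; flood_mask: 2-D boolean array on the same grid.
    """
    labels, n = ndimage.label(flood_mask)                     # connected regions
    boundary = flood_mask & ~ndimage.binary_erosion(flood_mask)
    depth = np.zeros_like(dem, dtype=float)
    for i in range(1, n + 1):
        region = labels == i
        edge = boundary & region
        if edge.any():
            wse = np.median(dem[edge])                        # water-surface elevation
            depth[region] = np.maximum(wse - dem[region], 0.0)
    return depth
```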

The resulting dataset represents a significant step towards the creation of a global flood archive. It provides new tools for interpreting flood hazards on large scales, with substantial implications for flood risk reduction, urban development planning, and emergency response.

How to cite: Betterle, A. and Salamon, P.: Ten years of floods across Europe mapped from space with reconstructed water depths , EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-7998, https://doi.org/10.5194/egusphere-egu26-7998, 2026.

EGU26-8292 | Posters on site | HS6.5

Modelling wetland resilience to climate change and anthropogenic impacts. 

Patricia Saco, Rodriguez Jose, Breda Angelo, Eric Sandi, and Steven Sandi

Coastal wetlands provide a wide range of ecosystem services, including shoreline protection, attenuation of storm surges and floods, water quality improvement, wildlife habitat and biodiversity conservation. These ecosystems have been observed to sequester atmospheric carbon dioxide at rates significantly higher than many other ecosystems, positioning them as promising nature-based solutions for climate change mitigation.  However, projections of coastal wetland conditions under sea-level rise (SLR) remain highly variable, owing to uncertainties in environmental factors as well as the necessary simplifications embedded within the wetland evolution modelling frameworks. Assessing wetland resilience to rising sea levels and the effect of anthropogenic activities is inherently complex, given the uncertain nature of key processes and external influences. To enable long-term simulations that span extensive temporal and spatial scales, models must rely on a range of assumptions and simplifications—some of which may significantly affect the interpretation of wetland resilience.

Here we present a novel eco-hydro-geomorphological modelling framework to predict wetland evolution under SLR. We explore how accretion and lateral migration processes influence the response of coastal wetlands to SLR, using a computational framework that integrates detailed hydrodynamic and sediment transport processes. This framework captures the interactions between physical processes, vegetation, and landscape dynamics, while remaining computationally efficient enough to support simulations over extended timeframes. We examine several common simplifications employed in models of coastal wetland evolution and attempt to quantify their influence on model outputs. We focus on simplifications related to hydrodynamics, sediment transport, and vegetation dynamics, particularly in terms of process representation, interactions between processes, and spatial and temporal discretisation. Special attention is given to identifying modelling approaches that strike a balance between computational efficiency and acceptable levels of accuracy. We will present recent model results to assess the resilience of coastal wetlands to SLR at several sites around the world and will discuss new results to assess the effect of human interventions and infrastructure on wetland resilience.

How to cite: Saco, P., Jose, R., Angelo, B., Sandi, E., and Sandi, S.: Modelling wetland resilience to climate change and anthropogenic impacts., EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-8292, https://doi.org/10.5194/egusphere-egu26-8292, 2026.

EGU26-9354 | ECS | Orals | HS6.5

L-band InSAR to complement SAR inundation mapping under vegetation 

Clara Hübinger, Etienne Fluet-Chouinard, Daniel Escobar, and Fernando Jaramillo

Wetland inundation dynamics are key for understanding flood regulation, ecosystem functioning and greenhouse gas emissions. Synthetic Aperture Radar (SAR) can map water extent independent of cloud cover and can partly penetrate vegetation, particularly at L-band. Many SAR inundation products rely primarily on intensity thresholding and indicators such as specular reflection and double-bounce scattering. However, these approaches can underestimate inundation extent in densely vegetated wetlands where volume scattering can obscure the water signal. Here we demonstrate how L-band interferometric SAR (InSAR) can complement intensity-based inundation mapping under vegetation by exploiting phase differences between repeat SAR acquisitions. Using ALOS PALSAR-1 and PALSAR-2, together providing a nearly two-decade observational archive, we show that L-band InSAR can capture inundation dynamics in tropical floodplain wetlands, such as the Atrato floodplain (Colombia) and Amazon várzea floodplains (e.g., along the Río Pastaza). In the Atrato floodplain, the InSAR-derived flooded vegetation extent shows pronounced seasonal variability, ranging from ~500 to >1500 km² during 2007–2011. Comparison with existing L-band SAR inundation products yields ~70% overall agreement, while InSAR consistently detects broader inundated extents in densely vegetated floodplain areas where intensity-based thresholding underestimates inundation. This complementarity among methodologies is particularly relevant for inundation extent data products from the NASA–ISRO NISAR mission, which are expected to rely largely on SAR backscatter thresholding. Our results highlight the value of integrating InSAR-derived information to strengthen wetland inundation monitoring under vegetated canopies.

How to cite: Hübinger, C., Fluet-Chouinard, E., Escobar, D., and Jaramillo, F.: L-band InSAR to complement SAR inundation mapping under vegetation, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-9354, https://doi.org/10.5194/egusphere-egu26-9354, 2026.

EGU26-9758 | ECS | Orals | HS6.5

Hydrologically-Informed DTM Super-Resolution for Rapid Flood Depth Estimation 

Sandro Groth, Marc Wieland, Christian Geiß, and Sandro Martinis

Reliable estimation of flood depths from satellite-derived inundation extent information critically depends on the spatial resolution and hydrological consistency of the underlying digital terrain model (DTM). Accurate, very high-resolution DTMs are typically not publicly available, difficult to access within the time constraints of rapid mapping, and lack consistent coverage. Although open-access DTMs such as the Forest and Buildings removed Copernicus DEM (FABDEM) provide global coverage, their coarse spatial resolution often fails to represent important small-scale terrain features that control flow paths, slopes, and local water accumulation. To address these limitations, this study proposes a deep learning framework for DTM super-resolution that combines low-resolution DTMs with optical satellite imagery, integrating hydrological knowledge into the training process to enforce the reconstruction of terrain features relevant for improved flood inundation depth estimation.

The proposed approach employs a residual channel attention network (RCAN) enhanced with optical satellite imagery as auxiliary input to upscale low-resolution terrain data. Central to the methodology is a collaborative hydrologic loss function that guides network optimization beyond elevation-based accuracy. In addition to the mean absolute elevation error (MAE), the loss integrates slope deviation and flow direction disagreement to focus the learning on the reconstruction of terrain features that are directly relevant for hydrologic applications.
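
A minimal sketch of what such a combined loss could look like; the finite-difference slope and flow-direction proxies, the smooth surrogate, and the weights are assumptions, not the paper's exact terms:

```python
import torch
import torch.nn.functional as F

def hydrologic_loss(pred, ref, w_slope=1.0, w_flow=1.0, tau=0.1):
    """Elevation MAE + slope deviation + smooth flow-direction disagreement.

    pred, ref: DTM tensors of shape (B, 1, H, W). A sketch only; weights
    and the surrogate scale tau are illustrative.
    """
    mae = F.l1_loss(pred, ref)                         # elevation accuracy
    pgx = pred[..., :, 1:] - pred[..., :, :-1]         # finite differences, x
    pgy = pred[..., 1:, :] - pred[..., :-1, :]         # finite differences, y
    rgx = ref[..., :, 1:] - ref[..., :, :-1]
    rgy = ref[..., 1:, :] - ref[..., :-1, :]
    slope = F.l1_loss(pgx, rgx) + F.l1_loss(pgy, rgy)  # slope deviation
    # tanh squashes gradients toward their sign: a differentiable stand-in
    # for flow-direction disagreement
    flow = F.l1_loss(torch.tanh(pgx / tau), torch.tanh(rgx / tau)) \
         + F.l1_loss(torch.tanh(pgy / tau), torch.tanh(rgy / tau))
    return mae + w_slope * slope + w_flow * flow
```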

Unlike other super-resolution approaches, which often use downscaled versions of the low-resolution inputs to learn super-resolved DTMs, the proposed framework was trained on a growing set of aligned patches of real-world, globally available low-resolution elevation data, optical satellite imagery, and high-resolution reference DTMs derived from airborne LiDAR. Model performance is evaluated against conventional interpolation and standard super-resolution baseline architectures, including convolutional neural networks (CNNs) as well as geospatial foundation models (GFMs). To assess the practical impact on flood mapping, the super-resolved DTMs are tested on a set of real-world flood events in Germany by using the well-known Flood Extent Enhancement and Water Depth Estimation Tool (FLEXTH) to derive inundation depth metrics.

Results show that integrating DTMs derived using hydrologically guided super-resolution into flood depth tools can lead to more accurate flood depth estimates compared to low-resolution or other super-resolved inputs. The added hydrologic loss significantly improves the preservation of slopes and flow directions while maintaining elevation accuracy.

Overall, the presented framework offers a method to generate hydrologically meaningful high-resolution DTMs from globally available low-resolution inputs to benefit flood depth estimation in areas where no high-resolution terrain information is available.

How to cite: Groth, S., Wieland, M., Geiß, C., and Martinis, S.: Hydrologically-Informed DTM Super-Resolution for Rapid Flood Depth Estimation, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-9758, https://doi.org/10.5194/egusphere-egu26-9758, 2026.

EGU26-11948 | HS6.5

Estimating Flash Flood Discharge in Arid Environments Using InSAR Coherence: A Case Study of the Ze’elim Fan, Dead Sea

R. Nof

Flash flood disasters have increased by more than 50% in the first 20 years of the 21st century compared to the last 20 years of the 20th century. Monitoring and understanding flood events might lead to better mitigation of this natural hazard. SAR and SAR interferometry (InSAR) have proved to be useful tools for mapping flooded areas due to the lower backscatter or decorrelation of the SAR signal in an open-water environment. In arid regimes, flash-flood water is rapidly drained by evaporation or percolation, often before the satellite image is acquired. To overcome this challenge, we propose in this study to use the InSAR coherence loss, created by surface changes during a flash flood, to map the runoff path and utilize it to quantify peak discharge (Qmax).

We focus on the Ze’elim alluvial fan along the western shore of the Dead Sea, Israel, an arid area affected by seasonal flash floods a few days a year. We use 34 interferograms of X-band (COSMO-SkyMed/TerraSAR-X) SAR data, covering 25 runoff events between 2017 and 2021, and upstream hydrological gauge data. To account for natural decorrelation processes, we calculate a normalized coherence (γn) term, using the average coherence of the study area and the average coherence of a stable reference area, identified by differential LiDAR measurements.

We find a strong correlation between γn and the logarithm of the peak discharge (Qmax). However, the method is limited by a minimal peak discharge—where energy is too low to change the surface—and a maximal total water volume—where decorrelation is saturated. The method may provide tools for reconstructing runoff data in arid areas where historical SAR data are available, and for monitoring in difficult-to-access areas or where hydrological stations are sparse or damaged.
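
A minimal sketch of the normalization and the log-linear relation, with hypothetical numbers in place of the study's gauge data; the exact normalization formula is an assumption:

```python
import numpy as np

def normalized_coherence(coh_study, coh_ref):
    """Normalized coherence for one interferogram: mean coherence of the
    alluvial-fan study area relative to a stable reference area, to
    discount natural decorrelation. A ratio is assumed here."""
    return np.mean(coh_study) / np.mean(coh_ref)

# Log-linear fit gamma_n ~ a * log10(Qmax) + b over the runoff events
# (placeholder values; the study's gauge data are not reproduced).
qmax = np.array([2.0, 5.0, 12.0, 30.0, 80.0])        # m^3/s, hypothetical
gamma_n = np.array([0.95, 0.90, 0.82, 0.74, 0.65])   # hypothetical
a, b = np.polyfit(np.log10(qmax), gamma_n, 1)        # regression coefficients
```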

How to cite: Nof, R.: Estimating Flash Flood Discharge in Arid Environments Using InSAR Coherence: A Case Study of the Ze’elim Fan, Dead Sea, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-11948, https://doi.org/10.5194/egusphere-egu26-11948, 2026.

EGU26-12249 | Orals | HS6.5 | Highlight

Lessons Learned from Remote Sensing of River Ice for Flood Early Warning 

Arjen Haag, Tycho Bovenschen, Elena Vandebroek, Athanasios Tsiokanos, Ben Balk, and Joost van der Sanden

Rivers in regions with cold winters can seasonally freeze up. River ice breakup and freeze-up processes can lead to river ice jams, which are a major contributor to flood risk in cold regions (across most of the high latitudes of the northern hemisphere). In Canada, satellite remote sensing is used across the country to provide timely information on the status of river ice. Methods and algorithms to classify various stages of river ice from the Radarsat Constellation Mission (RCM) are available, but the operational implementation of these, especially the integration into larger flood forecasting and early warning systems, requires specific expertise, software and computational resources, and comes with its own set of challenges. In collaboration with various agencies across Canada we have set up operational monitoring systems with the purpose of assisting the daily tasks of forecasters on duty. These have been used in practice over multiple ice breakup and freeze-up seasons, which has highlighted both their usefulness and shortcomings. We will focus on various aspects of such a system and share lessons learned on its design, setup and operational use, as well as a framework to analyse various factors relevant for operational monitoring purposes (e.g. spatiotemporal coverage and latency of the data, critical elements in the support of decision-making relating to floods). In this, we do not shy away from problems and pitfalls, so that others can learn from these. While various challenges remain, this work is a good example of the value in the joint engagement of applied science and end users.

How to cite: Haag, A., Bovenschen, T., Vandebroek, E., Tsiokanos, A., Balk, B., and van der Sanden, J.: Lessons Learned from Remote Sensing of River Ice for Flood Early Warning, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-12249, https://doi.org/10.5194/egusphere-egu26-12249, 2026.

EGU26-13343 | Posters on site | HS6.5

Operational, national-scale monitoring of river trajectories using satellite imagery  

Elisa Bozzolan, Marco Micotti, Elisa Matteligh, Alessandro Piovesan, Federica Vanzani, Patrice Carbonneau, and Simone Bizzi

The global degradation of river ecosystems and the growing impacts of flood hazards have highlighted limitations in current river management approaches. In Europe, the Water Framework and Flood Directives promote integrated, catchment-scale assessments of hydromorphological conditions and flood risk. Such integration is essential for sustainable management. Planform dynamics and river bed aggradation/incision, for example, can modify channel conveyance and compromise flood mitigation measures, whereas granting more space to rivers can both enhance ecological quality and reduce flood peaks.

In this context, the availability of long-term satellite archives and advances in computational and machine-learning methods enable large-scale, high spatiotemporal resolution monitoring of large and medium river systems. However, despite this potential, the operational adoption of satellite-based river monitoring remains limited due to data complexity, interdisciplinary requirements, and the lack of harmonised computational infrastructures.

Thanks to a collaboration between industry, public institutions and academia, we developed a methodology to systematically map monthly water channel, channel width, sediment bar and vegetation dynamics, testing the results on the full archive of Sentinel-2 (10 m resolution) for medium-large Italian rivers (active channel > 30 m, i.e. 3 Sentinel-2 pixels). In this talk, I will outline the applied methodology, discuss its applicability at the national scale with Sentinel-2 data, and show how the generated products can better inform river habitat mapping, river conservation practices, and flood risk assessments by supporting consistent identification of national-scale geomorphic trajectories.

How to cite: Bozzolan, E., Micotti, M., Matteligh, E., Piovesan, A., Vanzani, F., Carbonneau, P., and Bizzi, S.: Operational, national-scale monitoring of river trajectories using satellite imagery , EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-13343, https://doi.org/10.5194/egusphere-egu26-13343, 2026.

EGU26-13502 | HS6.5

Integrating SAR and Multispectral Satellite Observations for Flood Inundation Mapping: A Cross-Modal Fusion Framework Leveraging Foundation Models and Gated Attention Mechanism

Y. C. Chen and L. P. Wang

Flood inundation mapping has become increasingly critical as climate change intensifies the frequency and severity of flooding worldwide, amplifying risks to populations, infrastructure, and ecosystems. Recent advances in Earth Observation (EO) have shown unprecedented opportunities to monitor flood dynamics across large spatial scales. However, significant challenges remain due to the limitations of single-sensor approaches. While multispectral imagery provides rich semantic information, it is frequently constrained by cloud cover during flood events. Conversely, Synthetic Aperture Radar (SAR) offers all-weather capability but suffers from signal ambiguity in complex terrains and urban environments. Effectively integrating these heterogeneous modalities therefore remains a challenge, particularly with limited labelled flood event data.

In this study, we propose a deep learning-based cross-modal fusion framework that leverages the representational capacity of Remote Sensing Foundation Models (RSFMs). High-level feature embeddings are extracted from Sentinel-1 and Sentinel-2 multispectral imagery by initializing modality-specific encoders with pretrained weights from state-of-the-art multi-modal foundation models, providing a robust and semantically aligned feature space despite limited task-specific training data.

To integrate the multi-modal representations, we adopt a Gated Cross-Modal Attention mechanism, which adaptively modulates the information flow from each modality based on its observation reliability. Specifically, the model is trained to prioritise SAR features to ensure spatial continuity under cloud-obscured conditions, while simultaneously leveraging richer optical semantics to disambiguate SAR signals, correcting, for example, false detections caused by radar shadowing or smooth impervious surfaces.
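
A minimal sketch of a gated cross-modal attention block of this kind; dimensions, the gating form, and names are assumptions, not the authors' architecture:

```python
import torch
import torch.nn as nn

class GatedCrossModalFusion(nn.Module):
    """SAR tokens query optical tokens via cross-attention; a learned gate
    then modulates how much optical context is injected (e.g. down-weighted
    where the optical observation is unreliable)."""

    def __init__(self, dim=256, heads=8):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.gate = nn.Sequential(nn.Linear(2 * dim, dim), nn.Sigmoid())

    def forward(self, sar_tokens, opt_tokens):
        # sar_tokens, opt_tokens: (batch, tokens, dim)
        ctx, _ = self.attn(sar_tokens, opt_tokens, opt_tokens)  # cross-attention
        g = self.gate(torch.cat([sar_tokens, ctx], dim=-1))     # reliability gate
        return sar_tokens + g * ctx                             # gated residual fusion
```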

To assess the generalisation of the proposed framework across diverse regions and sensor conditions, we trained and evaluated our model using a comprehensive dataset compiled from publicly available benchmarks, including Kuro Siwo and WorldFloods. Our framework not only establishes a new benchmark for all-weather flood monitoring but also demonstrates the critical role of remote sensing foundation models in overcoming the limitations of traditional, data-hungry fusion approaches.

How to cite: Chen, Y. C. and Wang, L. P.: Integrating SAR and Multispectral Satellite Observations for Flood Inundation Mapping: A Cross-Modal Fusion Framework Leveraging Foundation Models and Gated Attention Mechanism, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-13502, https://doi.org/10.5194/egusphere-egu26-13502, 2026.

EGU26-13888 | ECS | Posters on site | HS6.5

A Comparative Assessment of Threshold-Based and Machine Learning Methods for Flood Detection 

Jawad Mones, Saeed Mhanna, Landon Halloran, and Philip Brunner

Flood mapping plays a key role in understanding hazard impacts, supporting emergency response, and guiding long-term risk planning. Remote sensing is now widely used in flood studies because it offers low-cost data, avoids the need for dangerous field surveys, and provides rapid observations over large areas. Despite these advantages, comparative research remains limited, particularly with respect to differences among flood-mapping algorithms, such as machine-learning versus threshold-based approaches, and the performance of optical versus radar sensors. This research addresses these gaps by applying multiple flood-mapping methods to the same flood event in Pakistan and then comparing their performance against a validation benchmark, providing clearer insight into how data selection and methodological design influence flood detection outcomes.

This study evaluates four distinct methods for mapping floods using multi-sensor satellite data. To ensure a fair comparison, three unsupervised machine-learning approaches (a synergetic Sentinel-1 and Sentinel-2 workflow, a method integrating harmonized Landsat–Sentinel data with radar, and a daily MODIS imagery technique) were tested alongside a traditional Otsu thresholding baseline. All four were tested on the same 2025 Pakistan flood event, characterized by intense monsoon rains and flash flooding across regions such as Sindh and Punjab in mid- to late 2025. The flood maps were then validated against UNOSAT flood reports for this event; UNOSAT’s flood extent most closely matches the results of the Sentinel-1/Sentinel-2 workflow, which yields the most conservative flood extent among the tested methods.
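
For reference, a minimal sketch of what such an Otsu baseline on SAR backscatter could look like; as discussed below, wet soil can fall under the threshold too, inflating the flood extent:

```python
import numpy as np
from skimage.filters import threshold_otsu

def otsu_flood_mask(vv_db):
    """Otsu thresholding baseline on Sentinel-1 VV backscatter (dB).

    Pixels darker than the global Otsu threshold are labelled water;
    a deliberately simple sketch of the baseline, not the study's code.
    """
    t = threshold_otsu(vv_db[np.isfinite(vv_db)])  # histogram-based split
    return vv_db < t                               # low backscatter -> water
```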

Larger flood extents from some methods, especially the Sentinel-1 Otsu thresholding approach, include areas not clearly flooded in optical images. This happens because SAR backscatter also responds to wet soil and saturated vegetation, which a simple threshold can misclassify as water, leading to flood overestimation.

Overall, the results show that flood maps are not just different versions of the same answer; they reflect how the underlying satellite data and algorithms detect flooding. Approaches that combine multiple data sources with machine learning strike a better balance, producing flood extents that are both spatially consistent and physically realistic. This indicates that multi-sensor, machine-learning-based methods are better suited for operational flood monitoring than simple thresholding, which is too sensitive to surface noise and often overestimates flooding.

How to cite: Mones, J., Mhanna, S., Halloran, L., and Brunner, P.: A Comparative Assessment of Threshold-Based and Machine Learning Methods for Flood Detection, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-13888, https://doi.org/10.5194/egusphere-egu26-13888, 2026.

EGU26-16468 | ECS | Orals | HS6.5

Multidecadal Changes and Trends in Global River Positions 

Elad Dente, John Gardner, Theodore Langhorst, and Xiao Yang

Rivers play a central role in shaping the Earth's surface and ecosystems through physical, chemical, and biological interactions. The intensity and locations of these interactions change as rivers continuously migrate across the landscape. In recent decades, human activity and climate change have altered river hydrology and sediment fluxes, leading to changes in river position, or migration. However, a comprehensive perspective on and understanding of these recent changes in the rate of river position shifts is lacking. To address this knowledge gap, we created a continuous global dataset of yearly river positions and migration rates over the past four decades and analyzed trends. The global annual river positions were detected using Landsat-derived surface water datasets and processed in Google Earth Engine, a cloud-based parallel computation platform. The resulting river extents and centerlines reflect the yearly permanent position, corresponding to the rivers’ location during base flow. This approach improves the representation of position changes derived from geomorphological rather than hydrological processes. To robustly analyze river position changes across different patterns and complexities and at large scales, we developed and applied a global reach-based quantification method.

Results show that while alluvial rivers maintain stable positions in certain regions, others exhibit trends in the rates of position change. For instance, the Amazon Basin, which has experienced significant deforestation and hydrological modifications, has shown increased rates of river position change in recent decades, directly modifying active floodplains. In this presentation, we will discuss the advantages, limitations, and applications of the global yearly river position dataset, offer insights into the changing rates of river position, and highlight current and future impacts on one of Earth’s most vulnerable hydrologic systems.

How to cite: Dente, E., Gardner, J., Langhorst, T., and Yang, X.: Multidecadal Changes and Trends in Global River Positions, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-16468, https://doi.org/10.5194/egusphere-egu26-16468, 2026.

EGU26-17524 | HS6.5

SAR and optical imagery for dynamic global surface water monitoring: addressing sensor-specific uncertainty for data fusion

M. Hassaan, D. Festa, and W. Wagner

Satellite-based surface water monitoring is essential for tracking the spatiotemporal dynamics of global water bodies. However, most existing systems rely on a single mission or sensor modality, constraining both accuracy and temporal coverage. To overcome these limitations, we propose a multi-mission data fusion framework that integrates Sentinel-1 SAR and Sentinel-2 optical observations. Two U-Net convolutional neural networks were trained independently on the S1S2-Water dataset: one using Sentinel-1 sigma-nought backscatter (VV/VH) and the other using Sentinel-2 RGB and NIR bands, with terrain slope incorporated as ancillary input in both models. Predictive uncertainty is quantified via Monte Carlo dropout embedded within the networks, modeling pixel-wise predictions as Gaussian distributions. These probabilistic outputs are subsequently fused using a Bayesian framework and refined through sensor-specific exclusion masks. Evaluation across 16 geographically diverse test sites demonstrates that the fused probabilistic predictions achieve an overall IoU of 89%, highlighting the synergistic benefits of uncertainty-aware, multi-sensor integration. Furthermore, we show that model evaluation restricted to cloud-free optical imagery introduces substantial bias, limiting applicability for near-real-time monitoring. The proposed framework improves temporal availability, robustness, and reliability, advancing multi-satellite approaches for global surface water monitoring.
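
A minimal sketch of one standard way to fuse two per-pixel Gaussian predictions (precision weighting); the paper's Bayesian scheme and exclusion masks may differ from this:

```python
import numpy as np

def fuse_gaussian_predictions(mu_sar, var_sar, mu_opt, var_opt):
    """Inverse-variance fusion of two per-pixel Gaussian water estimates,
    e.g. the mean/variance of MC-dropout U-Net outputs per sensor.

    All inputs are arrays on a common grid; variances must be positive.
    """
    w_sar = 1.0 / var_sar                     # precision of the SAR model
    w_opt = 1.0 / var_opt                     # precision of the optical model
    var_fused = 1.0 / (w_sar + w_opt)         # fused (smaller) variance
    mu_fused = var_fused * (w_sar * mu_sar + w_opt * mu_opt)
    return mu_fused, var_fused
```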

How to cite: Hassaan, M., Festa, D., and Wagner, W.: SAR and optical imagery for dynamic global surface water monitoring: addressing sensor-specific uncertainty for data fusion, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-17524, https://doi.org/10.5194/egusphere-egu26-17524, 2026.

EGU26-18308 | Orals | HS6.5

RESCUE_SAT project: Leveraging Satellite Data to Improve Large‑Scale Flood Modeling 

Elena Volpi, Stefano Cipollini, Luciano Pavesi, Valerio Gagliardi, Richard Mwangi, Giorgia Sanvitale, Irene Pomarico, Aldo Fiori, Deodato Tapete, Maria Virelli, Alessandro Ursi, and Andrea Benedetto

The RESCUE_SAT project was launched as part of the “Innovation for Downstream Preparation for Science” (I4DP_SCIENCE) programme (Agreement no. 2025‑2‑HB.0), funded by the Italian Space Agency (ASI), with the goal of enhancing the performance of the RESCUE model through the integration of satellite data. RESCUE is a large‑scale inundation model that enables probabilistic flood‑hazard assessment over large areas by preserving computational efficiency while explicitly representing hydrologic-hydraulic processes along the full drainage network. Primarily based on digital terrain models (DTMs), RESCUE is a hybrid framework that combines a geomorphology-based representation of the river network with simplified hydrological and hydraulic formulations to estimate water levels and inundation extents. The central challenge of the RESCUE_SAT project is to deliver a flood‑modelling tool capable of providing a more reliable and detailed representation of both large‑scale hydrological behavior and local hydraulic processes, including flow interactions with structures such as levees, bridges and dams, which are currently not explicitly represented in RESCUE. To this purpose, the Synthetic Aperture Radar (SAR) imagery acquired by ASI’s COSMO-SkyMed constellation is processed using interferometric techniques to derive high-resolution digital elevation models (DEMs), reaching meter-scale resolution. From these high-resolution DEMs, RESCUE_SAT identifies the locations of structures that interact with flow propagation, supporting their systematic mapping. Once the infrastructures have been identified and parameterized from the high-resolution DEM, the DEM is resampled to a computationally advantageous coarser resolution, while the detected infrastructure elements are directly integrated into the hydrological–hydraulic model.

How to cite: Volpi, E., Cipollini, S., Pavesi, L., Gagliardi, V., Mwangi, R., Sanvitale, G., Pomarico, I., Fiori, A., Tapete, D., Virelli, M., Ursi, A., and Benedetto, A.: RESCUE_SAT project: Leveraging Satellite Data to Improve Large‑Scale Flood Modeling, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-18308, https://doi.org/10.5194/egusphere-egu26-18308, 2026.

EGU26-18518 | Orals | HS6.5

Automated Detection of Flood Events from CYGNSS: Observing Flood Evolution Along Propagating Tropical Waves  

Zofia Bałdysz, Dariusz B. Baranowski, Piotr J. Flatau, Maria K. Flatau, and Clara Chew

Flooding is a major natural hazard across the global tropics. Although flood occurrence is shaped by rainfall characteristics—including duration, frequency, and intensity—accurate prediction remains challenging. A key limitation is the lack of reliable, long-term flood databases that capture events across all spatial scales and durations, hindering a clear understanding of how rainfall variability translates into flood onset. This limitation is particularly critical in the Maritime Continent, where extreme rainfall is common and many small, short-lived, yet severe, floods remain undocumented. To address this limitation, we investigate whether a relatively new approach, global navigation satellite system reflectometry (GNSS-R), can help close this observational gap.

In this work, we assess whether data from the CYGNSS small-satellite constellation can be used to identify small- to regional-scale floods, including short-lived events. Our study focuses on Sumatra, an island within the Maritime Continent that is frequently affected by such hazards. A joint analysis of CYGNSS inundation estimates and two independent flood databases allowed us to evaluate how CYGNSS measurements can be used for flood detection. Three detailed case studies demonstrate that CYGNSS provides an unprecedented ability to monitor day-to-day changes in surface water extent, including floods at the urban scale. Specifically, we show that CYGNSS-derived inundation anomalies can clearly capture the evolution of a flooding event, with the largest signature one day after known flood initiation. A systematic analysis of 555 flood events over a 21-month period enabled us to identify characteristic patterns in inundation anomalies that reliably distinguish flood events from non-flooding conditions, through the definition of an inundation-anomaly threshold and a maximum distance between CYGNSS detections and reported flood locations. We established that CYGNSS observations within 15 km not only differ significantly from baseline conditions but also allow tracking of day-to-day flood dynamics.
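
A minimal sketch of the anomaly-threshold detection idea; the threshold value and baseline statistics are placeholders, not the calibrated values from the study, and the 15 km spatial matching rule is not reproduced here:

```python
import numpy as np

def flag_flood_days(inundation, clim_mean, clim_std, z_thresh=2.0):
    """Flag days whose inundation anomaly exceeds a threshold.

    inundation: daily CYGNSS-derived inundation series for one location;
    clim_mean, clim_std: non-flood baseline statistics for that location.
    z_thresh = 2 is an assumed stand-in for the calibrated threshold.
    """
    z = (np.asarray(inundation) - clim_mean) / clim_std  # standardized anomaly
    return z > z_thresh                                  # boolean flood flags
```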

The proposed methodology is transferable and can be applied to establish flood-inundation thresholds for any region within the global tropics, enabling automated detection of previously unreported flood events or the study of relationships between extreme precipitation and flood evolution. An example of its application is the automatic detection of flooding from CYGNSS data associated with subseasonal variability in tropical circulation: the passage of multiple convectively coupled Kelvin waves embedded within an active Madden–Julian Oscillation in July 2021. These waves propagated eastward across the Maritime Continent, triggering extreme rainfall and widespread flooding in equatorial Indonesia and East Malaysia. The day-to-day evolution of floods could be observed alongside the propagating waves, with the termination of the MJO coinciding with the cessation of the flood events.

Relying on low-cost small satellites, this approach shows strong potential for future scalability with larger constellations, ultimately improving flood monitoring and advancing our understanding of how rainfall patterns shape flood dynamics across the global tropics.

How to cite: Bałdysz, Z., Baranowski, D. B., Flatau, P. J., Flatau, M. K., and Chew, C.: Automated Detection of Flood Events from CYGNSS: Observing Flood Evolution Along Propagating Tropical Waves , EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-18518, https://doi.org/10.5194/egusphere-egu26-18518, 2026.

Accurate long-term monitoring of surface water dynamics in the Niger River and Lake Chad basins is crucial for regional ecological security and sustainable water resource management. However, such monitoring is often hindered by insufficient continuous high-frequency observations—necessary to capture rapid shifts between permanent and seasonal water bodies in semi-arid transition zones—as well as by persistent cloud cover. To address these limitations, we developed a spatio-temporal data fusion framework designed to delineate detailed evolutionary patterns and regime shifts in surface water. Our methodology integrates Sentinel-1 SAR, Sentinel-2 optical imagery, and digital elevation model (DEM) data, adopting a “zoning modeling” strategy to reduce sensor-specific biases and environmental noise, thereby producing annual and seasonal surface water distribution maps. Furthermore, we developed a pixel-level, climate-coupled model based on inundation frequency to quantify changes in the extent, timing, and type of water bodies across a multi-year time series. Integration of these outputs elucidated the spatial heterogeneity of water resources throughout the study region from 2015 to 2024. Validation using randomly distributed reference samples demonstrated strong consistency, with overall accuracy exceeding 90%, confirming the robustness of our framework. Through an ecology-oriented classification scheme, we identified permanent water bodies—largely concentrated in the southern reaches of the Niger River main channel and the central zone of Lake Chad—as serving a “core support” function within the ecosystem. In contrast, seasonal water bodies followed a “dense in the south, sparse in the north” spatial pattern and acted as critical “ecological buffers” for arid northern areas. Notably, seasonal water extent expanded significantly during high-rainfall years such as 2018 and 2022, underscoring its pronounced sensitivity to climatic variability. Compared with current state-of-the-art approaches, the proposed framework enables characterization of high-frequency surface water dynamics and associated ecological interactions as continuous spatio-temporal fields, thereby providing a reliable and scalable tool to inform sustainable watershed management strategies across Africa.

How to cite: Du, L., You, S., Ye, F., and He, Y.: Tracking Dynamic Regimes and Ecological Functions of Surface Water in the Niger-Lake Chad Basins through Multi-Source Fusion (2015–2024), EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-19055, https://doi.org/10.5194/egusphere-egu26-19055, 2026.

EGU26-19963 | ECS | Orals | HS6.5

Development of routine flood mapping using SAR satellite observation for long-term monitoring system in the flood-prone regions, Cambodia 

Chhenglang Heng, Vannak Ann, Thibault Catry, Vincent Herbreteau, Cyprien Alexandre, and Renaud Hostache

Monitoring inland surface water in near-real time is a key challenge in cloud-prone tropical regions. Recently, Synthetic Aperture Radar (SAR) products have been widely used to detect surface water. Our area of interest, the Tonle Sap Lake region, is a complex environment where very large areas and floodplains are partially or fully submerged seasonally. As the population living around the lake strongly relies on the seasonal flooding dynamics for its socio-economic activities and can at the same time be at risk from extreme flooding events, it is of primary importance to develop tools for monitoring flooded areas. In this context, we are adopting and evaluating an algorithm that relies on parametric thresholding and region-growing approaches applied to time series of Sentinel-1 (S1) SAR backscatter images (VV and VH). To evaluate the produced water extent maps based on VV and VH polarizations, we performed a cross-evaluation using multi-sensor products: high-resolution optical data such as Sentinel-2 (S2) and the coarser-resolution Sakamoto flood extent product derived from MODIS. The comparison uses the Critical Success Index (CSI) and the Kappa coefficient as performance metrics. During the dry season, the VV polarization demonstrated very good performance using S2-derived maps as a reference, with a CSI of 0.84 and a Kappa coefficient of 0.91, indicating highly accurate surface water detection. Performance was similar using the Sakamoto product as a reference (CSI = 0.87). However, performance dropped during the rainy season, with the CSI of the VV polarization decreasing to 0.76 against S2, reflecting challenges in detecting water in the extensive flooded vegetation areas. VH polarization consistently overestimated water extent by misclassifying wet vegetation and rice fields. Merging the VV and VH products yielded intermediate performance, improving water detection in vegetated areas compared to VV alone. This comprehensive, multi-sensor and multi-season assessment clarifies the specific strengths of each S1 polarization, showing the superiority of VV for open-water mapping, especially in the dry season. It underscores the importance of selecting the appropriate product (VV for open water, merged for total inundation) and considering seasonal context for operational monitoring, thereby demonstrating the algorithm's robustness while also defining its operational limitations.
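For reference, a minimal Python sketch of the two agreement metrics used above (CSI and Cohen's kappa), computed between a predicted water mask and a reference mask, is given below with toy inputs.

import numpy as np

def csi(pred, ref):
    """Critical Success Index = hits / (hits + misses + false alarms)."""
    hits = np.sum(pred & ref)
    misses = np.sum(~pred & ref)
    false_alarms = np.sum(pred & ~ref)
    return hits / (hits + misses + false_alarms)

def cohen_kappa(pred, ref):
    """Cohen's kappa for two binary maps."""
    n = pred.size
    po = np.sum(pred == ref) / n                      # observed agreement
    p_yes = (np.sum(pred) / n) * (np.sum(ref) / n)    # chance 'water' agreement
    p_no = (np.sum(~pred) / n) * (np.sum(~ref) / n)   # chance 'dry' agreement
    pe = p_yes + p_no
    return (po - pe) / (1 - pe)

rng = np.random.default_rng(0)
ref = rng.random((100, 100)) < 0.3          # toy reference water mask
pred = ref.copy()
flip = rng.random(ref.shape) < 0.05         # 5% disagreement
pred[flip] = ~pred[flip]
print(f"CSI = {csi(pred, ref):.2f}, kappa = {cohen_kappa(pred, ref):.2f}")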

How to cite: Heng, C., Ann, V., Catry, T., Herbreteau, V., Alexandre, C., and Hostache, R.: Development of routine flood mapping using SAR satellite observation for long-term monitoring system in the flood-prone regions, Cambodia, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-19963, https://doi.org/10.5194/egusphere-egu26-19963, 2026.

This research developed a framework for assessing marine, nearshore, and transitional waters across Ireland and validated its generalization across geospatial scales using remote sensing (RS) products. To the best of the authors' knowledge, most existing studies have only demonstrated the retrieval of individual water quality (WQ) indicators, such as turbidity, salinity, or chlorophyll a, without in-depth validation. The authors recently reviewed studies applying RS to WQ assessment using computational intelligence techniques (CIT) such as machine learning, artificial intelligence, and statistical approaches. The review revealed that much of this research is questionable in terms of data transparency and validation against independent datasets or other geospatial domains. The aim of this research was therefore to develop a novel framework and validate it with independent datasets, including adaptation to and validation in new domains. The framework is built on Sentinel-3 (S3) OLCI RS reflectance data: level-3 (L3) and level-4 (L4) reflectance bands Rhow_1 to Rhow_11 from the Copernicus Marine Service (CMS) repository for 2016 to 2024. To obtain the overall WQ, 49 in-situ EPA Ireland monitoring sites across various transitional and coastal waterbodies were used to compute overall WQ scores (IEWQI scores) with the recently developed and widely validated IEWQI model. The RS data were then prepared and matched up with the 49 monitoring sites. For predicting IEWQI scores, the research applied a multi-scale signal processing framework (MSSPF) with the following configurations: data augmentation from 2x to 20x, noise levels from 0.0001 to 0.05, and train-test-validation split ratios of 60-20-20 and 70-20-10, training 43 CIT models on L3 and L4 RS data from 2016 to 2023, with the 2024 dataset held out to test the generalization of model predictions. Using four identical model performance evaluation metrics, the results reveal that PyTorchMLP was the most effective of the 43 CIT models (train: R2 = 0.86, RMSE = 0.09, MSE = 0.008, MAE = 0.067; test: R2 = 0.84, RMSE = 0.094, MSE = 0.008, MAE = 0.071; validation: R2 = 0.81, RMSE = 0.095, MSE = 0.009, MAE = 0.074, at 7x augmentation with a noise level of 0.0001 and the 60-20-20 split) in predicting and validating the independent dataset (2024 independent validation: R2 = 0.62, RMSE = 0.164, MSE = 0.026, MAE = 0.12). Based on the predicted IEWQI scores, the WQ of Irish waterbodies ranked in the “marginal”, “fair”, and “good” categories. These findings align with traditional EPA Ireland monitoring approaches. Overall, the results suggest that the proposed framework could be effective for general-purpose WQ monitoring using RS data at any geospatial resolution.
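For reference, a minimal Python sketch of the four evaluation metrics reported above (R2, RMSE, MSE, MAE), applied to predicted versus observed IEWQI scores, is given below; the arrays are toy placeholders, not the study's data.

import numpy as np
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score

y_true = np.array([0.62, 0.71, 0.55, 0.80, 0.47])  # observed IEWQI (toy)
y_pred = np.array([0.60, 0.75, 0.50, 0.78, 0.52])  # model output (toy)

mse = mean_squared_error(y_true, y_pred)
print(f"R2   = {r2_score(y_true, y_pred):.3f}")
print(f"MSE  = {mse:.4f}")
print(f"RMSE = {np.sqrt(mse):.4f}")
print(f"MAE  = {mean_absolute_error(y_true, y_pred):.4f}")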

Keywords: remote sensing; Copernicus database; MSSPF; IEWQI; Ireland.

How to cite: Uddin, M. G., Diganta, M. T. M., Sajib, A. M., Rahman, A., and Indiana, O.: A comprehensive framework for assessing marine, nearshore and transitional waters quality integrating Irish Water quality Index (IEWQI) model from remote sensing products using computational intelligence techniques, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-20016, https://doi.org/10.5194/egusphere-egu26-20016, 2026.

EGU26-20097 | ECS | Orals | HS6.5

Comprehensive validation of the benefits of multi-sensor flood monitoring 

Chloe Campo, Paolo Tamagnone, Guy Schumann, Trinh Duc Tran, Suelynn Choy, and Yuriy Kuleshov

Multi-sensor methodologies are gaining traction within flood monitoring research, grounded in the rationale that data fusion from diverse sources mitigates uncertainty and improves spatiotemporal coverage. However, these assumed benefits are rarely quantified.

This work comprehensively compares the performance of multi-sensor and single-sensor approaches to understand to what extent increasing the number and variety of data sources may improve the detection rate and temporal characterisation of flood events. A multi-sensor flood monitoring approach using AMSR2 and VIIRS data is assessed against each sensor individually and against standard benchmarks in EO-based flood detection (e.g., MODIS and Sentinel-1) for major flood events in the Savannakhet Province of Laos.

The comparative analysis evaluates multiple metrics. First, detection comparison classifies events as captured by each considered approach, multi-sensor only, each individual sensor only, or missed by all, to directly quantify the improvement attributable to multi-sensor integration. The spatial agreement is assessed between the multi-sensor and single sensor approaches for jointly detected flood events. Additionally, the temporal component is characterized by an examination of the observation frequency, maximum observation gaps, and peak capture timing. Lastly, the various detection outcomes are related to event characteristics, including cloud cover persistence, flood magnitude, duration, and flood type, quantifying the conditions under which a multi-sensor approach performs optimally.
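As an illustration of the detection-comparison step, the following minimal Python sketch classifies each event by which approach captured it, so the gain attributable to multi-sensor integration can be tallied; the event records are hypothetical placeholders.

from collections import Counter

def classify_event(multi: bool, amsr2: bool, viirs: bool) -> str:
    """Label one event by which approach(es) captured it."""
    if multi and not (amsr2 or viirs):
        return "multi-sensor only"
    if amsr2 and not (multi or viirs):
        return "AMSR2 only"
    if viirs and not (multi or amsr2):
        return "VIIRS only"
    if not (multi or amsr2 or viirs):
        return "missed by all"
    return "captured by multiple approaches"

# Toy event list: (multi-sensor hit, AMSR2 hit, VIIRS hit)
events = [(True, True, False), (True, False, False),
          (False, False, False), (True, True, True)]
print(Counter(classify_event(*e) for e in events))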

How to cite: Campo, C., Tamagnone, P., Schumann, G., Duc Tran, T., Choy, S., and Kuleshov, Y.: Comprehensive validation of the benefits of multi-sensor flood monitoring, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-20097, https://doi.org/10.5194/egusphere-egu26-20097, 2026.

Integrated Monitoring of Lake Garda with Radar, Optical Sensors and In Situ Instruments: Insights from the SARLAKES Project

Virginia Zamparelli1, Simona Verde1, Andrea Petrossi1, Gianfranco Fornaro1, Marina Amadori2,3, Mariano Bresciani2, Giacomo De Carolis2, Francesca De Santi4, Matteo De Vincenzi3, Giulio Dolcetti3, Ali Farrokhi3, Raffaella Frank2, Nicola Ghirardi2,5, Claudia Giardino2, Fulvio Gentilin6, Alessandro Oggioni2, Marco Papetti6, Gianluca Pari7, Andrea Pellegrino2, Sebastiano Piccolroaz3, Tazio Strozzi8, Marco Toffolon3, Maria Virelli7, Nestor Yague-Martinez9, and Giulia Valerio6

 

1Institute for Electromagnetic Sensing of the Environment (IREA), National Research Council, Naples, Italy

2Institute for Electromagnetic Sensing of the Environment (IREA), National Research Council, Milan, Italy

3Department of Civil, Environmental and Mechanical Engineering (DICAM), University of Trento, Trento, Italy

4Institute for Applied Mathematics and Information Technologies (IMATI), National Research Council, Milan, Italy

5Institute for BioEconomy (IBE), National Research Council, Sesto Fiorentino, Italy

6Department of Civil, Environmental, Architectural Engineering and Mathematics (DICATAM), University of Brescia, Brescia, Italy

7Italian Space Agency (ASI), Rome, Italy

8GAMMA Remote Sensing, Gümligen, Switzerland

9Capella Space Corp., San Francisco, CA, USA

 

SARLAKES (SpatiAlly Resolved veLocity and wAves from SAR images in laKES) is a PRIN (Projects of National Interest) project funded in 2022 by the Italian Ministry of University and Research. Now in its final phase, the project is scheduled to end at the beginning of 2026. It has developed a novel, advanced, and adaptable tool capable of accurately measuring water dynamics in medium- and large-sized lakes.

A key and innovative aspect of the project is the use of spaceborne Synthetic Aperture Radar (SAR) data, which are widely exploited for routine observation of the marine environments but remain relatively underutilized for lake monitoring. SARLAKES investigated the capability of SAR imagery to retrieve the spatial distribution of wind fields, surface currents, and wind-generated waves in lacustrine environments.

The project considers Lake Garda and Lake Geneva as case studies, with Lake Garda—the largest lake in Italy—selected as the primary test site due to the research group’s long-standing experience and the availability of extensive historical data.

This contribution presents the main results obtained over two years of project activity, with particular emphasis on outcomes from a multidisciplinary field campaign conducted in April 2025. The campaign aimed to reconstruct lake surface currents during a strong wind event in the peri-Alpine Lake Garda region.

The field instrumentation included a wave buoy, an acoustic Doppler current profiler (ADCP), Lagrangian drifters, anemometers, a ground-based radar, fixed cameras, a drone, and a conductivity–temperature–depth profiler. Satellite acquisitions from the COSMO-SkyMed Second Generation and Capella Space SAR sensors, as well as from the PRISMA optical sensor, were scheduled over the study area during the campaign. Archive data from Sentinel-1, Sentinel-2, Sentinel-3, Landsat, and COSMO-SkyMed missions were also utilized.

The project demonstrates how the integration of in-situ instrumentation, spatially distributed flow measurements from remote sensing, and hydrodynamic modeling provides a comprehensive and scalable approach to next-generation monitoring of complex lake systems.

How to cite: Zamparelli, V. and the SARLAKES project team: Integrated Monitoring of Lake Garda with Radar, Optical Sensors and In Situ Instruments: Insights from the SARLAKES Project, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-21000, https://doi.org/10.5194/egusphere-egu26-21000, 2026.

Semi-urban vegetation systems play a critical role in ecosystem stability but are increasingly exposed to flood hazards due to climate variability and rapid land-use change. Accurate flood detection in such systems remains challenging because radar backscatter is influenced by complex and mixed scattering mechanisms arising from vegetation, built-up structures, and surface water. Conventional intensity-based flood indices struggle to separate flooded vegetation from non-flooded rough surfaces and tend to miss inundated areas under mixed land-cover conditions. To address these limitations, this study presents a physically interpretable flood detection framework that integrates Synthetic Aperture Radar polarimetric descriptors with a machine learning classifier. The proposed approach utilizes dual-polarized Sentinel-1 SAR data to derive polarimetric features from Stokes parameters and the covariance matrix. Specifically, the Degree of Polarization and Linear Polarization Ratio are combined with eigenvalue-based information to capture changes in both amplitude and polarization state between pre-flood and during-flood conditions. These descriptors are integrated into a novel Flood Index (FI) designed to distinguish flooded urban areas dominated by double-bounce scattering from flooded vegetation characterized by depolarized volume scattering. Unlike commonly used indices such as the Normalized Difference Flood Index (NDFI) or the VH/VV ratio, the proposed FI exploits polarization behaviour rather than relying solely on backscatter intensity. A Random Forest classifier is trained on the proposed FI using a tile-based sampling strategy to handle class imbalance between flooded and non-flooded pixels. The framework is evaluated across three flood events representing diverse geographic and land-cover conditions: the 2019 Typhoon Hagibis flood in Japan, the 2023 Yamuna River flood in India, and the 2023 Larissa flood in Greece. Model performance is assessed using multiple accuracy metrics, including F1 score, Intersection over Union (IoU), False Positive Rate (FPR), and False Negative Rate (FNR). Results demonstrate that the Random Forest model trained on the proposed Flood Index consistently outperforms threshold-based Otsu methods and NDFI across all study areas. The approach achieves F1 scores ranging from 0.81 to 0.86 and IoU values between 0.70 and 0.76, while maintaining a relatively low False Negative Rate (0.09-0.17), which is critical for minimizing missed flooded areas in disaster response applications. Sensitivity and ablation analyses further confirm the robustness of the Flood Index to speckle noise and highlight the complementary contribution of its individual components. Overall, the proposed framework offers a transferable and computationally efficient solution for flood mapping in semi-urban vegetation systems using widely available dual-polarized SAR data. The results highlight its potential for scalable flood monitoring and rapid damage assessment across regions with heterogeneous land-cover conditions.
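For reference, the following minimal Python sketch computes two of the polarimetric descriptors named above (the Degree of Polarization and a VH/VV Linear Polarization Ratio) from dual-polarized complex SAR samples via the Stokes vector; this is the textbook formulation with local multilook averaging and is not claimed to reproduce the authors' exact Flood Index.

import numpy as np
from scipy.ndimage import uniform_filter

def stokes_descriptors(vv, vh, win=5):
    """Return Degree of Polarization and Linear Polarization Ratio maps."""
    def avg(x):
        return uniform_filter(x, size=win)  # multilook spatial averaging
    s0 = avg(np.abs(vv) ** 2 + np.abs(vh) ** 2)
    s1 = avg(np.abs(vv) ** 2 - np.abs(vh) ** 2)
    s2 = avg(2 * np.real(vv * np.conj(vh)))
    s3 = avg(-2 * np.imag(vv * np.conj(vh)))
    dop = np.sqrt(s1**2 + s2**2 + s3**2) / s0           # degree of polarization
    lpr = avg(np.abs(vh) ** 2) / avg(np.abs(vv) ** 2)   # VH/VV power ratio
    return dop, lpr

rng = np.random.default_rng(1)
shape = (64, 64)
vv = rng.normal(size=shape) + 1j * rng.normal(size=shape)        # toy scene
vh = 0.3 * (rng.normal(size=shape) + 1j * rng.normal(size=shape))
dop, lpr = stokes_descriptors(vv, vh)
print(dop.mean(), lpr.mean())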

How to cite: Adhikari, R. and Bhardwaj, A.: SAR polarimetry-based machine learning method for flood detection in semi-urban vegetation systems, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-21063, https://doi.org/10.5194/egusphere-egu26-21063, 2026.

EGU26-21507 | ECS | Posters on site | HS6.5

Flood Susceptibility Mapping with GFI 2.0 and Artificial Intelligence Models 

Jorge Saavedra Navarro, Ruodan Zhuang, Caterina Samela, and Salvatore Manfreda

Floods are among the most damaging natural hazards, motivating the development of rapid and scalable tools for floodplain mapping across multiple return periods and for post-event assessment. The Geomorphic Flood Index (GFI) is widely used to identify flood-prone areas using topographic information, but it can exhibit reduced reliability under complex hydraulic conditions—particularly near confluences where backwater controls water levels—and it may systematically overestimate inundation extents when used as a binary classifier.

This study advances the GFI framework by explicitly accounting for backwater effects at river confluences and along tributary junctions. In parallel, to reduce the intrinsic overestimation of GFI-derived floodplains, we test a suite of Artificial Intelligence (AI) classifiers—Random Forest, XGBoost, and Neural Networks—trained through a multi-parametric formulation that combines GFI with auxiliary predictors, including precipitation, lithology, land use, and slope. The approach is evaluated across multiple Italian catchments, using satellite-derived inundation and hydrodynamic simulations as independent benchmarks. Model performance is quantified against the baseline GFI approach, which uses standard threshold-based binary classification with an optimal cutoff.
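As an illustration of the baseline evaluation, the following minimal Python sketch binarizes a continuous GFI with an optimal cutoff; here the cutoff maximizes Youden's J statistic on the ROC curve, one common choice, although the authors' exact criterion may differ.

import numpy as np
from sklearn.metrics import roc_curve

rng = np.random.default_rng(42)
# Toy data: GFI values, higher = more flood-prone; labels from a benchmark map
gfi = np.concatenate([rng.normal(0.6, 0.2, 500),    # flooded pixels
                      rng.normal(0.2, 0.2, 1500)])  # non-flooded pixels
labels = np.concatenate([np.ones(500), np.zeros(1500)])

fpr, tpr, thresholds = roc_curve(labels, gfi)
best = np.argmax(tpr - fpr)           # Youden's J = TPR - FPR
cutoff = thresholds[best]
flood_map = gfi >= cutoff             # baseline binary floodplain
print(f"optimal cutoff = {cutoff:.3f}, "
      f"TPR = {tpr[best]:.2f}, FPR = {fpr[best]:.2f}")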

The proposed framework aims to improve post-event flood delineation under observational constraints (e.g., satellite data gaps due to cloud cover, vegetation, or imaging limitations) and to provide a computationally efficient surrogate for extending hydrodynamic information to additional return periods or large basins where full numerical modelling is impractical. Preliminary results indicate that Random Forest provides the most robust performance across study sites. Incorporating backwater effects yields clear gains at confluences, primarily by reducing omission errors and improving the representation of hydraulically controlled inundation patterns. Moreover, the AI-based correction substantially mitigates the overestimation typically associated with standard GFI mapping, resulting in floodplain delineations that are more consistent with complex hydrodynamic processes and suitable for scalable flood hazard applications.

How to cite: Saavedra Navarro, J., Zhuang, R., Samela, C., and Manfreda, S.: Flood Susceptibility Mapping with GFI 2.0 and Artificial Intelligence Models, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-21507, https://doi.org/10.5194/egusphere-egu26-21507, 2026.

EGU26-21622 | ECS | Orals | HS6.5

Mapping and modeling coastal flood dynamics using remote sensing and hydrodynamic models 

Giovanni Fasciglione, Guido Benassai, Gaia Mattei, and Pietro Patrizio Ciro Aucelli

This study presents an integrated and multidisciplinary methodology for investigating coastal flooding and morphodynamic processes in low-lying coastal environments, with a comparative application to two geomorphologically distinct Mediterranean coastal plains: the Volturno Plain and the Fondi Plain. The methodological framework combines high-resolution topographic and bathymetric datasets, aerial remote sensing, sedimentological analyses, statistical wave climate assessment, numerical hydrodynamic modelling, and relative sea-level rise scenarios that incorporate both eustatic trends and local vertical land movements. This approach enables a robust evaluation of how differing coastal configurations influence flooding susceptibility under extreme marine conditions.

For both study areas, the topographic baseline was derived from 2 m resolution LiDAR-based Digital Terrain Models, subsequently refined using site-specific datasets. In the Volturno Plain, extensive GNSS field surveys were conducted along the beach between Volturno and Regi Lagni river mouths. In the Fondi Plain, DTM refinement relied on aerial drone surveys carried out over the beach sector between the Canneto and Sant’Anastasia river mouths. Photogrammetric processing of aerial imagery allowed the generation of high-resolution surface models, which were integrated with the existing LiDAR DTM to enhance the depiction of subtle morphological features critical for flood propagation.

Sedimentological characterization was performed to constrain morphodynamic responses. Granulometric samples were collected along cross-shore transects at elevations ranging from −1.5 m to +2 m. Grain-size distribution analyses supported the calibration and interpretation of sediment transport and wave dissipation processes within numerical models.

Bathymetric modelling was based on high-precision single-beam echo-sounder surveys, with depth data corrected for tidal variations using official tide-gauge records. Emerged and submerged datasets were merged into continuous topo-bathymetric models, ensuring consistency in vertical reference systems and numerical stability.

Marine storms were identified through the analysis of offshore buoy records using a Peak Over Threshold approach. Storm events were classified into five classes using their Storm Power Index calculated by combining significant wave height and event duration. Representative events were selected as boundary conditions for coupled hydrodynamic simulations performed with Delft3D and XBeach. Simulations were run for future scenarios based on high-emission IPCC projections (SSP 5-8.5), integrating local sea-level rise, local subsidence rates, and highest tidal and surge levels.
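As an illustration of the storm-classification input, the following minimal Python sketch performs Peak Over Threshold extraction on a significant-wave-height series and computes a Storm Power Index as Hs_max^2 times duration (following Dolan and Davis, 1992); the exact combination of wave height and duration used by the authors may differ.

import numpy as np

def extract_storms(hs, threshold=2.0, dt_hours=1.0):
    """Return (Hs_max, duration_h, SPI) for each exceedance cluster."""
    above = hs > threshold
    storms = []
    start = None
    for i, flag in enumerate(np.append(above, False)):  # sentinel closes last run
        if flag and start is None:
            start = i
        elif not flag and start is not None:
            peak = hs[start:i].max()
            duration = (i - start) * dt_hours
            storms.append((peak, duration, peak**2 * duration))
            start = None
    return storms

# Toy hourly Hs record (m) with two exceedance events over a 2 m threshold
hs = np.array([1.0, 1.5, 2.5, 3.2, 2.8, 1.8, 1.2, 2.2, 2.1, 1.0])
for peak, dur, spi in extract_storms(hs):
    print(f"Hs_max={peak:.1f} m, duration={dur:.0f} h, SPI={spi:.1f} m^2 h")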

A comparative analysis of the simulation outcomes highlights marked differences between the two coastal plains. The Volturno Plain proves highly prone to inundation, with storm surges overtopping dune systems and propagating inland due to low elevations, local subsidence, and the limited effectiveness of existing coastal defenses. Conversely, the Fondi Plain exhibits significantly reduced flood penetration. The presence of a wide bar system, coupled with efficient coastal defense structures, promotes substantial dissipation of incoming wave energy. As a result, even under intense storm conditions, inundation remains confined to a narrow coastal strip immediately landward of the beach.

Overall, the comparative methodological application demonstrates how coastal morphology, sedimentological properties, and defense systems critically control flood dynamics. The proposed framework provides a transferable and decision-oriented tool for assessing coastal vulnerability and supporting adaptation strategies in heterogeneous low-lying coastal settings under climate change pressure.

How to cite: Fasciglione, G., Benassai, G., Mattei, G., and Aucelli, P. P. C.: Mapping and modeling coastal flood dynamics using remote sensing and hydrodynamic models, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-21622, https://doi.org/10.5194/egusphere-egu26-21622, 2026.

EGU26-21631 | ECS | Posters on site | HS6.5

Assessment of Multi-Mission Satellite Altimetry GDR L2 Products for River Water Surface Elevation in the Ganga Basin 

Barun Kumar, Shyam Bihari Dwivedi, and Shishir Gaur

Precise monitoring of water surface elevation (WSE) in data-deficient areas such as the Ganga River stretch is essential for hydrological modelling, flood prediction, and comprehensive water resource management. This study introduces a comprehensive evaluation framework for Level-2 Geophysical Data Records (GDR L2) derived from several satellite altimetry missions, including Sentinel-3A/B, Sentinel-6A, Jason-3, and SWOT Nadir, validated against in-situ gauge stations of the Central Water Commission (CWC) across a range of hydrological conditions. The processing chain combines several spatial-analysis steps: Gaussian-process Kriging interpolation generates continuous longitudinal WSE profiles across strategically placed virtual stations; outlier detection employs interquartile range (IQR) and Hampel filters; bias correction uses dry-season median alignment to a common orthometric datum; and Kalman-filter smoothing reduces measurement noise while preserving critical hydrological signal dynamics.
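For reference, a minimal Python sketch of the outlier-screening step (an IQR rule and a Hampel filter applied to a WSE series) is given below; the window size and thresholds are illustrative assumptions.

import numpy as np

def iqr_mask(x, k=1.5):
    """True where x lies inside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, q3 = np.percentile(x, [25, 75])
    iqr = q3 - q1
    return (x >= q1 - k * iqr) & (x <= q3 + k * iqr)

def hampel_mask(x, window=5, n_sigma=3.0):
    """True where x is within n_sigma scaled MADs of the local median."""
    keep = np.ones_like(x, dtype=bool)
    half = window // 2
    for i in range(len(x)):
        seg = x[max(0, i - half): i + half + 1]
        med = np.median(seg)
        mad = 1.4826 * np.median(np.abs(seg - med))  # ~std for Gaussian data
        if mad > 0 and abs(x[i] - med) > n_sigma * mad:
            keep[i] = False
    return keep

wse = np.array([61.2, 61.3, 61.1, 75.0, 61.4, 61.2, 61.5, 61.3])  # 75.0 = spike
clean = wse[iqr_mask(wse) & hampel_mask(wse)]
print(clean)  # spike removed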

Comprehensive performance evaluations employ co-located time series analysis, scatter plots, and flow duration curves (FDCs), with seasonal stratification distinguishing monsoon high-flow variability from stable non-monsoon baseflow conditions. The evaluation emphasizes physically meaningful performance measures, namely the Kling-Gupta Efficiency (KGE) and RMSE. Sentinel-6A is the strongest performer across all conditions, with high non-monsoon accuracy (KGE 0.894, RMSE 0.089 m) and reasonable monsoon performance (KGE 0.57, RMSE 3.08 m) despite turbulent flow conditions, whereas the processing potential of SWOT Nadir is limited by hooking artifacts. During non-monsoon periods, measurement reliability is consistently 2-4 times higher. This validated multi-mission framework demonstrates that satellite altimetry is an operationally viable method for WSE retrieval in major braided rivers, enabling accurate rating-curve generation and discharge computation. In the future, machine learning data fusion and hydrodynamic modelling can be incorporated to increase basin-scale forecasting capabilities.
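For reference, the Kling-Gupta Efficiency used above can be computed in its standard 2009 formulation, KGE = 1 - sqrt((r - 1)^2 + (alpha - 1)^2 + (beta - 1)^2), with r the correlation, alpha the ratio of standard deviations, and beta the ratio of means between simulated (altimetric) and observed (gauge) series; a minimal Python sketch with toy series follows.

import numpy as np

def kge(sim, obs):
    r = np.corrcoef(sim, obs)[0, 1]
    alpha = np.std(sim) / np.std(obs)
    beta = np.mean(sim) / np.mean(obs)
    return 1.0 - np.sqrt((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2)

obs = np.array([60.1, 60.4, 61.0, 62.3, 63.8, 62.0, 60.9])     # gauge WSE (toy)
sim = obs + np.random.default_rng(7).normal(0, 0.1, obs.size)  # altimetry (toy)
print(f"KGE = {kge(sim, obs):.3f}")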

How to cite: Kumar, B., Dwivedi, S. B., and Gaur, S.: Assessment of Multi-Mission Satellite Altimetry GDR L2 Products for River Water Surface Elevation in the Ganga Basin, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-21631, https://doi.org/10.5194/egusphere-egu26-21631, 2026.

EGU26-21734 | Posters on site | HS6.5

Evaluating Copernicus Global Flood Monitoring (GFM) Service trade-offs in near-real-time flood mapping 

Shagun Garg, Ningxin He, Sivasakthy Selvakumaran, and Edoardo Borgomeo

Near-real-time satellite-based flood maps support disaster risk management and emergency response. One widely used service is the Global Flood Monitoring (GFM) product of the Copernicus Emergency Management Service, launched in 2021 and based on Sentinel-1 Synthetic Aperture Radar (SAR) data. The GFM service combines three flood-mapping algorithms: pixel-based thresholding, region-based approaches, and change-detection techniques, merged using a majority-voting scheme to generate the final flood extent product. Another key strength of the GFM service is its rapid analysis, providing flood maps within approximately five hours of satellite image acquisition through a fully automated processing chain. As the product is increasingly relied upon by practitioners and decision-makers, there is a growing need to assess its accuracy and robustness. Understanding false alarms and missed detections is critical for improving the reliability and usability of the service.
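As an illustration of the ensemble step described above, the following minimal Python sketch merges three binary flood masks by majority voting, i.e., a pixel is flooded when at least two of the three algorithms agree; the masks are toy placeholders.

import numpy as np

def majority_vote(mask_a, mask_b, mask_c):
    """Merge three boolean flood masks: flooded where >= 2 of 3 agree."""
    votes = mask_a.astype(int) + mask_b.astype(int) + mask_c.astype(int)
    return votes >= 2

rng = np.random.default_rng(3)
shape = (4, 4)
thresholding = rng.random(shape) < 0.5   # pixel-based thresholding (toy)
region_based = rng.random(shape) < 0.5   # region-based approach (toy)
change_det = rng.random(shape) < 0.5     # change detection (toy)
print(majority_vote(thresholding, region_based, change_det))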


In this study, we systematically compare GFM flood maps across twenty real-world flood events using high-resolution reference datasets. To ensure temporal consistency, the GFM-derived flood maps are generated using Sentinel-1 acquisitions from the same day as the reference observations. Spatial agreement between datasets is quantified using the Intersection-over-Union metric.


Our results suggest that the GFM service performs well for large, extensive flood events but degrades for smaller, localized ones. Many of the observed errors come not from flood detection itself, but from inaccuracies in the reference water layer: while surface water is correctly identified, misclassification of permanent or seasonal water bodies leads to false alarms and missed floods. We evaluate the three underlying flood-mapping algorithms individually for consistent patterns of missed detections or false alarms. In addition, we develop an automated framework to rapidly compare any external flood map with the GFM outputs, enabling near-instant evaluation of agreement and error patterns.


This framework provides practical insight into where and why the GFM service succeeds or fails, and supports continuous validation and iterative improvement of global flood-mapping services.

How to cite: Garg, S., He, N., Selvakumaran, S., and Borgomeo, E.: Evaluating Copernicus Global Flood Monitoring (GFM) Service trade-offs in near-real-time flood mapping, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-21734, https://doi.org/10.5194/egusphere-egu26-21734, 2026.

EGU26-22077 | Orals | HS6.5

A fully automatic processing chain for the systematic monitoring of surface water using Copernicus Sentinel 1 satellite data: first results of the SCO-CASCADES project. 

Renaud Hostache, Cyprien Alexandre, Chhenglang Heng, Thibault Catry, Vincent Herbreteau, Vannak Ann, Christophe Révillion, and Carole Delenne

Water is essential to life and to the health of ecological and social systems. Unfortunately, it is one of the natural resources most impacted by climate change, with increasingly intense hydro-meteorological extremes (floods, droughts, etc.) and growing societal demand. To help manage this vulnerable resource, it is vital to assess and monitor its availability on a regular basis, and to track its trajectory over time to better understand the impact of global change. Surface water (lakes, rivers, floodplains, etc.) represents an important component of total water resources, and monitoring it is of primary importance to better understand and manage the consequences of climate change. Surface water resources provide populations around the world with essential ecosystem services such as power generation, irrigation, drinking water for humans and livestock, and space for farming and fishing.

In this context, the SCO-CASCADES project implements end-to-end processing chains for satellite Earth observation data, including Sentinel-1 and 2 (S-1 and S-2), in order to provide surface water products (surface water body and inundation depth maps) that will be made available via an interactive platform co-constructed with identified users.

In the first phase of the project, a fully automated Sentinel-1-based processing chain has been implemented. This chain is based on automatic multiscale image histogram parameterization followed by thresholding, region growing, and chain detection applied to individual images, subsequent pairs, and time series of S1 images. The chain enables us to derive several products: i) an exclusion layer identifying areas where water cannot be detected in Sentinel-1 images (e.g., urban and forested areas); ii) permanent and seasonal water body maps; iii) a water body map for each S1 image; iv) an uncertainty map characterizing the water body classification uncertainty; and v) an occurrence map providing the number of times (over the time series) each pixel was covered by open water.
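As an illustration of product v), the following minimal Python sketch counts, for each pixel, how many times over a time series it was classified as open water, and applies an illustrative frequency split into permanent and seasonal water; the stacked masks and thresholds are toy assumptions.

import numpy as np

rng = np.random.default_rng(5)
n_images, h, w = 20, 4, 4
water_masks = rng.random((n_images, h, w)) < 0.3  # one boolean map per S1 image

occurrence = water_masks.sum(axis=0)              # counts per pixel (0..n_images)
frequency = occurrence / n_images                 # fraction of observations wet

# A simple (illustrative) regime split: permanent vs seasonal water bodies
permanent = frequency >= 0.9
seasonal = (frequency >= 0.1) & (frequency < 0.9)
print(occurrence)
print(f"permanent pixels: {permanent.sum()}, seasonal pixels: {seasonal.sum()}")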

Here, we present and evaluate the robustness of the processing chain and the resulting maps produced using multi-year S1 time series over two large-scale sites: the Mekong floodplains between Kratie, the Tonle Sap lake, and the Mekong Delta, and the Tsiribihina basin in Madagascar. Comparison between S1- and S2-derived maps shows good agreement, with CSI and Cohen's kappa scores mostly higher than 0.7 and sometimes exceeding 0.9.

How to cite: Hostache, R., Alexandre, C., Heng, C., Catry, T., Herbreteau, V., Ann, V., Révillion, C., and Delenne, C.: A fully automatic processing chain for the systematic monitoring of surface water using Copernicus Sentinel 1 satellite data: first results of the SCO-CASCADES project., EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-22077, https://doi.org/10.5194/egusphere-egu26-22077, 2026.

EGU26-211 | ECS | Orals | SM3.4

Automatic detection and classification of Nanoseismicity in Distributed Acoustic Sensing data 

Dominic Seager, Jessica Johnson, Lidong Bie, Beatriz De La Iglesia, and Ben Milner

The detection of nanoseismicity (very small earthquakes, sometimes associated with small cracks in rock and also called acoustic emissions) is an important area of research aiding the understanding of geophysical processes, hazard detection, material failure, and human-induced nanoseismicity. The high frequency and strong attenuation of nanoseismic signals require high-frequency monitoring within metres of the source to capture an event. This has made them difficult to monitor outside small-scale laboratory experiments in which failure is intentionally induced. The development of distributed acoustic sensing (DAS) as a new tool for seismic monitoring, however, has increased the feasibility of investigating such signals in the field due to its high temporal and spatial resolution. Manual picking of these events, while possible, is impractical for long-term deployments and for time-critical applications such as stability monitoring, which limits the utility of the technology. Automating the detection of nanoseismic events within such data is therefore essential for long-term processing of DAS data and for real-time processing in stability monitoring.

We have developed a pipeline for the automated extraction of nanoseismic events from DAS data, using a new, simple ratio technique called the Spatial Short-Term Average (SSTA). The pipeline takes DAS data as input and generates a series of windows containing high-amplitude signals related to nanoseismicity.
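Because the abstract does not define SSTA in detail, the following Python sketch is only one plausible reading, stated as an assumption: at each time sample, the short-term average absolute amplitude over a small block of adjacent channels is compared with the cable-wide average, and windows exceeding a ratio threshold are flagged.

import numpy as np

def ssta_detect(data, block=8, ratio_threshold=5.0):
    """data: (channels, samples) DAS strain-rate array.
    Returns (channel_block_start, time_index) pairs exceeding the ratio."""
    n_ch, n_t = data.shape
    amp = np.abs(data)
    background = amp.mean(axis=0) + 1e-12          # cable-wide average per sample
    detections = []
    for c0 in range(0, n_ch - block + 1, block):
        local = amp[c0:c0 + block].mean(axis=0)    # spatial short-term average
        hits = np.flatnonzero(local / background > ratio_threshold)
        detections.extend((c0, int(t)) for t in hits)
    return detections

rng = np.random.default_rng(11)
das = rng.normal(0, 1.0, (64, 2000))
das[16:24, 1000:1010] += 40.0                      # synthetic nanoseismic burst
print(ssta_detect(das)[:5])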

Using the automatically detected events, we labelled the windows to train a series of machine learning models to classify the different signals. Once trained, we evaluated the performance of the various models to select the most effective method for processing the collected data. The best performing models will then be tested at scale with the resulting classified dataset being plotted spatially along the length of the deployment to identify patterns of activity across space and time. 

How to cite: Seager, D., Johnson, J., Bie, L., De La Iglesia, B., and Milner, B.: Automatic detection and classification of Nanoseismicity in Distributed Acoustic Sensing data, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-211, https://doi.org/10.5194/egusphere-egu26-211, 2026.

EGU26-893 | ECS | Orals | SM3.4

Optical Interferometry-based seafloor cable Measurements for Rupture Imaging and Tsunami Signal Analysis in the Southwest Pacific 

Amin A. Naeini, Bill Fry, Giuseppe Marra, Max Tamussino, Johan Grand, Jennifer D. Eccles, Kasper van Wijk, Dean Veverka, and Ratnesh Pandit

Optical interferometry on submarine fiber-optic telecommunication cables offers a transformative opportunity for offshore geohazard monitoring by providing continuous measurements of seafloor perturbation at useful intervals over trans-oceanic distances (Marra et al., 2022). We analyze a southwest Pacific subset of data from a section of the Southern Cross NEXT cable connecting Auckland (New Zealand) to Alexandria (Australia). Using only cable-based measurements, we image the seismic rupture kinematics of the 17 December 2024 Mw 7.3 Vanuatu earthquake, the largest seismic event recorded on this cable since its installation.

 

We analyze measurements from a section of cable more than 1,000 km in length, comprising 18 inter-repeater spans, including the section that runs roughly parallel to the Vanuatu subduction zone and the adjoining section extending southward toward New Zealand. The earthquake produces clear and coherent arrivals in the optical frequency deviation recorded across multiple spans, with well-defined signatures visible in both time series and spectrograms. We first extract earthquake-related strain signals in the 0.1-0.3 Hz frequency band and apply the Multiple Signal Classification (MUSIC) back-projection technique to recover the source-time evolution of the rupture. The inferred rupture is predominantly bilateral and consistent with the USGS finite-fault solution, confirming that interferometric submarine cables can function as effective regional seismic arrays for rapid characterization of offshore earthquakes.

 

These results further demonstrate the capability of submarine fiber-optic cables to image earthquake rupture processes using high-frequency strain signals, providing valuable monitoring coverage, especially in instrumentally sparse regions such as the southwest Pacific. By resolving rupture kinematics directly, cable-based observations offer a pathway toward improved tsunami early-warning strategies that rely less on empirical magnitude–scaling relations, which are uncertain for large earthquakes. Planned upgrades of the interrogating laser will allow the performance of this approach to be assessed at lower frequencies, where cable-based observations may provide direct constraints on tsunami propagation and other long-period geophysical processes.

How to cite: A. Naeini, A., Fry, B., Marra, G., Tamussino, M., Grand, J., D. Eccles, J., van Wijk, K., Veverka, D., and Pandit, R.: Optical Interferometry-based seafloor cable Measurements for Rupture Imaging and Tsunami Signal Analysis in the Southwest Pacific, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-893, https://doi.org/10.5194/egusphere-egu26-893, 2026.

EGU26-1594 | ECS | Orals | SM3.4

Physics-based earthquake early warning using distributed acoustic sensing 

Itzhak Lior and Shahar Ben Zeev

We present a physics-based point source earthquake early warning system using distributed acoustic sensing (DAS) data. All core modules of the system are based on physical principles of wave propagation, and models that describe the earthquake source and far-field ground motion. The detection-location algorithm is based on time-domain delay-and-sum beamforming, and the magnitude estimation and ground motion prediction are performed using analytical equations based on the Brune omega squared model. We demonstrate the performance of the system in terms of magnitude estimation and ground motion prediction, and in terms of real-time computational feasibility using local 3.1 ≤ M ≤ 3.6 earthquakes. This DAS early warning system allows for fast deployment, circumventing some calibration phases that require gathering local DAS earthquake data before the system becomes operational.
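For reference, the Brune omega-squared far-field displacement spectrum underlying the magnitude and ground-motion modules can be written as Omega(f) = Omega_0 / (1 + (f/fc)^2), with low-frequency plateau Omega_0 (proportional to seismic moment) and corner frequency fc; a minimal Python sketch with illustrative parameters follows.

import numpy as np

def brune_spectrum(f, omega0, fc):
    """Far-field displacement amplitude spectrum of the Brune model."""
    return omega0 / (1.0 + (f / fc) ** 2)

f = np.logspace(-1, 2, 7)             # 0.1 to 100 Hz
omega0, fc = 1.0, 5.0                 # plateau level and corner frequency (toy)
for fi, a in zip(f, brune_spectrum(f, omega0, fc)):
    print(f"f = {fi:7.2f} Hz  |Omega| = {a:.4f}")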

How to cite: Lior, I. and Ben Zeev, S.: Physics-based earthquake early warning using distributed acoustic sensing, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-1594, https://doi.org/10.5194/egusphere-egu26-1594, 2026.

EGU26-3915 | ECS | Orals | SM3.4

Quasi-static waveform inversion from DAS observations 

Le Tang, Etienne Bertrand, Eléonore Stutzmann, Luis Fabian Bonilla Hidalgo, Shoaib Ayjaz Mohammed, Céline Gélis, Sebastien Hok, Maximilien Lehujeur, Donatienne Leparoux, Gautier Gugole, and Olivier Durand

As a vehicle approaches the fiber-optic cable, the distributed acoustic sensing (DAS) records a broadband strain rate, which corresponds to propagating seismic waves at high frequencies (>1Hz) and to quasi-static strain fields at low frequencies (<1Hz). However, characterizing the subsurface media through quasi-static deformations remains challenging. Here, we propose a new method for imaging shallow urban subsurface structures using quasi-static strain waveforms, measured with fiber-optic cables. This technique utilizes the quasi-static waveform of a single DAS channel to generate a local 1D velocity model, thereby enabling high-resolution imaging of the underground using thousands of densely packed channels. We employed the Markov Chain Monte Carlo (MCMC) inversion strategy to investigate the depth range of inversion using car-induced quasi-static waveforms. The synthetic data demonstrates that the quasi-static strain field generated by a standard small car moving over the ground enables detailed imaging of structures at depths from 0 to 10 meters. Additionally, we conducted field experiments to measure the 2D shear-wave velocity model along a highway using quasi-static strain waveforms generated by a four-wheeled small car. The velocity structure we obtained is closely aligned with that derived from the classical surface-wave phase-velocity inversion. This consistency indicates that the inversion depth range is comparable to the simulation results, which confirms the applicability of this method to real data. In the future, we anticipate using the city's extensive fiber-optic communication network to record quasi-static deformations induced by various types of vehicles, thereby enabling imaging of the urban subsurface at a citywide scale. This will provide valuable insights for the design of urban underground infrastructure and for assessing urban hazards and risks.

How to cite: Tang, L., Bertrand, E., Stutzmann, E., Bonilla Hidalgo, L. F., Mohammed, S. A., Gélis, C., Hok, S., Lehujeur, M., Leparoux, D., Gugole, G., and Durand, O.: Quasi-static waveform inversion from DAS observations, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-3915, https://doi.org/10.5194/egusphere-egu26-3915, 2026.

EGU26-4163 | Orals | SM3.4

Seismic data telemetry system and precise hypocenter location for distributed acoustic sensing observation using seafloor cable off Sanriku, Japan 

Masanao Shinohara, Shun Fukushima, Kenji Uehira, Youichi Asano, Shinichi S. Tanaka, and Hironori Otsuka

Seismic observation using Distributed Acoustic Sensing (DAS) on a seafloor cable can provide spatially high-density data over long distances in marine areas. A seafloor seismic and tsunami observation system using an optical fiber cable off Sanriku, northeastern Japan, was deployed in 1996. Short-term DAS measurements have been repeated sporadically since February 2019 using spare fibers of the Sanriku system (Shinohara et al., 2022). The total measurement length is approximately 100 km. It has been concluded that measurement with a sampling frequency of 100 Hz, a ping rate of 500 Hz, a gauge length of 100 m, and a spatial interval of 10 m is adequate for earthquake and tsunami observation. In March 2025, we started continuous DAS observation to monitor seismic activity. When the continuous observation commenced, we developed a quasi-real-time data transmission system operating over the internet. Because a DAS measurement generates a huge amount of data per unit time and internet capacity is limited, spatial decimation is adopted. In addition, the data format is converted from HDF5 to the conventional seismic data exchange format in Japan (win format). The interrogator generates an HDF5 file every 30 seconds. After file generation, the telemetry system reads the HDF5 file and decimates the data in the spatial domain. The data are then converted to the win format and sent over the internet; in other words, data transmission is delayed by slightly more than 30 seconds. Data in the win format can be processed with the various seismic analysis tools developed previously. To locate a hypocenter using DAS data, seismic phases must first be identified. To evaluate the performance of hypocenter location using DAS records, arrival times of P- and S-waves were picked manually on screen for local earthquakes, using every 100th channel of the DAS data together with data from surrounding conventional seismic stations. A location program using absolute travel times and a one-dimensional P-wave velocity structure was applied. The resulting locations were evaluated mainly in terms of location errors, which were smaller with DAS data than without it. Adding arrival data from DAS records thus appears effective in improving resolution. However, manually picking arrivals for all channels is costly because of their large number. To extend the location method, an improved automatic picking program based on an evaluation function developed for conventional seismometer network data and adapted to DAS data (Horiuchi et al., 2025) was applied to the DAS data obtained by the Sanriku system. As a result, arrival times of P, S, and converted PS waves can be identified precisely and with high resolution. We plan to locate earthquakes using all DAS channels together with surrounding conventional marine and land seismic stations.
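As an illustration of the telemetry loop described above, the following Python sketch reads each newly generated 30-second HDF5 file, decimates it spatially, and hands it to a format converter; the directory layout, dataset name, and the stand-in writer are hypothetical placeholders, since the win-format packet structure is not reproduced here.

import glob
import time
import h5py
import numpy as np

DECIMATE_CH = 10      # keep every 10th channel to fit the available bandwidth

def convert_to_win(data, outfile):
    """Placeholder for conversion to the Japanese win exchange format."""
    np.save(outfile, data)  # stand-in; a real system would write win packets

def telemetry_loop(watch_dir="/das/incoming"):
    sent = set()
    while True:
        for path in sorted(glob.glob(f"{watch_dir}/*.h5")):
            if path in sent:
                continue
            with h5py.File(path, "r") as f:
                data = f["strain_rate"][::DECIMATE_CH, :]  # spatial decimation
            convert_to_win(data, path.replace(".h5", ".npy"))
            sent.add(path)
        time.sleep(30)  # matches the 30 s file generation cadence

# telemetry_loop()  # runs indefinitely; commented out here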

How to cite: Shinohara, M., Fukushima, S., Uehira, K., Asano, Y., Tanaka, S. S., and Otsuka, H.: Seismic data telemetry system and precise hypocenter location for distributed acoustic sensing observation using seafloor cable off Sanriku, Japan, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-4163, https://doi.org/10.5194/egusphere-egu26-4163, 2026.

EGU26-4254 | Orals | SM3.4

Using a hybrid seismic and Distributed Acoustic Sensing (DAS) network to study microseismicity in high spatiotemporal resolution offshore of Kefalonia Island, Greece  

Rebecca M. Harrington, Gian Maria Bocchini, Emanuele Bozzi, Marco P. Roth, Sonja Gaviano, Giulio Pascucci, Francesco Grigoli, Ettore Biondi, and Efthimios Sokos

Combining traditional seismic networks with Distributed Acoustic Sensing (DAS) to record ground motion on telecommunications cables provides new opportunities to study small earthquakes with unprecedented spatial and temporal resolution. Here we present a detailed study of an earthquake sequence offshore northwest of Kefalonia island, Greece, that began in March 2024 and returned to background levels by November–December. The sequence was recorded by a permanent seismic network for its full duration and by DAS on a fiber-optic telecommunications cable between 1 and 15 August 2024. The two-week DAS dataset provides continuous strain measurements along ~15 km of optical fiber between northern Kefalonia and Ithaki during a period of elevated seismic activity. Combining seismic station and DAS data reveals distinct physical features of the sequence that are not observable with seismic stations alone, including details of mainshock-aftershock clustering and well-resolved source spectra at frequencies of up to ~50 Hz for M < 3 events. The signal-to-noise ratio > 3 at frequencies of up to 50 Hz observed on DAS waveforms for a representative group of events is consistent with typical earthquake stress-drop values of 1-10 MPa. It further suggests that DAS data may be used to augment detailed studies of microearthquake source parameters.

We apply semblance-based detection to DAS waveforms and manually inspect 5,734 earthquakes that occurred within ~50 km of the fiber to build an initial earthquake catalog. We then combine DAS and seismic-station data to locate 284 events with high signal-to-noise ratios and compute their local magnitudes with seismic station data to create a detailed subset of the initial catalog. We apply waveform cross-correlation to offshore DAS data for events in the detailed catalog to associate unlocated detections with template events and estimate relative magnitudes from amplitude ratios and further enhance the detailed catalog. This approach adds an additional 2,496 earthquakes (2,780 events in total) with assigned locations and magnitudes and leads to an enhanced catalog with completeness magnitude Mc = -0.5. Most earthquakes (2,718 of 2780) cluster within a ~5 km radius approximately 10 km offshore of northwestern Kefalonia and exhibit local rates exceeding 100 events per hour.
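For reference, a standard way to assign relative magnitudes from amplitude ratios is M = M_template + log10(A_detection / A_template); the abstract does not give the authors' exact formulation, so the sketch below assumes this common approach.

import numpy as np

def relative_magnitude(m_template, amp_detection, amp_template):
    """Relative magnitude from the amplitude ratio to a located template."""
    return m_template + np.log10(amp_detection / amp_template)

# Toy example: a template event of M 1.2 and a detection at half its amplitude
print(f"M_rel = {relative_magnitude(1.2, 0.5, 1.0):.2f}")  # ~0.90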

Our enhanced catalog provides a detailed spatiotemporal record of seismicity in a region with limited station coverage and demonstrates the effectiveness of integrating DAS with seismic networks for earthquake monitoring of active seismic sequences. Furthermore, it resolves details of mainshock–aftershock clustering that would otherwise likely have been erroneously classified as swarm-like with standard monitoring, highlighting how observational resolution influences the interpretation of the physics driving earthquake sequences.

How to cite: Harrington, R. M., Bocchini, G. M., Bozzi, E., Roth, M. P., Gaviano, S., Pascucci, G., Grigoli, F., Biondi, E., and Sokos, E.: Using a hybrid seismic and Distributed Acoustic Sensing (DAS) network to study microseismicity in high spatiotemporal resolution offshore of Kefalonia Island, Greece , EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-4254, https://doi.org/10.5194/egusphere-egu26-4254, 2026.

The first commercially available fibre-optic Distributed Acoustic Sensing (DAS) system, Cobolt, was released in 2004, with early uptake driven by applications in perimeter security, pipeline monitoring, and upstream oil and gas operations. Although these deployments demonstrated the disruptive potential of DAS, it is only within the past five years that the geoscience community has widely embraced the technology, exploiting its ability to deliver continuous, high-fidelity measurements with exceptional spatial and temporal resolution.

Historically, commercially available DAS systems were optimised for industrial monitoring rather than scientific metrology. As a result, key requirements of geoscience applications—such as quantitative accuracy, extreme sensitivity, extended range, and robustness in challenging environments—were not primary design drivers. This situation is now changing rapidly as geoscience applications mature and expand. This contribution reviews the principal performance characteristics that define the suitability of modern DAS systems for geoscience research and examines how recent technological developments are addressing these needs.

Five performance parameters are of particular importance. First, the transition from amplitude-based, qualitative DAS to phase-based, quantitative systems has enabled true strain-rate and strain measurements suitable for metrological applications. Second, instrument sensitivity has improved by several orders of magnitude, with contemporary systems achieving pico-strain-level detection along standard telecom fibre. Third, measurement range—ultimately limited by available backscattered photons in pulsed DAS—has been extended beyond 150 km through the adoption of spread-spectrum interrogation techniques. Fourth, spatial resolution continues to improve, with gauge lengths of ≤1 m and sampling intervals of ≤0.5 m now routinely achievable, and further reductions anticipated. Finally, dynamic range remains a critical consideration for high-amplitude signals such as earthquakes; however, reductions in gauge length provide a clear pathway to mitigating cycle-skipping limitations, supporting the future use of DAS in Earthquake Early Warning (EEW) systems.

Alongside raw performance, the ability to quantify and compare DAS system capabilities has become increasingly important. Industry-led efforts have resulted in well-defined test methodologies and performance metrics, providing a common framework for objective evaluation of DAS instruments used in scientific studies.

Practical deployment considerations are also shaping system design. Reduced size, weight, and power (SWaP) enable operation in remote and hostile environments, while improved reliability, passive cooling, and environmental sealing facilitate long-term field installations. These advances are particularly relevant to emerging marine and subsea applications, where low-power, marinised DAS systems are required for seabed deployment.

Finally, the growing complexity of DAS instrumentation places increasing emphasis on software. Automated configuration, intuitive user interfaces, and integrated edge-processing capabilities are becoming essential to ensure that non-specialist users can reliably extract high-quality scientific data.

Together, these developments signal a transition in DAS from an industrial monitoring tool to a mature geoscience instrument, with continued innovation expected to further expand its role across solid-Earth, cryospheric, and marine research over the coming decade.

How to cite: Hill, D.: DAS design features critical to geoscience applications, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-4295, https://doi.org/10.5194/egusphere-egu26-4295, 2026.

EGU26-4413 | ECS | Posters on site | SM3.4

Coherent Source Subsampling of Seismic Noise for Distributed Acoustic Sensing in the Swiss Alps 

Sanket Bajad, Daniel Bowden, Pawan Bharadwaj, Elliot James Fern, Andreas Fichtner, and Pascal Edme

Distributed Acoustic Sensing (DAS) provides dense measurements of seismic noise along fiber-optic cables and offers new opportunities for subsurface characterization. In environments where controlled sources are unavailable, conventional noise interferometry workflows for DAS construct virtual shot gathers via cross-correlation and average them over long time windows to obtain coherent surface waves for dispersion analysis and subsequent shear-wave velocity (Vs) inversion. In noise-based interferometric imaging, the distribution of noise sources controls the quality of the retrieved interstation response. In practice, seismic sources are highly anisotropic and intermittent, and so simply averaging all available time windows produces interferometric responses that are difficult to interpret and lead to unstable dispersion curves and biased Vs estimates. We present a data-driven coherent source subsampling (CSS) framework that automatically identifies and selects the time windows of seismic noise that contribute constructively to the physically interpretable interstation response.

We demonstrate the method using DAS data acquired along 30 km of pre-existing telecommunication fiber deployed by the Swiss Federal Railways (SBB) in a major alpine valley floor, recorded with a Sintela interrogator at 3 m channel spacing with 6 m gauge length. Our objective is to recover stable Rayleigh-wave dispersion curves and a shallow Vs structure in the upper 50 m. The fiber runs along the railway track in surface cable ducts, providing a realistic test bed with complex ambient noise from car traffic, factories, and quarry blasts, in addition to the train-generated signals. Subsampling strategies based on prior knowledge of the sources, such as train schedules or velocity-based filtering, can partly mitigate the averaging problem described above. However, these strategies are tedious, strongly location-dependent along the fiber, and do not guarantee that the retained windows contribute coherently to the interstation response of the segment under investigation.

Here, we use a symmetric variational autoencoder (SymVAE) to perform coherent source subsampling. Trained on virtual shot gathers from multiple time windows, the SymVAE groups windows according to the similarity of their correlation wavefields and enables the selection of those windows that consistently exhibit symmetric surface-wave contributions on both the causal and acausal sides. Averaging only these subsampled windows yields interstation responses that are substantially denoised and symmetric. We interpret these cleaner and symmetric cross-correlations as being associated with the stationary-phase contributions for the fiber segment under investigation. The same framework also identifies fiber segments that lack coherent, dispersive Rayleigh waves, indicating where robust subsurface imaging is not feasible.
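The SymVAE itself is beyond the scope of a short sketch, but the selection criterion it enables can be approximated with a simple causal/acausal symmetry score; this proxy is our illustration, not the authors' network.

```python
import numpy as np

def symmetry_score(xc):
    """Correlation coefficient between the causal half and the
    time-reversed acausal half of a cross-correlation gather.

    xc : (n_channels, n_lags) window-wise virtual shot gather
    """
    n_lag = xc.shape[1]
    causal = xc[:, n_lag // 2:]
    acausal = xc[:, :n_lag // 2][:, ::-1]
    m = min(causal.shape[1], acausal.shape[1])
    c, a = causal[:, :m].ravel(), acausal[:, :m].ravel()
    return np.corrcoef(c, a)[0, 1]

# keep only windows whose correlation wavefields are roughly symmetric,
# e.g. selected = [w for w in windows if symmetry_score(xcorr(w)) > 0.5]
```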

Applying CSS to the SBB DAS data produces stable Rayleigh-wave dispersion curves along the cable, which we invert for two-dimensional Vs profiles. Although demonstrated here on railway-generated noise, the proposed CSS framework can be extended to other uncontrolled settings, such as road-traffic-dominated areas, where source variability and non-uniformity may be even more severe.

  • Centre for Earth Sciences, Indian Institute of Science, Bangalore, India
  • Department of Earth and Planetary Sciences, ETH Zurich, 8092 Zurich, Switzerland
  • SBB CFF FFS

 

How to cite: Bajad, S., Bowden, D., Bharadwaj, P., Fern, E. J., Fichtner, A., and Edme, P.: Coherent Source Subsampling of Seismic Noise for Distributed Acoustic Sensing in the Swiss Alps, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-4413, https://doi.org/10.5194/egusphere-egu26-4413, 2026.

EGU26-4603 | ECS | Orals | SM3.4

What Controls Variability in DAS Earthquake Observations? Implications for Ground-Motion Models 

Chen-Ray Lin, Sebastian von Specht, and Fabrice Cotton

Distributed Acoustic Sensing (DAS) provides dense, meter-scale ground-motion measurements along fiber-optic cables. However, developing ground-motion models (GMMs) from DAS data is challenging because observations are controlled by DAS-specific factors such as cable coupling, orientation, and channel correlation. In this study, we present the first regional, partially non-ergodic DAS-based GMM that explicitly identifies and quantifies cable-related contributions to ground-motion variability. We analyze strain-rate data from a 400-channel DAS array at the Milun campus in Hualien City, Taiwan, compiling peak strain rates and Fourier amplitudes (0.1–10 Hz) from 77 regional earthquakes (3<M<7, 45<R<170 km). Building on classical seismometer-based GMMs, we extend the variability framework to account for (1) cable coupling influenced by installation and environment types, (2) cable orientation, and (3) channel correlation inherent to DAS measurement principles and array geometry. Channel correlation is modeled using Matérn kernels parameterized by along-fiber and spatial proximity distances. The resulting DAS-based GMM shows magnitude-distance scaling comparable to classical models, while decomposing variability into physically interpretable components. Cable coupling emerges as a dominant broadband source of within-event variability, whereas orientation effects capture repeatable, frequency-dependent earthquake source radiation patterns. Modeling channel correlation significantly reduces channel-related standard deviations, demonstrating that treating DAS channels as independent observations biases uncertainty estimates. Overall, our results show that DAS-derived ground motions require a fundamentally different variability framework than that of classical GMMs, highlighting the importance of deployment metadata and correlation modeling. This approach provides a statistical and physical foundation for next-generation seismic hazard assessments using dense fiber-optic sensing.
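For readers unfamiliar with the kernel mentioned above, a Matérn covariance over channel separation could be assembled as follows; spacing, smoothness, and correlation length are illustrative assumptions, not the study's actual parameterization.

```python
import numpy as np

def matern32(dist, rho):
    """Matérn covariance with smoothness nu = 3/2.

    dist : (n, n) matrix of along-fiber (or spatial) channel distances
    rho  : correlation length; channels much closer than rho are treated
           as strongly correlated rather than as independent observations.
    """
    d = np.sqrt(3.0) * dist / rho
    return (1.0 + d) * np.exp(-d)

# channel positions along the fiber -> covariance of within-event residuals
x = np.arange(400) * 10.0          # e.g. 400 channels at an assumed 10 m spacing
D = np.abs(x[:, None] - x[None, :])
C = matern32(D, rho=150.0)         # hypothetical 150 m correlation length
```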

How to cite: Lin, C.-R., von Specht, S., and Cotton, F.: What Controls Variability in DAS Earthquake Observations? Implications for Ground-Motion Models, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-4603, https://doi.org/10.5194/egusphere-egu26-4603, 2026.

EGU26-4625 | SM3.4

Revealing the Wavefield Features of Fin Whale Vocalizations Observed by Distributed Acoustic Sensing

Wang, Q.

Monitoring fin whale (Balaenoptera physalus) vocalizations is of significant scientific importance and practical value for marine ecology, hydroacoustics, and geophysics. Conventional monitoring approaches, such as hydrophone arrays, ocean-bottom seismometers (OBS), and satellite tagging, are limited by sparse spatial coverage, potential biological disturbance, and high costs. Distributed acoustic sensing (DAS) is an emerging technology that utilizes submarine optical cables as dense acoustic arrays, providing opportunities for large-scale, high-resolution monitoring of whale vocalizations. Here, we reveal the wavefield features of fin whale vocalizations by integrating DAS observations with numerical simulations. Three distinct features—an insensitive response segment (IRS), high-frequency component loss, and an acoustic notch—were identified in the observed wavefield. DAS response analysis via ray-acoustic modeling indicates that the length of the IRS is positively correlated with the vertical source-to-cable distance, while the gauge length is responsible for the high-frequency loss in Type-B calls. Furthermore, wavefield simulations using the spectral-element method (SEM) demonstrate that the acoustic notches represent transitions between transmission zones of waterborne multipath waves entering the seafloor, exhibiting high sensitivity to the seafloor P-wave velocity, water depth, and topography. These findings not only enhance our understanding of the DAS-observed wavefields, but also highlight the potential of utilizing DAS and acoustic notches for ocean environmental parameter estimation.

How to cite: Wang, Q.: Revealing the Wavefield Features of Fin Whale Vocalizations Observed by Distributed Acoustic Sensing, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-4625, https://doi.org/10.5194/egusphere-egu26-4625, 2026.

EGU26-4712 | SM3.4

Application of Distributed Acoustic Sensing to Detect and Identify of Vessels and Natural Events in the Northeastern Offshore Region of Taiwan

Wei, Y. J. and Chan, C. H.

This study aims to develop a system for the identification of vessels, seismic events, and volcanic activity through analysis of the spatiotemporal characteristics of wavefields recorded by distributed acoustic sensing (DAS) using a submarine fiber-optic cable. DAS provides unprecedented spatial coverage and resolution, making it highly suitable for monitoring dense wavefield variations and anthropogenic activities, whereas traditional seismometers remain indispensable for quantitative seismic analysis and low-frequency observations. In this study, we analyze continuous DAS records acquired from a submarine fiber-optic cable located in the northeastern offshore region of Taiwan near Guishan Island, an active volcano. This region experiences frequent seismic activity due to the northwestward subduction of the Philippine Sea Plate beneath the Eurasian Plate. In addition, the passage of the Kuroshio Current, a warm ocean current, brings abundant fish resources, resulting in frequent activity of fishing vessels and whale-watching boats. Event detection is first carried out using the recursive short-time-average/long-time-average (STA/LTA) method, which uses two time windows of different durations and computes the average signal amplitude within each window. When a signal arrives, the average amplitude within the short time window changes rapidly, increasing the ratio of the short-time average to the long-time average. An event is declared when this ratio exceeds a predefined threshold and is then confirmed by manual inspection. However, low signal-to-noise ratios (SNR) can significantly reduce the sensitivity of STA/LTA-based detection, leading to missed events. To overcome this problem, signal-processing adjustments were applied to enhance detection performance. To validate the detection performance, the detected ship-related events were compared with records from the Automatic Identification System (AIS), while earthquake events identified from the DAS data were compared with the earthquake catalog of the Taiwan Seismological and Geophysical Data Management System (GDMS). Subsequently, a regression analysis of catalog magnitudes against hypocentral distance and maximum DAS-recorded amplitude was applied to determine the minimum detectable earthquake magnitude. The proposed framework demonstrates the potential of DAS as a complementary tool for offshore geophysical and maritime monitoring, providing a basis for future studies on vessel tracking, seafloor topography, and earthquake monitoring.
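For reference, a recursive STA/LTA characteristic function of the kind described above can be written in a few lines; window lengths and the threshold are illustrative, and comparable functionality exists in ObsPy.

```python
import numpy as np

def recursive_sta_lta(x, nsta, nlta):
    """Recursive STA/LTA characteristic function for a single channel.

    Exponentially weighted short- and long-term averages of the squared
    amplitude; a trigger is declared where their ratio exceeds a threshold.
    The first ~nlta samples should be ignored while the averages stabilise.
    """
    csta, clta = 1.0 / nsta, 1.0 / nlta
    sta = lta = 1e-30
    ratio = np.zeros(len(x))
    for i, v in enumerate(x):
        e = v * v
        sta = csta * e + (1.0 - csta) * sta
        lta = clta * e + (1.0 - clta) * lta
        ratio[i] = sta / lta
    return ratio

# e.g. triggers = recursive_sta_lta(trace, nsta=50, nlta=1000) > 3.5
```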

How to cite: Wei, Y. J. and Chan, C. H.: Application of Distributed Acoustic Sensing to Detect and Identify of Vessels and Natural Events in the Northeastern Offshore Region of Taiwan, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-4712, https://doi.org/10.5194/egusphere-egu26-4712, 2026.

EGU26-5156 * | Orals | SM3.4 | Highlight

Englacial ice quake cascades in the Northeast Greenland Ice Stream - Observations and implications of ice stream dynamics 

Andreas Fichtner, Coen Hofstede, Brian Kennett, Anders Svensson, Julien Westhoff, Fabian Walter, Jean-Paul Ampuero, Eliza Cook, Dimitri Zigone, Daniela Jansen, and Olaf Eisen

Ice streams are major contributors to ice sheet mass loss and critical regulators of sea level change. Despite their importance, standard viscous flow simulations of ice stream deformation and evolution have limited predictive power, mostly because our understanding of the involved processes is limited. This leads, for instance, to widely varying predictions of sea level rise over the coming decades.

 

Here we report on a Distributed Acoustic Sensing experiment conducted in the borehole of the East Greenland Ice Core Project (EastGRIP) on the Northeast Greenland Ice Stream. For the first time, our observations reveal a brittle deformation mode that is incompatible with viscous flow over length scales similar to the resolution of modern ice sheet models: englacial ice quake cascades that are not recorded at the surface. A comparison with ice core analyses shows that ice quakes preferentially nucleate near volcanism-related impurities, such as thin layers of tephra or sulfate anomalies. These are likely to promote grain boundary cracking, and appear as a macroscopic form of crystal-scale wild plasticity. A conservative estimate indicates that seismic cascades are likely to produce strain rates that are comparable in amplitude to those measured geodetically, thereby bridging the well-documented gap between current ice sheet models and observations.

How to cite: Fichtner, A., Hofstede, C., Kennett, B., Svensson, A., Westhoff, J., Walter, F., Ampuero, J.-P., Cook, E., Zigone, D., Jansen, D., and Eisen, O.: Englacial ice quake cascades in the Northeast Greenland Ice Stream - Observations and implications of ice stream dynamics, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-5156, https://doi.org/10.5194/egusphere-egu26-5156, 2026.

EGU26-5259 | SM3.4

Earthquake Location using Back Projection with Distributed Acoustic Sensing with Implications for Earthquake Early Warning

Noy, G., Ben Zeev, S., and Lior, I.

We present a back-projection based earthquake location method tailored to Distributed Acoustic Sensing (DAS) arrays, using short overlapping fiber segments and a combined P–S framework to reliably locate local earthquakes. A 66 km quasi-linear telecommunication fiber in Israel was repurposed as a DAS array. We analyzed several local earthquakes with varying source–array geometries. We divided the fiber into overlapping 5.4 km segments and back-projected P- and S-wave strain-rate recordings using a local 1D velocity model over a regional grid of potential earthquake locations. Each grid point is assigned P- and S-phase semblance values and the corresponding phase-specific origin times, associated with the timing of maximum semblance. Segment-specific P- and S-phase semblance maps and the difference between P and S origin times were combined through a weighting scheme that favors segments with spatially compact high-semblance regions. The objective is to maximize both P- and S-wave semblance while minimizing the discrepancy between P- and S-derived origin times. Results for the analyzed earthquakes reveal robust constraints on both azimuth and epicentral distance from the fiber, and demonstrate the ability to mitigate DAS-related artifacts associated with broadside sensitivity and reduced coherency. We demonstrate the potential of the approach for real-time earthquake location and show its performance when only P-wave recordings are available, underscoring the method’s potential for future DAS-based earthquake early warning implementation.
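As an illustration of the semblance back-projection step, one candidate grid point could be evaluated as in the sketch below; this is a simplified editorial reading, not the authors' implementation, and the travel times `tt` are assumed to come from the 1D velocity model.

```python
import numpy as np

def grid_point_semblance(das, tt, fs, half_win=25):
    """Stack-based semblance for one candidate grid point and one segment.

    das : (n_ch, n_smp) strain-rate recordings of one fiber segment
    tt  : (n_ch,) predicted travel times (s) from the grid point to each channel
    """
    n_ch, n_smp = das.shape
    shifts = np.round(tt * fs).astype(int)
    shifts -= shifts.min()
    n_out = n_smp - shifts.max()
    # align traces on the predicted move-out before stacking
    aligned = np.stack([das[c, s:s + n_out] for c, s in enumerate(shifts)])
    num = np.square(aligned.sum(axis=0))
    den = n_ch * np.square(aligned).sum(axis=0)
    k = np.ones(2 * half_win + 1)
    sem = np.convolve(num, k, "same") / (np.convolve(den, k, "same") + 1e-20)
    # peak semblance and its timing, a proxy for the phase-specific origin time
    return sem.max(), sem.argmax() / fs
```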

How to cite: Noy, G., Ben Zeev, S., and Lior, I.: Earthquake Location using Back Projection with Distributed Acoustic Sensing with Implications for Earthquake Early Warning, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-5259, https://doi.org/10.5194/egusphere-egu26-5259, 2026.

EGU26-5274 | ECS | Orals | SM3.4

Spectral analysis of background and transient signals at Mount Etna using rectilinear fibre-optic segments 

Hugo Latorre, Sergio Diaz-Meza, Philippe Jousset, Sergi Ventosa, Arantza Ugalde, Gilda Currenti, and Rafael Bartolomé

Etna is the largest, most active and closely monitored volcano in Europe, making it a crucial study region for volcanology and geohazard assessment. In early July 2019, a 1.5 km fibre-optic cable was deployed near the summit of Mount Etna and interrogated for two months. The cable was divided into four main segments, two of which point towards different active crater areas. Temporary seismic broadband stations and infrasound sensors were also deployed along the cable. During the experiment, three distinct eruptive events were recorded. The first two events are characterised by a large number of explosions in the active crater area, together with an increase in background tremor activity. The third event is characterised by a larger increase in background tremor, but almost no explosions.

The continuous recordings are analysed in the frequency-wavenumber domain, which reveals the features of the background tremor activity and the stacked transient signals, such as explosions. During the first two eruptive events, the stack of explosive sources is characterised by a non-dispersive arrival, travelling with different apparent velocities along each segment, and a non-linear ground response up to 25 Hz. These segments can be used as an antenna to estimate an average back-azimuth for the explosions, which come from the same crater area during both eruptive events.
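For orientation, the frequency-wavenumber analysis reduces, for a rectilinear segment with uniform channel spacing, to a 2-D Fourier transform of the channel-time gather; a minimal sketch, with `dx` and `fs` assumed known:

```python
import numpy as np

def fk_spectrum(gather, dx, fs):
    """2-D FFT of a (n_ch, n_smp) gather -> (wavenumber, frequency) power.

    Linear, non-dispersive arrivals map onto straight lines f = c_app * k,
    so apparent velocities along the segment can be read off directly.
    """
    n_ch, n_smp = gather.shape
    spec = np.fft.fftshift(np.fft.fft2(gather))
    power = np.abs(spec) ** 2
    k = np.fft.fftshift(np.fft.fftfreq(n_ch, d=dx))      # cycles per metre
    f = np.fft.fftshift(np.fft.fftfreq(n_smp, d=1 / fs)) # Hz
    return k, f, power
```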

Outside of the three eruptive events, the background tremor features two slow dispersion modes, both well resolved on the raw recordings. The slowest mode is affected by gauge-length attenuation at higher frequencies, due to its short wavelength, but remains detectable up to 27 Hz, with group velocities as low as 170 m/s. These observations showcase the utility of simple, rectilinear geometries in deployments despite their known shortcomings, such as in location procedures. For known source regions, such as volcanoes, a well-oriented segment can leverage continuous activity to record the incoming wavefield and extract dispersion curves without the need to perform cross-correlations, simplifying the workflow.

How to cite: Latorre, H., Diaz-Meza, S., Jousset, P., Ventosa, S., Ugalde, A., Currenti, G., and Bartolomé, R.: Spectral analysis of background and transient signals at Mount Etna using rectilinear fibre-optic segments, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-5274, https://doi.org/10.5194/egusphere-egu26-5274, 2026.

EGU26-5880 | ECS | Posters on site | SM3.4

Enhancing High-frequency Ambient Noise for shallow subsurface imaging using urban ambient noise DAS recordings 

Leila Ehsaninezhad, Christopher Wollin, Verónica Rodríguez Tribaldos, and Charlotte Krawczyk

Distributed Acoustic Sensing (DAS) enables unused fiber-optic cables in existing telecommunication networks, known as dark fibers, to function as dense arrays of virtual seismic receivers. Seismic waves generated by human activities and recorded by dense sensor networks provide an abundant, high-frequency energy source for high-resolution, non-invasive imaging of the urban subsurface. This approach enables detailed characterization of near-surface soils, sediments, and shallow geological structures with minimal surface impact, supporting applications such as groundwater management, site response and seismic amplification analysis, seismic hazard assessment, geothermal development, and urban planning. However, extracting coherent seismic signals from complex urban noise is challenging due to uneven source distribution, uncertain fiber deployment conditions, and variable coupling between the fiber and the ground. High-frequency signals (e.g., above 4 Hz), which are needed to resolve shallow subsurface structures, are particularly difficult to recover. Two strategies can address some of these challenges: discarding poor-quality seismic noise segments, or focusing on particularly favorable noise sources. In this study, we adopt the second approach and use vibrations generated by passing vehicles, particularly trains, which are energetic sources containing valuable high-frequency information. Capturing and exploiting the seismic waves generated by these vehicles offers unique opportunities for efficient, high-resolution urban seismic imaging.

We present an enhanced ambient noise interferometry workflow designed to exploit noise sources that are particularly favorable to the fiber geometry, i.e., strong transient sources occurring at the edge of the fiber segment to be analyzed. The workflow is applied to traffic-dominated seismic noise recorded on a dark fiber deployed along a major urban road in Berlin, Germany. First, we select short seismic noise segments that contain signals from passing trains and then apply a frequency–wavenumber filter to isolate the targeted train-generated surface waves while suppressing other wavefield contributions. The filtered data are then processed using a standard interferometric approach based on cross-correlations to retrieve coherent seismic phases from ambient noise, producing virtual shot gathers. Finally, Multichannel Analysis of Surface Waves is applied to derive one-dimensional velocity models (see the dispersion-imaging sketch below). This workflow, targeted at specific transient sources, reduces computational cost while enhancing dispersion measurements, particularly at higher frequencies. By stacking the responses from tens of tracked vehicles, enhanced virtual shot gathers can be obtained and inverted to improve shallow subsurface models. This can be achieved with only a few hours of seismic noise recording, which is challenging using conventional ambient noise interferometry workflows.
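A compact sketch of the dispersion-imaging step; we assume a phase-shift style implementation for illustration, and the authors' MASW code may differ.

```python
import numpy as np

def dispersion_image(vsg, x, fs, freqs, vels):
    """Phase-shift dispersion imaging of a virtual shot gather.

    vsg   : (n_ch, n_smp) gather; x: (n_ch,) channel offsets in metres
    freqs : array of trial frequencies (Hz)
    vels  : array of trial phase velocities (m/s)
    Returns the magnitude of the phase-compensated channel stack for each
    (velocity, frequency) pair; ridges in the image trace dispersion curves.
    """
    n_smp = vsg.shape[1]
    spec = np.fft.rfft(vsg, axis=1)
    f_ax = np.fft.rfftfreq(n_smp, d=1 / fs)
    S = spec[:, np.searchsorted(f_ax, freqs)]
    S /= np.abs(S) + 1e-12                 # keep the phase only
    img = np.empty((len(vels), len(freqs)))
    for i, v in enumerate(vels):
        phase = np.exp(2j * np.pi * freqs[None, :] * x[:, None] / v)
        img[i] = np.abs((S * phase).sum(axis=0))
    return img
```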

How to cite: Ehsaninezhad, L., Wollin, C., Rodríguez Tribaldos, V., and Krawczyk, C.: Enhancing High-frequency Ambient Noise for shallow subsurface imaging using urban ambient noise DAS recordings, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-5880, https://doi.org/10.5194/egusphere-egu26-5880, 2026.

EGU26-6600 | ECS | Posters on site | SM3.4

Multi-fiber Distributed Acoustic Sensing for Urban Seismology in Athens, Greece 

Mohammed Almarzoug, Daniel Bowden, Nikolaos Melis, Pascal Edme, Adonis Bogris, Krystyna Smolinski, Angela Rigaux, Isha Lohan, Christos Simos, Iraklis Simos, Stavros Deligiannidis, and Andreas Fichtner

Distributed Acoustic Sensing (DAS) offers a promising approach for dense seismic recording in urban environments by repurposing existing telecommunication infrastructure. Athens presents an ideal setting for such an approach, as Greece is one of the most seismically active countries in Europe, and the Athens metropolitan area — home to nearly four million inhabitants — lies within a geologically complex basin whose vulnerability was demonstrated by the destructive 1999 Mw 5.9 Parnitha earthquake. Seismic hazard assessment requires accurate subsurface velocity models, but acquiring the data to build them in dense urban areas remains challenging.

We present results from a multi-fiber DAS experiment conducted in Athens, Greece, from 16 May to 30 June 2025, using four telecommunication fibers provided by the Hellenic Telecommunications Organisation (OTE). Two Sintela ONYX interrogators simultaneously interrogated the four fibers, which fan out from an OTE building with lengths of approximately 24, 38, 42, and 48 km, providing extensive azimuthal coverage of Athens. This makes the study one of the largest urban DAS campaigns ever performed.

Data were acquired in two configurations: a lower-spatial-resolution mode optimised for earthquake recording (~26 days) and a higher-resolution mode for ambient noise interferometry (~19 days). To detect seismic events, we applied bandpass filtering followed by phase-weighted stacking across channels to enhance coherent arrivals. An STA/LTA (short-time average/long-time average) trigger was then used to identify candidate events. During the acquisition period, the National Observatory of Athens (NOA) recorded 2,645 events across the broader seismic network, of which 548 were detected on at least one fiber (368, 343, 328, and 322 on fibers 1–4, respectively). Detection capability depends on distance and magnitude — we achieve near-complete detection within ~20 km, while many events of ML ≥ 2 were recorded at distances exceeding 200 km. The array also captured small local events absent from the NOA catalogue, likely corresponding to local seismicity below the detection threshold of the sparser regional network. Characterising this unobserved local seismicity is one of the objectives of ongoing work.
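As an illustration of the stacking step in this detection chain, a phase-weighted stack might look like the following sketch; the sharpness exponent, filter corners, and sampling rate are assumptions.

```python
import numpy as np
from scipy.signal import hilbert, butter, sosfiltfilt

def phase_weighted_stack(traces, nu=2):
    """Stack channels weighted by instantaneous-phase coherence.

    traces : (n_ch, n_smp); incoherent noise is down-weighted because its
    unit phasors average out, sharpening coherent arrivals before STA/LTA.
    """
    analytic = hilbert(traces, axis=1)
    phasor = analytic / (np.abs(analytic) + 1e-12)
    coherence = np.abs(phasor.mean(axis=0)) ** nu
    return traces.mean(axis=0) * coherence

# assumed pre-processing: 1-10 Hz bandpass at an assumed 200 Hz sampling rate
sos = butter(4, (1.0, 10.0), btype="bandpass", fs=200.0, output="sos")
# stacked = phase_weighted_stack(sosfiltfilt(sos, das, axis=1))
```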

For events within 50 km of the interrogator site, we pick P- and S-wave arrivals to constrain body-wave travel times. These picks are first used to locate events already in the NOA catalogue, enabling comparison with network-derived hypocentres and an assessment of the potential improvement from the dense DAS coverage, before the approach is applied to smaller events detected only by DAS. The travel-time data will also serve as input for 3D eikonal traveltime tomography to image the subsurface velocity structure beneath metropolitan Athens.

How to cite: Almarzoug, M., Bowden, D., Melis, N., Edme, P., Bogris, A., Smolinski, K., Rigaux, A., Lohan, I., Simos, C., Simos, I., Deligiannidis, S., and Fichtner, A.: Multi-fiber Distributed Acoustic Sensing for Urban Seismology in Athens, Greece, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-6600, https://doi.org/10.5194/egusphere-egu26-6600, 2026.

EGU26-6949 | ECS | Posters on site | SM3.4

SAFE - Tsunami early warning system using available seafloor fiber cables with Chirped-pulse DAS 

Javier Preciado-Garbayo, Jaime A. Ramirez, Alejandro Godino-Moya, Jorge Canudo, Diego Gella, Jose Maria Garcia, Yuqing Xie, Jean Paul Ampuero, and Miguel Gonzalez-Herraez

Traditional tsunami early warning systems (TEWS) are typically expensive, have limited real-time availability, require continuous maintenance, and involve long deployment times. The SAFE project aims to overcome these limitations by developing a new tsunami warning technology based on Distributed Acoustic Sensing (DAS), leveraging existing seafloor fiber-optic cables. This approach offers continuous 24/7 monitoring, near-zero maintenance, faster response times, and ease of installation. The project's contributions range from a novel Chirped-pulse DAS interrogator (HDAS) with improved low-frequency performance to post-processing software that derives tide height from the measured seafloor strain and automatically detects and confirms tsunami waves. All of this has been implemented behind a user-friendly interface and is undergoing final evaluation by the tsunami warning authority in the NE Atlantic (the Instituto Português do Mar e da Atmosfera, IPMA).

The validation is currently ongoing using the ALME subsea cable, which connects Almería and Melilla across the Alboran Sea. The interrogator has demonstrated the ability to detect swell waves with a maximum error of 20 cm in the deep sea and a post-processing response time of less than 90 seconds. It is expected that slower tsunami waves will yield more precise estimations of wave height.

Importantly, the technology also successfully detected the Mw 5.3 earthquake near Cabo de Gata, Spain, on July 14, 2025, at a distance of only 40 km from the epicenter, without major saturation. The extremely large dynamic range of the interrogator (approximately 10 times larger than that of a typical phase-based system) enables the system to monitor large-magnitude earthquakes without signal clipping. The SAFE system is capable of delivering critical seismic and hydrodynamic data within 5 minutes of an event, supporting early tsunami detection and rapid response.

How to cite: Preciado-Garbayo, J., A. Ramirez, J., Godino-Moya, A., Canudo, J., Gella, D., Garcia, J. M., Xie, Y., Ampuero, J. P., and Gonzalez-Herraez, M.: SAFE - Tsunami early warning system using available seafloor fiber cables with Chirped-pulse DAS, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-6949, https://doi.org/10.5194/egusphere-egu26-6949, 2026.

EGU26-7247 * | ECS | Orals | SM3.4 | Highlight

Submarine Cable Optical Response to Seismic Waves: Insights from Controlled-Environment Tests 

Max Tamussino, David M. Fairweather, Ali Masoudi, Zitong Feng, Richard Barham, Neil Parkin, David Cornelius, Gilberto Brambilla, Andrew Curtis, and Giuseppe Marra

Fibre-optic sensing technology is transforming seafloor monitoring by enabling dense, continuous measurements across vast distances using existing telecommunication infrastructure. Distributed acoustic sensing (DAS) and optical interferometry [1] have demonstrated remarkable potential for earthquake detection, ocean dynamics monitoring, and hazard early warning. However, before these technologies can be relied upon for such applications, the transfer function between environmental perturbations and the measured optical signal changes in submarine cables needs to be known.

We present, to the best of our knowledge, the first controlled-environment characterisation of submarine cable responses to active seismic and acoustic sources, comparing DAS and optical interferometry measurements with ground-truth data from 58 geophones, 20 three-component seismometers, and microphones [2]. Our results reveal three key findings:

  • In contrast with proposed theoretical models [3], our interferometric measurements show first-order sensitivity to broadside seismic sources, enabling localisation of arrivals along straight fibre links.
  • We identify a previously unreported fast-wave phenomenon, attributed to seismic energy coupling into the cable's metal armour and propagating at velocities exceeding 3.5 km/s, significantly altering recorded waveforms.
  • We compared measurements between adjacent fibres within the same cable. The results show significant discrepancies between the measured waveforms, which should be considered in applications operating in a frequency range similar to that of our tests.

These findings show the complexity of submarine cable mechanics and their impact on optical sensing performance. Understanding these processes is critical for calibrating transfer functions and improving the reliability of fibre-based geophysical observations.  In addition to these findings, we also discuss the limitations of our methodology, which primarily arise from the limited range of seismic source frequencies available. Our work presents a first step towards understanding the complex transfer function of environmental perturbations to optical signals in subsea cables, advancing the vision of large-scale, cost-effective Earth observation systems.

[1] Marra, G. et al. Optical interferometry–based array of seafloor environmental sensors using a transoceanic submarine cable. Science 376 (6595), 874–879 (2022)

[2] Fairweather, D.M., Tamussino, M., Masoudi, A. et al. Characterisation of the optical response to seismic waves of submarine telecommunications cables with distributed and integrated fibre-optic sensing. Sci Rep 14, 31843 (2024)

[3] Fichtner, A., Bogris, A., Nikas, T. et al. Theory of phase transmission fibre-optic deformation sensing. Geophysical Journal International, 231(2), 1031–1039, (2022)

 

How to cite: Tamussino, M., Fairweather, D. M., Masoudi, A., Feng, Z., Barham, R., Parkin, N., Cornelius, D., Brambilla, G., Curtis, A., and Marra, G.: Submarine Cable Optical Response to Seismic Waves: Insights from Controlled-Environment Tests, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-7247, https://doi.org/10.5194/egusphere-egu26-7247, 2026.

EGU26-7298 | ECS | Orals | SM3.4

Coastal Ambient Noise and Microseismic Monitoring with Distributed Acoustic Sensing: a Case Study from Norfolk, UK 

Harry Whitelam, Lidong Bie, Jessica Johnson, Andres Payo Garcia, and Jonathan Chambers

Seismic ambient noise is a ubiquitous and constant resource, ideal for non-invasive investigations of the solid Earth. Coastlines around the world are experiencing increased erosion due to sea level rise and more energetic storms, and monitoring this erosion is becoming an increasingly necessary task to protect coastal settlements. Distributed Acoustic Sensing has already shown great potential in seismic monitoring and offers the advantage of dense measurements. Our project seeks to assess the efficacy of Distributed Acoustic Sensing for monitoring the subsurface changes that precede cliff failure. We present early findings from the first long-term deployment of a fibre-optic cable along the North Sea coastline in Norfolk, UK. We investigate differences in signal characteristics between conventional seismometers and Distributed Acoustic Sensing in this setting, and interpret the seismic signatures of key sources in the area. This deployment recorded for 22 months, allowing us to monitor both short-term and seasonal changes. We identify the frequency ranges excited by storm events (0.2-1 Hz), the dominance of short-period secondary microseismic activity, and the influence of local sea state and weather on higher-frequency signals. We also discuss the limitations of Distributed Acoustic Sensing and the sources it cannot reliably capture compared to broadband seismometers and nodal geophones. We conclude by discussing how this noise analysis affects the use of ambient noise tomography for seismic velocity monitoring. Future research will test the efficacy of such applications, with the aim of providing better estimates of coastal recession and identifying hazardous areas at the metre scale.

How to cite: Whitelam, H., Bie, L., Johnson, J., Payo Garcia, A., and Chambers, J.: Coastal Ambient Noise and Microseismic Monitoring with Distributed Acoustic Sensing: a Case Study from Norfolk, UK, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-7298, https://doi.org/10.5194/egusphere-egu26-7298, 2026.

EGU26-7427 | ECS | Orals | SM3.4

Distributed Fiber-Optic Sensing for Strain and Temperature Monitoring in an Underground Mine to Enable Digital Twin Integration 

Michael Dieter Martin, Nils Nöther, Erik Farys, Massimo Facchini, and Jens-André Paffenholz

The aim of this study is to assess the potential of distributed fiber-optic sensors for measuring strain and temperature in order to monitor the structural integrity of underground mining drifts and chambers. The work is conducted within the framework of the project “Model coupling in the context of a virtual underground laboratory and its development process” (MOVIE). The overall aim of the MOVIE project is to support the creation of a digital twin, thereby improving safety and operational efficiency through enhanced digital planning across various mining environments. Time-dependent, spatially distributed temperature and rock-deformation data will be recorded along fiber-optic sensing cables. These measurements will serve as boundary conditions for integrated geometrical and geomechanical models of the drift and chambers. In the initial phase, a 60-meter-long drift is instrumented using fiber-optic Brillouin-based Distributed Temperature and Strain Sensing (DTSS). Based on laboratory tests and considering the specific environmental conditions of the subsurface mine, i.e., ambient temperature variations, surface roughness, dust, and humidity, the optimal adhesive bonding materials and technique for direct cable installation on the gneiss host rock were identified and successfully implemented. Following the initial monitoring setup, further experimental investigations are planned, including the monitoring of induced deformations in yielding arch support, rock bolts, and the rock in contact with a hydraulic prop. The drift geometry and the spatial location of the fiber-optic cables within the drift are documented by a 3D point cloud, captured after the fiber-optic cable installation using a high-end mobile mapping SLAM platform and geo-referenced in a project-based coordinate frame. The locations of the geo-referenced fiber-optic cables will be correlated with the acquired DTSS measurements along the fiber-optic sensing cables. Ultimately, the meshed 3D point cloud will serve as foundational input for the combined geometrical and geomechanical model, forming the basis for a virtual reality-compatible digital twin enriched with real-time sensor data.

How to cite: Martin, M. D., Nöther, N., Farys, E., Facchini, M., and Paffenholz, J.-A.: Distributed Fiber-Optic Sensing for Strain and Temperature Monitoring in an Underground Mine to Enable Digital Twin Integration, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-7427, https://doi.org/10.5194/egusphere-egu26-7427, 2026.

EGU26-7462 | Orals | SM3.4

Marine Distributed Acoustic Sensing (DAS) for Detection of Submarine CO₂ Bubble Emissions: Insights from a Shallow-Water Volcanic Setting at Panarea (Italy) 

Cinzia Bellezza, Fabio Meneghini, Andrea Travan, Luca Baradello, Michele Deponte, and Andrea Schleifer

Fibre-optic sensing technologies are rapidly transforming geophysical monitoring by enabling spatially dense, temporally continuous observations of seismic and acoustic wavefields in environments that are difficult to instrument with conventional sensors. In marine settings, Distributed Acoustic Sensing (DAS) applied to seabed fibre-optic cables offers new opportunities for low-impact monitoring of fluid and gas migration processes, which are fundamental both to volcanic–hydrothermal systems and to emerging offshore carbon capture and storage (CCS) applications.

In this study, we investigate the feasibility of marine DAS for detecting natural and artificial CO₂ bubble emissions in a shallow-water volcanic environment offshore Panarea (Aeolian Islands, Italy). Thanks to its active submarine degassing associated with a hydrothermal system, Panarea hosts the OGS NatLab Italy, part of ECCSEL-ERIC, and represents a natural laboratory and analogue site for potential subseabed CO₂ leakage scenarios. A 1.1-km-long armored fibre-optic cable was deployed on the seabed and interrogated using two different DAS systems, providing continuous passive acoustic and seismic recordings. To support signal identification and interpretation, the DAS data were complemented by controlled gas releases from scuba tanks, and by a High Resolution Seismic (boomer) survey and side-scan sonar imaging to characterize seabed morphology and shallow subsurface structures along the cable route.

The DAS recordings revealed acoustic signatures associated with both natural CO₂ bubble emissions and controlled artificial releases. Bubble-related signals were detected as localized, temporally variable acoustic responses along the fibre, demonstrating the sensitivity of DAS to gas-driven processes at the seabed. The integration of passive DAS monitoring with active seismic imaging techniques enabled a more robust interpretation of observed signals and seabed processes.

From an Earth sciences perspective, these results demonstrate that marine DAS can serve as a low-impact, spatially continuous monitoring tool for submarine volcanic and hydrothermal systems, complementing traditional geochemical sampling and visual observations and offering new insights into the temporal variability of degassing activity. Beyond natural systems, the demonstrated capability of DAS to detect bubble-related acoustic signals has direct implications for offshore CCS, where early detection of CO₂ leakage is critical for storage integrity and environmental safety.

Overall, this field-scale experiment highlights the potential of fibre-optic sensing to address key challenges in marine monitoring, and underscores the value of integrated approaches for studying fluid and gas migration processes.

Acknowledgements:

  • ECCSELLENT project (“Development of ECCSEL - R.I. ItaLian facilities: usEr access, services and loNg-Term sustainability”)
  • ITINERIS - Italian Integrated Environmental Research Infrastructures System - Next Generation EU Mission 4, Component 2 - CUP B53C22002150006 - Project IR0000032
  • Panarea NatLab Italy: https://eccsel.eu/catalogue/facility/?id=124
  • ECCSEL: https://eccsel.eu/

 

References:

  • Meneghini et al.: Detection of CO2 emissions from Panarea seabed with Distributed Acoustic Sensing (DAS): a preliminary investigation. OGS report (2025).
  • Bellezza et al.: Marine Fiber-Optic Distributed Acoustic Sensing (DAS) for Monitoring Natural CO₂ Emissions: A Case Study from Panarea (Aeolian Islands, Italy). Submitted to Applied Sciences (2026).

How to cite: Bellezza, C., Meneghini, F., Travan, A., Baradello, L., Deponte, M., and Schleifer, A.: Marine Distributed Acoustic Sensing (DAS) for Detection of Submarine CO₂ Bubble Emissions: Insights from a Shallow-Water Volcanic Setting at Panarea (Italy), EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-7462, https://doi.org/10.5194/egusphere-egu26-7462, 2026.

EGU26-7987 | ECS | Orals | SM3.4

Urban-Scale Seismic Imaging Using Ambient Noise and Dark Fiber Distributed Acoustic Sensing in Istanbul 

Laura Pinzon-Rincon, Verónica Rodríguez Tribaldos, Jordi Jordi Gómez Jodar, Patricia Martínez-Garzón, Laura Hillmann, Recai Feyiz Kartal, Tuğbay Kılıç, Marco Bohnhoff, and Charlotte Krawczyk

Urban areas are highly vulnerable to the impacts of geohazards due to their dense populations and complex infrastructure, with potentially severe consequences for human life and economic stability. Improving our knowledge of near-surface and shallow subsurface structures in urban environments is therefore essential for effective seismic hazard assessment and risk mitigation. However, conventional geophysical surveys in cities are often limited by logistical constraints, including strong anthropogenic activity, restricted access, legal limitations, and risks associated with instrument deployment. In this context, repurposing existing telecommunication optical fibers (so-called dark fibers) as dense seismic sensing arrays using Distributed Acoustic Sensing (DAS) offers a powerful alternative for urban subsurface investigations. This approach enables continuous, high-resolution seismic monitoring without the need for extensive field instrumentation.

The megacity of Istanbul (Turkey) is located in one of the most tectonically active regions worldwide and is exposed to significant seismic hazard. Since May 2024, we have been continuously recording passive seismic data using Distributed Acoustic Sensing (DAS) along an amphibious fiber-optic cable deployed in the urban district of Kartal (in eastern Istanbul) and immediately offshore. In this study, we focus on the 3 km-long urban segments of the fiber. We analyze ambient seismic noise generated by various anthropogenic sources, such as train and vehicle traffic and other urban activities, and evaluate their suitability for high-frequency, DAS-based passive seismic interferometry in a complex and heterogeneous urban setting.

We develop and adapt processing strategies for ambient-noise interferometry that address the challenges of dense urban environments and DAS array geometries, including the identification of suitable fiber sections, channels, and source-receiver configurations, as well as preprocessing schemes designed for strongly anthropogenic noise. The objective is to retrieve high-resolution, urban-scale subsurface velocity models that improve our understanding of shallow structures and material properties relevant to seismic hazard. Ultimately, this work aims to establish efficient methodologies for imaging the urban subsurface using existing infrastructure, contributing to improved geohazard assessment and supporting sustainable urban development in seismically active regions.

How to cite: Pinzon-Rincon, L., Rodríguez Tribaldos, V., Jordi Gómez Jodar, J., Martínez-Garzón, P., Hillmann, L., Feyiz Kartal, R., Kılıç, T., Bohnhoff, M., and Krawczyk, C.: Urban-Scale Seismic Imaging Using Ambient Noise and Dark Fiber Distributed Acoustic Sensing in Istanbul, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-7987, https://doi.org/10.5194/egusphere-egu26-7987, 2026.

EGU26-8212 | SM3.4

Seismic monitoring of geothermal reservoirs using Distributed Acoustic Sensing on dark fibers: the RUBADO project

Azzola, J. and Gaucher, E.

Applied to existing but underutilized fiber-optic networks (dark fibers), Distributed Acoustic Sensing (DAS) offers an attractive approach for large-scale seismic monitoring with minimal deployment effort. However, the approach introduces specific challenges, as existing infrastructures were not designed for this purpose, leading to constraints related to sensor coupling, heterogeneous installation conditions, and limited characterization of the measurement points. In the framework of the RUBADO project, we investigate the potential and limitations of DAS applied to dark fibers to provide seismic observations supporting both operational monitoring and characterization of deep geothermal reservoirs. The approach is implemented at multiple spatial scales within the Upper Rhine Graben, where several geothermal plants are currently operating, under development, or in the planning phase. In this context, research activities within the project specifically target key practical challenges related to the use of DAS on dark fibers for the seismic monitoring of geothermal reservoirs.

Currently, data are recorded along a ~20 km fiber-optic line using the KIT infrastructure, which will support the monitoring of the drilling of a 1.4 km-deep geothermal well at KIT Campus North. We present early results from local and regional seismic monitoring and associated methodological approaches for signal enhancement and seismic event detection. We also introduce a framework for subsurface characterization that leverages the frequent vehicle-generated signals observed in the DAS recordings. We then outline planned measurements at the scale of the Upper Rhine Graben, where a key feature is the simultaneous use of multiple dark-fiber lines. Given the geometry of the planned dark-fiber network, DAS observations will enable the simultaneous monitoring of several geothermal sites with favorable spatial coverage.

How to cite: Azzola, J. and Gaucher, E.: Seismic monitoring of geothermal reservoirs using Distributed Acoustic Sensing on dark fibers: the RUBADO project, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-8212, https://doi.org/10.5194/egusphere-egu26-8212, 2026.

EGU26-8268 | ECS | Posters on site | SM3.4

Seismic monitoring of alpine lake ice with distributed acoustic sensing (DAS) and nodal arrays 

Ariana David, Cédric Schmelzbach, Thomas Hudson, John Clinton, Elisabetta Nanni, Pascal Edme, and Frederik Massin

Lake-ice stability is critical for safe operations, such as touristic activities, on mid- to high-altitude Alpine lakes. Existing lake-ice monitoring approaches like ground-penetrating radar and drilling are limited in their ability to resolve spatial variability and to enable continuous monitoring, and they require direct access to the ice for in situ measurements. Seismological methods offer a complementary approach by recording the wave field generated by lake-ice flexure and fracturing. Here, we assess Distributed Acoustic Sensing (DAS) as a long-term seismic monitoring tool for Alpine lakes.

During Winter 2025, we deployed two complementary seismic sensing systems on frozen Lake Sankt Moritz in the Swiss Alps: a fibre-optic network for DAS measurements and an array of over 40 three-component conventional autonomous seismic nodes to benchmark performance. We installed more than 2 km of fibre-optic cable and connected two interrogators that recorded, over a few weeks, strain and strain-rate data in two cores within the same cable.

To characterise ice properties and icequakes, we implemented workflows for automated icequake detection and location using the waveform-coherency based QuakeMigrate framework, which does not require phase picking, alongside an approach based on semi-automatic phase identification and picking. We successfully detected and located events with both types of instrument networks. Using a baseline catalogue from the three-component node data, we evaluated the DAS performance and achieved location agreement within a few metres between different sensing systems, demonstrating that DAS can robustly capture and localise icequake activity on lake ice and is a promising tool for continuous ice-stability monitoring.

How to cite: David, A., Schmelzbach, C., Hudson, T., Clinton, J., Nanni, E., Edme, P., and Massin, F.: Seismic monitoring of alpine lake ice with distributed acoustic sensing (DAS) and nodal arrays, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-8268, https://doi.org/10.5194/egusphere-egu26-8268, 2026.

EGU26-8383 | ECS | Orals | SM3.4

Distributed acoustic sensing of very long period strain signals from strombolian explosions 

Francesco Biagioli, Eléonore Stutzmann, Pascal Bernard, Jean-Philippe Métaxian, Valérie Cayol, Giorgio Lacanna, Dario Delle Donne, Yann Capdeville, and Maurizio Ripepe

Very long period (VLP; 0.01-0.2 Hz) seismicity is observed at many volcanoes worldwide, and provides key insights into magma and fluid dynamics within volcanic structures. VLPs are typically recorded by sparse networks of seismometers, which limits the ability to resolve the resulting displacement (or deformation) at fine spatial scales. Distributed acoustic sensing (DAS) may help overcome this limitation by densely sampling the projection of the strain tensor along fibre-optic cables with high spatial and temporal resolution, enabling a more complete view of VLP-induced deformation. Here, we analyse VLP strain signals recorded by DAS at Stromboli volcano (Italy) in November 2022 along a 6-km dedicated fibre-optic cable. We designed the cable geometry to provide broad coverage of the craters and to sample the strain at multiple locations and along different directions. We focus on a dataset of approximately 200 VLP events recorded between November 13 and 14, 2022. The VLP strain signals correlate with explosive activity and show consistent features across multiple events, indicating a persistent, non-destructive source. Leveraging the distributed nature of DAS measurements, we recover the principal strain axes of VLPs and estimate both the location and the volumetric change of the source using a quasi-static deformation model. We retrieve the principal horizontal strains for each VLP by inverting strain amplitudes measured along three different fibre directions and at multiple locations along the cable, allowing us to resolve their spatial distribution. The resulting principal VLP strains exhibit radial and tangential orientations with respect to the craters, consistent with observed seismic particle motions and an axisymmetric source. We then model the VLP strain along the fibre using a point-like deformation source (Mogi). The optimal agreement between modeled and observed VLP strain averaged over the 200 events is for a point source located ~500 m beneath the active craters, with an estimated volumetric change of ~30 m³. Under the assumption of a spherical source with a radius of 87 m, the inferred volumetric change corresponds to a pressure change of ~19 kPa. These results are consistent with previous studies and highlight the capability of DAS to investigate volcano deformation at long periods.
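For orientation, the quasi-static Mogi forward model used in such inversions has a closed form; the sketch below follows one common convention (radial surface strain from a point pressure source in a homogeneous half-space, with ΔV = π a³ ΔP / μ), so the shear modulus and all numbers outside the abstract are our assumptions.

```python
import numpy as np

def mogi_radial_strain(r, depth, dV, nu=0.25):
    """Radial surface strain from a point pressure (Mogi) source.

    Differentiates u_r = (1 - nu) * dV * r / (pi * (r**2 + depth**2)**1.5)
    with respect to r, giving the strain a radially oriented fibre would see.
    """
    R2 = r**2 + depth**2
    return (1 - nu) * dV * (depth**2 - 2 * r**2) / (np.pi * R2**2.5)

# order-of-magnitude check with the values quoted in the abstract:
eps = mogi_radial_strain(r=500.0, depth=500.0, dV=30.0)   # ~1e-8 strain
# dV = pi * a**3 * dP / mu -> the quoted ~19 kPa for a = 87 m follows
# if mu is near 1.3 GPa (our back-calculation, not a stated value)
```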

How to cite: Biagioli, F., Stutzmann, E., Bernard, P., Métaxian, J.-P., Cayol, V., Lacanna, G., Delle Donne, D., Capdeville, Y., and Ripepe, M.: Distributed acoustic sensing of very long period strain signals from strombolian explosions, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-8383, https://doi.org/10.5194/egusphere-egu26-8383, 2026.

EGU26-8769 | ECS | Posters on site | SM3.4

Analyzing volcanic-like earthquakes with distributed acoustic sensing using a short segment of the Tongan seafloor telecommunications cable 

Shunsuke Nakao, Mie Ichihara, Masaru Nakano, Taaniela Kula, Rennie Vaiomounga, and Masanao Shinohara

The January 2022 eruption of the Hunga Tonga-Hunga Ha'apai (HTHH) volcano highlighted the critical challenges in monitoring remote submarine volcanic activity. Distributed Acoustic Sensing (DAS) utilizing existing seafloor telecommunications cables offers a promising solution to bridge this observational gap. We analyzed a one-week DAS dataset recorded in February 2023, approximately one year after the eruption, using a segment of a domestic telecommunication cable in Tonga.

While a previous analysis of this dataset focused on relatively large events with clear phases, our objective was to comprehensively detect small and unclear seismic signals to evaluate the post-eruption activity. We developed a new "duration-based" detection method that identifies temporally sustained energy increases in the array's median power, effectively suppressing spatially incoherent noise. This method successfully detected 770 discrete events, revealing a stable seismicity rate of approximately 110 events per day, significantly more than conventional triggering algorithms detect.
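A minimal sketch of such a duration-based detector, as we read the description above (not the authors' code): take the median power across channels, flag sustained excursions above a baseline, and keep only excursions longer than a minimum duration. Thresholds are illustrative.

```python
import numpy as np

def duration_detector(das, fs, thresh_db=6.0, min_dur=2.0):
    """Flag temporally sustained increases in the array's median power.

    The median across channels suppresses spatially incoherent noise;
    an event is kept only if the excess power lasts at least min_dur s.
    """
    power = das.astype(float) ** 2
    med = np.median(power, axis=0)              # array-median power per sample
    baseline = np.median(med)                   # long-term reference level
    above = 10 * np.log10(med / baseline + 1e-20) > thresh_db
    events, start = [], None
    for i, flag in enumerate(np.append(above, False)):  # sentinel closes runs
        if flag and start is None:
            start = i
        elif not flag and start is not None:
            if (i - start) / fs >= min_dur:
                events.append((start / fs, i / fs))     # (onset s, end s)
            start = None
    return events
```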

To distinguish the origin of these events, we estimated the apparent slowness of the signals using a robust method combining 2D normalized cross-correlation and linear fitting (RANSAC; see the sketch below). The results show that most events have positive apparent slowness values, corresponding to arrivals from the direction of the HTHH volcano, rather than the negative apparent slowness corresponding to tectonic earthquakes from the Tonga Trench. These findings indicate that the HTHH volcano or its surrounding magmatic system maintained a high level of seismic activity even one year after the large 2022 eruption. This study demonstrates the capability of DAS to monitor subtle volcanic seismicity in submarine environments where traditional sensors are absent.
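The slowness estimation can be illustrated with a bare-bones RANSAC line fit through (offset, arrival-time) picks; the tolerance and iteration count are illustrative assumptions.

```python
import numpy as np

def ransac_slowness(offsets, arrivals, n_iter=500, tol=0.05, seed=None):
    """Robust fit of t = p * x + t0 through arrival-time picks.

    offsets, arrivals : 1-D numpy arrays (m, s); the slope p is the
    apparent slowness, whose sign separates arrivals from the volcano
    side of the fiber from trench-side tectonic events.
    """
    rng = np.random.default_rng(seed)
    best, best_inliers = (0.0, 0.0), -1
    for _ in range(n_iter):
        i, j = rng.choice(len(offsets), size=2, replace=False)
        if offsets[i] == offsets[j]:
            continue
        p = (arrivals[j] - arrivals[i]) / (offsets[j] - offsets[i])
        t0 = arrivals[i] - p * offsets[i]
        inliers = np.abs(arrivals - (p * offsets + t0)) < tol
        if inliers.sum() > best_inliers:
            best, best_inliers = (p, t0), inliers.sum()
    return best    # (apparent slowness in s/m, intercept in s)
```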

How to cite: Nakao, S., Ichihara, M., Nakano, M., Kula, T., Vaiomounga, R., and Shinohara, M.: Analyzing volcanic-like earthquakes with distributed acoustic sensing using a short segment of the Tongan seafloor telecommunications cable, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-8769, https://doi.org/10.5194/egusphere-egu26-8769, 2026.

EGU26-9174 | ECS | Posters on site | SM3.4

Clustering of Large Distributed Acoustic Sensing Datasets 

Oliver Bölt, Conny Hammer, and Céline Hadziioannou

Distributed Acoustic Sensing (DAS) turns optical fibers into high resolution strain sensors by monitoring the scattering of light within the fiber. With channel distances in the order of a few meters and a typical sampling frequency of 1 kHz, DAS is capable of recording a wide range of natural and anthropogenic seismic signals. Furthermore, the optical fibers used for DAS can be several kilometers long and are suitable for long-term measurements over weeks, months or years. The datasets obtained by DAS can therefore be very large, with up to several terabytes of data per day. Due to this large amount of data, it is challenging to get a good overview of the different types of seismic signals contained in the data, since a manual inspection can become immensely time-consuming.

In this study we aim to automate this process by clustering the data to detect and classify different types of seismic signals. A two-dimensional windowed Fourier transform is used to automatically extract features from the data. In contrast to many other approaches, this allows us not only to use temporal information, but also to include the spatial dimension to further distinguish between different seismic sources and wave types.

The clustering is performed in two steps. First, a Gaussian Mixture Model (GMM) is used to cluster the feature set. Then, the final clusters are obtained by merging similar components of the GMM.
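A minimal sketch of this two-step procedure, using scikit-learn's GaussianMixture; the cosine-similarity merge stands in for the (unspecified) similarity criterion, and all thresholds are assumptions.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def cluster_and_merge(features, n_components=20, merge_thresh=0.9):
    """Over-fit a GMM, then merge components with similar mean spectra.

    features : (n_windows, n_features) descriptors from the 2-D windowed
    Fourier transform; each final cluster mean can be read back as a
    frequency(-wavenumber) filter, as described in the abstract.
    """
    gmm = GaussianMixture(n_components=n_components, covariance_type="diag")
    labels = gmm.fit_predict(features)
    means = gmm.means_ / np.linalg.norm(gmm.means_, axis=1, keepdims=True)
    sim = means @ means.T                    # cosine similarity of components
    parent = np.arange(n_components)
    for i in range(n_components):
        for j in range(i + 1, n_components):
            if sim[i, j] > merge_thresh:     # merge similar components
                parent[parent == parent[j]] = parent[i]
    return parent[labels]                    # merged cluster label per window
```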

A key advantage of this method is that each final cluster represents a specific frequency distribution and can therefore be turned into a filter. While many clustering approaches only assign a list of labels or cluster memberships to the data, our method provides the ability to directly extract the characteristic seismic signals for each cluster. This helps greatly with cluster interpretation and can also be useful for further applications like event detection or denoising.

The proposed procedure is applied to different large DAS datasets, yielding a variety of different clusters. By filtering the data for each cluster and interpreting the obtained waveforms, as well as the long-term spatiotemporal amplitude patterns, different sources like traffic or machinery can be identified.

How to cite: Bölt, O., Hammer, C., and Hadziioannou, C.: Clustering of Large Distributed Acoustic Sensing Datasets, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-9174, https://doi.org/10.5194/egusphere-egu26-9174, 2026.

EGU26-10581 | ECS | Posters on site | SM3.4

Urban Seismology of a Popular Road Race Using Distributed Acoustic Sensing 

Jorge Canudo, Diego Gella, Pascual Sevillano, and Javier Preciado-Garbayo

Distributed Acoustic Sensing (DAS) has emerged as a powerful tool for monitoring human-induced seismic signals in urban environments, enabling dense, meter-scale observations of dynamic sources. Building on previous studies demonstrating the capability of DAS to image large public events, such as parades and other mass-participation activities, we present a novel experiment in which two different DAS technologies (ΦOTDR and Chirped-Pulse ΦOTDR) were simultaneously deployed to record a popular pedestrian road race held in the surroundings of the University of Zaragoza (Spain).

The experiment took advantage of an already deployed optical-fiber installation with a total effective length of approximately 2 km. The fiber layout captured three distinct geometrical configurations with respect to the race course: (1) a straight section coincident with the runners’ trajectory over the last 300 m of the first kilometer (outbound leg), (2) the same straight section during the return at kilometer 4 (inbound leg), and (3) a perpendicular crossing of the fiber with the race course at the finish line. This geometry provides a unique opportunity to analyze runner-induced ground vibrations under varying crowd densities, running speeds, and fiber–source orientations.

Waterfall representations of the strain-rate data reveal clear, coherent signatures associated with individual runners and runner groups in both DAS systems. Along the straight section, the outbound leg exhibits a compact, high-amplitude wavefield characterized by closely spaced, overlapping runner traces, consistent with the tightly packed peloton early in the race. In contrast, the inbound leg shows a markedly more dispersed pattern, reflecting the progressive spreading of participants according to performance and fatigue. These differences are consistently observed in both phase-based and chirped-pulse DAS data, although with distinct signal-to-noise characteristics across different frequency bands.

At the finish line, where the fiber crosses the race course perpendicularly, the DAS records provide exceptional temporal resolution of runner arrivals. The first five finishers are individually and unambiguously identified, with isolated signatures that can be robustly matched to official arrival times. This demonstrates the potential of DAS not only for bulk crowd characterization but also for resolving individual human-induced seismic sources in real-world conditions.

Our results highlight the complementarity of DAS technologies for urban seismology applications. The experiment underscores the sensitivity of DAS to subtle variations in crowd dynamics and source geometry and illustrates its potential for non-intrusive monitoring of mass-participation events, pedestrian flows, and urban activity. These observations contribute to the growing field of anthropogenic seismology and reinforce the role of optical fiber sensing as a scalable tool for high-resolution monitoring of human activity in cities.

How to cite: Canudo, J., Gella, D., Sevillano, P., and Preciado-Garbayo, J.: Urban Seismology of a Popular Road Race Using Distributed Acoustic Sensing, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-10581, https://doi.org/10.5194/egusphere-egu26-10581, 2026.

EGU26-10676 | Orals | SM3.4

Storm Amy observations with fibre-optic DAS data at the Svelvik CO₂ Field Lab, Norway: Implications for Monitoring and Networks  

Claudia Pavez Orrego, Marcin Duda, Dias Urozayev, Bastien Dupuy, and Nicolas Barbosa

Distributed Acoustic Sensing (DAS) has become a powerful technique for high-resolution, continuous monitoring of near- and subsurface earth phenomena, with increasing applications in geohazards, seismology, and industrial settings such as CO₂ storage monitoring. However, the sensitivity of DAS measurements to atmospheric forcing, particularly during extreme weather events, remains poorly understood. In this study, we investigate the response of a permanent, 1.2 km long straight fibre-optic array installed at the Svelvik CO₂ Field Laboratory (Norway) to intense wind conditions associated with Storm Amy, which hit Norway from 3 to 6 October 2025.

As part of efforts to understand passive methods for monitoring CO₂ migration in the subsurface, an Alcatel Submarine Networks (ASN) DAS system continuously recorded strain-rate data along a buried fibre that includes both near-surface sections and borehole down- and up-going segments reaching depths of approximately 100 m. The near-surface sections were installed inside protective pipes and were therefore not directly coupled to the surrounding ground. To characterise wind-induced seismic signatures, we analyse downsampled recordings using band-limited root-mean-square (RMS) amplitudes and spectral methods across three frequency ranges (0.1–1 Hz, 1–3 Hz, and 3–10 Hz), with time averages over 1-hour intervals. Time–frequency characteristics are examined using group-averaged spectrograms, and a Spectral Energy Index (SEI) is derived by integrating power spectral density within each frequency band. These seismic metrics are compared with nearby meteorological observations, including mean wind speed, maximum mean wind speed, and maximum wind gusts.
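
As an illustration of these metrics, the sketch below computes band-limited RMS amplitudes and a PSD-integrated SEI per hour for one channel; the sampling rate, Welch segment length, and synthetic trace are assumptions, not the study's actual parameters.

```python
import numpy as np
from scipy.signal import butter, sosfilt, welch

fs = 50.0                                    # assumed downsampled rate (Hz)
bands = [(0.1, 1.0), (1.0, 3.0), (3.0, 10.0)]
trace = np.random.randn(int(2 * 3600 * fs))  # stand-in for one DAS channel

def band_rms(x, fs, fmin, fmax):
    """RMS amplitude of the trace band-passed to [fmin, fmax] Hz."""
    sos = butter(4, [fmin, fmax], btype="bandpass", fs=fs, output="sos")
    return np.sqrt(np.mean(sosfilt(sos, x) ** 2))

def spectral_energy_index(x, fs, fmin, fmax):
    """Integrate the power spectral density over one frequency band."""
    f, pxx = welch(x, fs=fs, nperseg=int(60 * fs))
    m = (f >= fmin) & (f <= fmax)
    return np.trapz(pxx[m], f[m])

# 1 h averaging intervals: one (RMS, SEI) pair per band and hour.
n_hour = int(3600 * fs)
hourly = [[(band_rms(h, fs, lo, hi), spectral_energy_index(h, fs, lo, hi))
           for lo, hi in bands]
          for h in np.split(trace, len(trace) // n_hour)]
```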

The results reveal a pronounced increase in DAS energy coincident with the maximum wind gusts of Storm Amy, with the strongest responses observed at frequencies below 3 Hz. Correlation and lag analyses show that seismic energy variations are closely associated with periods of enhanced wind activity, particularly wind gusts, indicating a strong coupling between transient atmospheric forcing and ground vibrations. Importantly, the response differs significantly between surface and depth segments of the fibre. Surface-installed channels exhibit broadband amplitude increases correlated with direct wind–ground interaction, while depth channels display coherent low-frequency spectral patterns, suggesting excitation by wind-generated surface waves or distant secondary sources (e.g., waves from the neighbouring fjord) rather than direct aerodynamic loading.

These findings demonstrate that DAS arrays deployed at wells (abandoned or active) are sensitive to extreme meteorological forcing, which can imprint distinct and depth-dependent seismic signatures. Quantifying and distinguishing wind-induced signals is therefore critical for the robust interpretation of DAS data in long-term passive monitoring applications, particularly when subtle subsurface signals related to CO₂ injection, migration, or leakage must be detected in the presence of strong environmental noise. At the same time, this sensitivity highlights an additional benefit of such fibre-optic installations: DAS infrastructure deployed in future abandoned oil and gas wells, reused for CO₂ capture and storage, can also provide valuable information for national seismic and environmental monitoring networks, extending its utility beyond site-specific applications.

How to cite: Pavez Orrego, C., Duda, M., Urozayev, D., Dupuy, B., and Barbosa, N.: Storm Amy observations with fibre-optic DAS data at the Svelvik CO₂ Field Lab, Norway: Implications for Monitoring and Networks , EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-10676, https://doi.org/10.5194/egusphere-egu26-10676, 2026.

EGU26-10839 | ECS | Posters on site | SM3.4

Fibre sensing at regional scales with telecom cables: the IMAGFib project 

Nicolas Luca Celli, Chris Bean, Adonis Bogris, Georgios Aias Karydis, Eoin Kenny, Rosa Vergara, Örn Jónsson, and Marco Ruffini

Fibre sensing technology can provide seismic data at a variety of scales, but the difficulty in accessing long telecom fibres, together with the novelty of the instruments, their range limitations, and their massive data output, currently constrains most applications to fibres shorter than 100 km.

In this study, we showcase the first results from the new project IMAGFib (multiscale seismic IMAGing with optical FIBre telecom cables), acquiring on-/offshore fibre sensing data on commercial telecom fibres in the North Atlantic Ocean, Irish Sea, and across Ireland. The project combines Distributed Strain Sensing (DSS, also known as DAS) over >400 km of fibre with 10 m spatial sampling and a new, distributed Microwave Frequency Fiber Interferometer (MFFI) capable of sensing over 1700 km of submarine cables connecting Ireland to Iceland, albeit with a coarser 50-100 km spatial sampling. We use the acquired data to assess the performance of fibre sensing as a regional-to-continental-scale seismic and ocean monitoring tool, and as a future imaging tool, with a focus on low frequencies (<1 Hz).

By forging research collaborations with multiple telecom operators, we are able to perform DSS on multiple cable sections across the region, aiming to cover a continuous linear profile from Wales to the North Atlantic through different experiments (to be completed in early 2026), part of which is performed on live, traffic-carrying telecom fibres. Our DSS results show that, despite lower signal-to-noise ratios compared to nearby seismic stations, DSS on noisy telecom fibres can successfully record most Mw>6 teleseismic events worldwide, as well as microseisms originating in the North Atlantic and/or Irish Sea, on all sections of the cable.

In order to extend fibre sensing far into the North Atlantic Ocean, we present the newly developed MFFI sensor, which uses optical interferometry in conjunction with high-loss loopbacks at line amplifiers, turning each section of the cable between amplifiers (50-100 km) into an independent strain sensor. For our experiment on the Ireland-Iceland cable, this yields 17 traces along the fibre. Ongoing recording in late 2025-early 2026 allows us to evaluate its capability to sense seismic signals, marine storms, currents, and possibly ocean-bottom temperature variations across seasons.

With a strong focus on long-range and low-frequency sensing and on integration with live telecom infrastructure, IMAGFib is centred on establishing fibre sensing as a global geo-sensing tool. Our successful DSS results on live telecom fibres, together with the development of MFFI technology from affordable off-the-shelf components, represent a key step in broadening trusted research that utilises existing, commercial telecom cables.

How to cite: Celli, N. L., Bean, C., Bogris, A., Karydis, G. A., Kenny, E., Vergara, R., Jónsson, Ö., and Ruffini, M.: Fibre sensing at regional scales with telecom cables: the IMAGFib project, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-10839, https://doi.org/10.5194/egusphere-egu26-10839, 2026.

EGU26-11265 | ECS | Posters on site | SM3.4

SmartScape: Distributed Strain Sensing on Dublin City Telecom Fibre to Monitor Urban and Subsurface Dynamics for Smart City Applications 

Bruna Chagas de Melo, Christopher J. Bean, and Colm Browning

Rapid urban growth in Dublin is placing increasing pressure on transport systems, construction activity, and environmental management, creating a clear need for high-resolution observations of how the city operates at both surface and subsurface levels. This study presents the initial stage of a new project that explores the feasibility of using existing optical telecommunication infrastructure as a large-scale urban sensing platform through Distributed Strain Sensing (DSS). DSS converts optical fibres into dense seismic arrays by measuring strain-rate perturbations caused by ground vibrations, offering a cost-efficient approach to city-scale monitoring. This can have a potentially transformative impact on smart and sustainable city management, offering new data insights into urban dynamics while leveraging existing city-owned fibre infrastructure.

We report on a first pilot deployment on a dark ~80 km fibre ring crossing the city centre, residential neighbourhoods, surface tram lines, and an underground tunnel. A FEBUS-A1 interrogator was installed at a data centre in Dublin’s north side and operated for 23 days. Several acquisition configurations were tested, with the most stable setup recording ~60 km of fibre at 500 Hz sampling and 20 m gauge length for a continuous 10-day period. Remote access enabled iterative optimisation of acquisition parameters during the experiment.

The analysis presented here is preliminary and focuses on assessing data quality, signal content, and key technical limitations. Initial observations indicate that the DSS array captures clear signatures of moving vehicles with different velocities, rail-related activity, and teleseismic signals, including the 10 October M7.4 Mindanao (Philippines) event. Signal quality progressively degrades beyond ~30 km from the interrogator, where noise becomes dominant, highlighting challenges associated with attenuation, coupling, and urban noise in long fibre links.

Ongoing work focuses on developing denoising and source-identification strategies, including cross-correlation approaches and unsupervised machine learning, alongside accurate georeferencing of fibre channels onto detailed urban maps. These analyses will be integrated with independent datasets such as traffic records from Dublin City Council and existing environmental acoustic noise maps. Rather than delivering operational products, this study is intended to establish a robust baseline on data quality, signal content, and interpretability, defining what information can realistically be extracted from urban DSS deployments in Dublin at this early stage.

How to cite: Chagas de Melo, B., J. Bean, C., and Browning, C.: SmartScape: Distributed Strain Sensing on Dublin City Telecom Fibre to Monitor Urban and Subsurface Dynamics for Smart City Applications, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-11265, https://doi.org/10.5194/egusphere-egu26-11265, 2026.

EGU26-11391 | Posters on site | SM3.4

Integrating Distributed Acoustic Sensing and borehole seismometer data for seismic velocity measurements and negative magnitude event location: a case study from the TABOO Near Fault Observatory (Northern Apennines, Italy) 

Nicola Piana Agostinetti, Federica Riva, Irene Molinari, Simone Salimbeni, Alberto Villa, Marta Arcangeli, Giulio Poggiali, Raffaello Pegna, Gilberto Saccorotti, Gaetano Festa, and Lauro Chiaraluce

Distributed Acoustic Sensing (DAS) technology makes use of fiber optic cables to sense vibrations at the Earth's surface at unprecedented spatial resolution, less than one meter over distances of kilometres. DAS data have been used for monitoring both the solid Earth (earthquakes, dyke intrusions, and more) and the environment (landslides, snow avalanches, groundwater). Despite its wide application and numerous successful case studies, DAS technology presents two significant limitations: a lower S/N ratio with respect to standard seismometers and a strong "directivity effect" (vibrations must propagate in the axial direction of the fiber optic cable).

In this study, we illustrate how the integration of DAS and borehole seismometer data can be used to improve earthquake location and obtain novel information on the seismic velocity of the buried rock mass. We analyse DAS data recorded along a 1 km fiber optic cable deployed in a full 3D geometry. The fiber optic cables have been installed in the framework of a very dense surface and borehole seismic array pertaining to the Alto Tiberina Near Fault Observatory (TABOO-NFO). The cable geometry covers two horizontal planes, offset from one another at different altitudes, and a vertical borehole reaching 130 m depth. The infrastructure has been installed across (from the hanging wall to the footwall) the Gubbio fault, a secondary fault segment antithetic to the main Alto Tiberina master fault, which bounds at depth a normal fault system in the Alto Tiberina fault system (Northern Apennines, Italy). The center of the cable array coincides with a shallow borehole (130 m deep) instrumented with two short-period seismometers, one at the surface and one at the bottom.

The integration of the data from the seismometers and those recorded along such a 3D geometry allows for better recognition and location of very small seismic events occurring on the fault, which go largely undetected by the local (dense) seismic network. Moreover, data from small events (M > 1) can be used to estimate the P- and S-wave seismic velocities of the geological formations traversed by the borehole (namely, the Maiolica and Marne a Fucoidi formations), providing precise measurements of these velocities at larger scale lengths (tens of meters) than measurements obtained on the same rock in the laboratory.

How to cite: Piana Agostinetti, N., Riva, F., Molinari, I., Salimbeni, S., Villa, A., Arcangeli, M., Poggiali, G., Pegna, R., Saccorotti, G., Festa, G., and Chiaraluce, L.: Integrating Distributed Acoustic Sensing and borehole seismometer data for seismic velocity measurements and negative magnitude event location: a case study from the TABOO Near Fault Observatory (Northern Apennines, Italy), EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-11391, https://doi.org/10.5194/egusphere-egu26-11391, 2026.

EGU26-11798 | ECS | Posters on site | SM3.4

Distributed Acoustic Sensing of debris-flow activity in the Öschibach torrent (Swiss Alps) 

Juan Sebastian Osorno Bolivar, Malgorzata Chmiel, Fabian Walter, Felix Blumenschein, and Kevin Friedli

The slope instability of Spitze Stei supplies large sediment volumes that accumulate at the slope toe and are subsequently remobilized as debris flows and debris floods in the adjacent Öschibach torrent, threatening the nearby village of Kandersteg, Switzerland. Since early 2020, continuous monitoring and preventive measures have been implemented in the area. While long-term monitoring has documented frequent torrential activity, the dynamic linkage between sediment supply from the rock slope and debris-flow activity in the torrent remains poorly constrained due to the spatial limitations of point sensors.

In summer 2025, we deployed a dense seismic array on the rock slope and interrogated an existing dark optical fiber running along the ~4 km-long Öschibach torrent using Distributed Acoustic Sensing (DAS). The DAS setup enabled spatially continuous strain-rate measurements at meter-scale resolution with a sampling frequency of ~600 Hz. For the three-month acquisition period, our aim is to detect and characterize debris-flow and debris-flood activity using DAS methods, supported by relative water-level time series and data from nearby seismic stations.

A catalog of possible debris flows and debris floods is generated by leveraging an established pre-warning water-level increase threshold (set at 0.6 m), using moving-average windowing and duration filtering. This discharge inventory was characterized using the DAS array, whose ~850 channels were geolocated using tap tests, based on strain-rate amplitudes visualized in logarithmic waterfall plots. Analysis of the Power Spectral Density (PSD) of the corresponding DAS recordings reveals an increase in seismic energy at high frequencies (~20-40 Hz), concentrated on the channels closest to the stream. Vertically offset waveform comparison plots demonstrate high coherence between DAS channels and wavefields recorded at the seismic stations, from which the apparent speed of seismic sources can be estimated. We also observe other coherent signals along the fiber, including mass movements from the Spitze Stei rock slope (e.g., rockfalls and granular flows), as well as local and teleseismic earthquakes.
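
A minimal sketch of such threshold-based cataloguing is given below; the 0.6 m pre-warning threshold is taken from the text, while the smoothing window, baseline definition, and minimum duration are illustrative assumptions.

```python
import numpy as np

def detect_events(level, dt, threshold=0.6, win_s=300.0, min_dur_s=120.0):
    """Return (start, end) sample indices of candidate torrential events.

    level: relative water-level series (m); dt: sample spacing (s).
    """
    win = max(1, int(win_s / dt))
    smooth = np.convolve(level, np.ones(win) / win, mode="same")  # moving avg
    above = (smooth - np.median(smooth)) > threshold  # rise above baseline
    edges = np.diff(above.astype(int))
    starts = np.where(edges == 1)[0] + 1
    ends = np.where(edges == -1)[0] + 1
    if above[0]:
        starts = np.r_[0, starts]
    if above[-1]:
        ends = np.r_[ends, len(above)]
    keep = (ends - starts) * dt >= min_dur_s          # duration filtering
    return list(zip(starts[keep], ends[keep]))
```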

Our assessment of signal quality and coherence provides a basis for subsequent event detection, source location, and characterization using array-based methods, particularly during the event initiation phase. Our multisensor approach highlights the potential of DAS to provide spatially dense observations of torrential processes in steep Alpine catchments.

How to cite: Osorno Bolivar, J. S., Chmiel, M., Walter, F., Blumenschein, F., and Friedli, K.: Distributed Acoustic Sensing of debris-flow activity in the Öschibach torrent (Swiss Alps), EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-11798, https://doi.org/10.5194/egusphere-egu26-11798, 2026.

EGU26-12160 | ECS | Orals | SM3.4

Best Practices for Machine Learning based Icequake Picking with Distributed Acoustic Sensing 

Johanna Zitt, Marius Isken, Jannes Münchmeyer, Dominik Gräff, Andreas Fichtner, Fabian Walter, and Josefine Umlauft

Over the past years, a wide range of machine learning–based phase picking methods have been developed, primarily targeting three-component seismometer data from tectonic earthquakes. With the rapid growth of distributed acoustic sensing (DAS) applications, the diversification of use cases, and the availability of increasingly large DAS datasets, these methods are now being applied to single-component DAS recordings. However, their optimal use for DAS data, and for alternative signal types such as cryoseismological events, remains largely unexplored.
In this study, we present a systematic analysis of the performance of machine learning–based phase picking methods, pretrained on tectonic earthquakes, on single-component cryoseismological DAS data obtained on the Rhône Glacier in the Swiss Alps in July 2020. We evaluate multiple strategies for generating pseudo-three-component data from the intrinsically single-component DAS strain-rate data, including zero-padding of missing components, duplication of the single component, and the use of consecutive DAS channels as surrogate components. In addition, we assess the phase-picking performance across different preprocessing schemes, comparing conservatively band-pass filtered data with denoised data obtained using a J-invariant autoencoder specifically trained on cryoseismological DAS data. Finally, we analyze the spatial and temporal distribution of located events over the full observation period and across the entire glacier. Event clusters are correlated with weather conditions, daily cycles, and the geometry of the glacier bed to explore potential patterns in cryoseismic activity.
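
The three pseudo-three-component strategies can be illustrated as follows, for a strain-rate array `das` of shape (n_channels, n_samples); the function names are ours, and a picker would receive one (3, n_samples) stream per virtual station.

```python
import numpy as np

def zero_pad(das, ch):
    """One DAS channel plus two zeroed components."""
    z = np.zeros_like(das[ch])
    return np.stack([das[ch], z, z])

def duplicate(das, ch):
    """The same channel copied onto all three components."""
    return np.stack([das[ch]] * 3)

def surrogate_neighbours(das, ch):
    """Three consecutive channels treated as surrogate components."""
    return das[ch - 1 : ch + 2]
```
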
Our results indicate that treating consecutive DAS channels as surrogate components yields the most reliable phase-picking performance, whereas extensive denoising can degrade picking accuracy. We further discuss spatial clusters of event locations and their correlations with glacier topography and meteorological conditions.

How to cite: Zitt, J., Isken, M., Münchmeyer, J., Gräff, D., Fichtner, A., Walter, F., and Umlauft, J.: Best Practices for Machine Learning based Icequake Picking with Distributed Acoustic Sensing, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-12160, https://doi.org/10.5194/egusphere-egu26-12160, 2026.

EGU26-12365 | ECS | Posters on site | SM3.4

Distributed Acoustic Sensing (DAS) for Geothermal Applications: a Case Study Across Dublin City 

Eoghan Totten, Jean Baptiste Tary, and Bruna Chagas de Melo

Seismic monitoring plays an integral role in geothermal renewable energy projects for imaging, site-specific noise characterisation, and hazard risk assessment. The number of European geothermal energy projects is expected to rise over the next decade as efforts to reduce reliance on fossil fuel-derived energy sources continue. Related to this is the pressing need to prospect for and expand the use of geothermal energy in urban settings.

Distributed Acoustic Sensing (DAS) is increasingly applied in lieu of geophone-based deployments. Instead of measuring seismic waves at a limited number of discrete points, DAS transforms fibre-optic cables into large and dense arrays of virtual sensors by measuring small changes in strain rate, with gauge lengths as small as 1-20 metres. DAS interferometry can capitalise on existing urban fibre-optic infrastructure, as well as exploit the diverse passive seismic noise sources available in towns and cities.

Here we present in-progress DAS data analysis from an approximately 70-80 km long cable crossing Dublin city (south to north), based on three weeks of cumulative recording between September and October 2025. This cable tracks a large portion of the M50 ring road, the main arterial traffic route between north and south Dublin. We identify and characterise the main noise sources as a function of space and time, comparing DAS signals with temporally overlapping broadband seismometer data. We discuss possible approaches to suppress incoherent noise along the cable for future shallow and deep geothermal monitoring, as well as imaging applications using coherent noise.
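
As one example of how coherent noise can serve imaging, the sketch below stacks cross-correlations of two DAS channels over noise windows to approximate their inter-channel response; the window length and normalisation are assumptions, not the project's workflow.

```python
import numpy as np

def noise_crosscorrelation(ch_a, ch_b, fs, win_s=60.0):
    """Stacked cross-correlation of two DAS channels over noise windows."""
    n = int(win_s * fs)
    n_win = min(len(ch_a), len(ch_b)) // n
    acc = np.zeros(2 * n - 1)
    for k in range(n_win):
        a = ch_a[k * n:(k + 1) * n]
        b = ch_b[k * n:(k + 1) * n]
        a = (a - a.mean()) / (a.std() + 1e-12)   # per-window standardisation
        b = (b - b.mean()) / (b.std() + 1e-12)
        acc += np.correlate(a, b, mode="full")
    return acc / max(n_win, 1)                   # stacked correlation function
```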

This research feeds into the European Union-funded Clean Energy Transition Partnership project GEOTWINS, which seeks to extend the state of the art in modular geothermal digital twins for improved deep geothermal imaging methodologies, drilling risk mitigation, and greater societal acceptance.

How to cite: Totten, E., Tary, J. B., and Chagas de Melo, B.: Distributed Acoustic Sensing (DAS) for Geothermal Applications: a Case Study Across Dublin City, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-12365, https://doi.org/10.5194/egusphere-egu26-12365, 2026.

EGU26-12403 | Posters on site | SM3.4

Railway Distributed Acoustic Sensing data as an aid to earthquake monitoring in northernmost Sweden 

Björn Lund, Matti Rantatalo, Myrto Papadopoulou, Michael Roth, and Gunnar Eggertsson

The Swedish Transport Administration (STA) currently monitors the railway between Kiruna and the Swedish-Norwegian border, a distance of approximately 130 km, with Distributed Acoustic Sensing (DAS). In collaboration with the STA and Luleå University of Technology, the Swedish National Seismic Network (SNSN) has established data transmission from the interrogator on a request basis. As the railway crosses the Pärvie fault, the largest known, and still very active, glacially triggered fault, we hope to significantly improve the detection and analysis of small earthquakes on that section of the fault. In this presentation we show how we define low-noise sections of the cable, using local and teleseismic events, and then use these as individual seismic stations. Over the 130 km, as the railway winds its way across the mountains, the cable runs in directions ranging from N-S via NW-SE to W-E, providing many possible incidence directions. We discuss the technicalities of data sharing, existing metadata problems, and how the DAS data are analyzed and incorporated into the routine processing at the SNSN.
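
One simple way to define low-noise sections and stack them into virtual stations is sketched below; the section length and selection quantile are illustrative assumptions rather than the SNSN's actual criteria.

```python
import numpy as np

def virtual_stations(das, section_len=50, keep_quantile=0.2):
    """Stack the quietest contiguous cable sections into single traces.

    das: (n_channels, n_samples) strain-rate array.
    """
    rms = np.sqrt(np.mean(das ** 2, axis=1))         # per-channel noise level
    n_sec = das.shape[0] // section_len
    sec_noise = rms[:n_sec * section_len].reshape(n_sec, -1).mean(axis=1)
    quiet = np.where(sec_noise <= np.quantile(sec_noise, keep_quantile))[0]
    # Stacking N coherently recording channels improves SNR by ~sqrt(N).
    return {int(s): das[s * section_len:(s + 1) * section_len].mean(axis=0)
            for s in quiet}
```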

How to cite: Lund, B., Rantatalo, M., Papadopoulou, M., Roth, M., and Eggertsson, G.: Railway Distributed Acoustic Sensing data as an aid to earthquake monitoring in northernmost Sweden, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-12403, https://doi.org/10.5194/egusphere-egu26-12403, 2026.

EGU26-12609 | ECS | Orals | SM3.4

Understanding fiber optic sensitivity to a wavefield: A framework to separate site amplification from orientation effects 

Olivier Fontaine, Andreas Fichtner, Thomas Hudson, Thomas Lecocq, and Corentin Caudron

Interpreting amplitudes in Distributed Acoustic Sensing (DAS) data is challenging because the recorded signal is influenced by multiple factors.

To differentiate the impact of fiber orientation from site effects, we develop expressions for the axial strain induced by different body-wave polarizations. These expressions consider a linear fiber segment with any orientation in space. From these, we explore array geometry properties and the potential of the DAS transfer function as a polarization filter. This last property arises from the polarity-inversion characteristic of shear waves and the averaging nature of the gauge length. If the gauge length is set to be a loop instead of a linear segment, then DAS averages over all azimuths for a horizontal loop, canceling SH waves. For a vertical loop, all dips are averaged, canceling SV waves traveling within the loop plane. These results could reflect a link between DAS and rotational seismology.
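
The loop-averaging argument can be checked numerically: for a plane wave with propagation direction p and polarisation d, the axial strain on a fibre element with tangent t scales as (t·p)(t·d). The following sketch (our illustration, not the authors' code) shows that averaging over a horizontal loop cancels the SH response while the P response survives.

```python
import numpy as np

# Tangent vectors of a horizontal circular loop, sampled over all azimuths.
theta = np.linspace(0, 2 * np.pi, 1000, endpoint=False)
t = np.stack([np.cos(theta), np.sin(theta), np.zeros_like(theta)], axis=1)

def loop_average(p, d):
    """Mean axial strain response (t.p)(t.d) over the loop."""
    return np.mean((t @ p) * (t @ d))

p = np.array([1.0, 0.0, 0.0])      # horizontal propagation along x
d_P = p                            # P wave: polarised along propagation
d_SH = np.array([0.0, 1.0, 0.0])   # SH wave: transverse, horizontal

print(loop_average(p, d_P))   # -> 0.5: P response survives the averaging
print(loop_average(p, d_SH))  # -> ~0:  SH cancels over a full loop
```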

From these transfer functions, we develop a low-cost forward model based on ray theory that predicts the amplitudes recorded in a DAS array. Differences in amplitude between the modeled and observed wavefields relate to local site amplification, from which we create an amplitude correction factor. We evaluated this method using active seismic experiments from the PoroTomo dataset, successfully identifying regions with anomalously high amplitude responses, consistent with the recordings following a magnitude 4.3 earthquake.

The results, together with the main elements of our approach, are transferable to many new sensing strategies, including the optimization of fiber deployment geometry, the generation of synthetic data, and the acceleration and improvement of existing location methods through DAS-specific amplitude and phase corrections.
In summary, by exploiting the known directional sensitivity of DAS, we draw new insights from amplitude variations along the fiber array, treating energy loss as equally informative as energy gain in interpreting the wavefield. 

How to cite: Fontaine, O., Fichtner, A., Hudson, T., Lecocq, T., and Caudron, C.: Understanding fiber optic sensitivity to a wavefield: A framework to separate site amplification from orientation effects, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-12609, https://doi.org/10.5194/egusphere-egu26-12609, 2026.

EGU26-12675 | ECS | Orals | SM3.4

Strategies and Challenges in Applications of DAS-based Earthquake Early Warning Systems 

Claudio Strumia, Gaetano Festa, Alister Trabattoni, Diane Rivet, Luca Elia, Francesco Carotenuto, Simona Colombelli, Antonio Scala, Francesco Scotto di Uccio, and Anjali Suresh

Distributed Acoustic Sensing (DAS) transforms fiber-optic cables into ultra-dense strainmeter arrays, providing spatially and temporally continuous earthquake recordings. While its potential for offline seismic characterization is increasingly recognized, a key application of this sensing paradigm is real-time monitoring for Earthquake Early Warning (EEW). The use of existing fiber-optic infrastructure allows sensing with cables located close to seismogenic sources, such as offshore subduction zones, potentially extending the lead time of issued alerts. DAS deployments within Near Fault Observatories further provide dense spatial coverage of epicentral areas, favouring the rapid extraction of robust source information.

The application of DAS to EEW, alone or as a complement to standard accelerometers, has recently been explored, specifically focusing on estimating earthquake magnitude from the first seconds of recorded data. Existing approaches rely either on conversion strategies to ground-motion proxies or on direct analysis in the strain-rate domain. However, neither the robustness of different conversion strategies nor the selection of the most informative physical quantity for early magnitude estimation is yet consolidated. In offshore environments, additional complexity arises from fiber-optic cables deployed on sediments, where strong converted phases often dominate early waveforms and hinder the direct P-wave signal traditionally used for EEW.

In this work, we analyse earthquakes recorded by the ABYSS network, supported by the ERC Starting Grant programme, consisting of 450 km of offshore telecommunication cables deployed along the Chilean subduction trench and interrogated by three DAS units. At this high-seismicity testbed, we develop a strategy for fast magnitude estimation with DAS. We show that converted Ps phases preceding S-wave arrivals carry significant information on earthquake magnitude. Furthermore, we investigate whether the use of time- and space-integrated observables on DAS recordings can enhance the predictive power of amplitudes from the first seconds of seismic signals.
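
Generic examples of such time- and space-integrated observables are sketched below for a strain-rate window `das` of shape (n_channels, n_samples) starting at the first arrival; the observables actually evaluated in the study are not reproduced here.

```python
import numpy as np

def early_observables(das, fs, window_s=3.0):
    """Peak, time-integrated, and space-averaged amplitudes of early data."""
    w = das[:, :int(window_s * fs)]
    peak = np.max(np.abs(w), axis=1)                     # per-channel peak
    time_int = np.trapz(np.abs(w), dx=1.0 / fs, axis=1)  # time integration
    space_avg = time_int.mean()                          # average along cable
    return peak, time_int, space_avg
```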

Finally, we assess the performance of a DAS-based EEW system built on the PRESTo software (Satriano et al., 2011). Using moderate-to-large offshore Chilean earthquakes, we highlight the potential and limitations of DAS in regions with sparse conventional instrumentation. Complementary analyses using data from the Irpinia Near Fault Observatory demonstrate the benefits of jointly exploiting DAS and traditional seismic stations within dense monitoring networks, confirming the applicability of DAS-based EEW systems across different tectonic settings.

How to cite: Strumia, C., Festa, G., Trabattoni, A., Rivet, D., Elia, L., Carotenuto, F., Colombelli, S., Scala, A., Scotto di Uccio, F., and Suresh, A.: Strategies and Challenges in Applications of DAS-based Earthquake Early Warning Systems, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-12675, https://doi.org/10.5194/egusphere-egu26-12675, 2026.

EGU26-13083 | ECS | Orals | SM3.4

Long range Coherent-Optical Frequency Domain Reflectometry for large scale distributed sensing 

Debanjan Show, Biplab Dutta, Maël Abdelhak, Olivier Lopez, Adèle Hilico, Anne Amy-Klein, Christian Chardonnet, Paul-Eric Pottie, and Etienne Cantin

Fig. 1: Map of the REFIMEVE network (green links) and its connection to European links.

In recent years, significant technological progress has demonstrated the feasibility of using long-distance fiber optic links as large-scale distributed networks for environmental sensing [1]. Optical fibers are inherently sensitive to external perturbations: their mechanical structure responds to strain, while the light propagating within them undergoes measurable intensity and phase variations when subjected to vibration or seismic waves. A notable example is the French national research infrastructure REFIMEVE [2], which distributes ultrastable time and frequency references across more than 9000 km of fiber links connecting laboratories throughout France and Europe (see Fig. 1). The infrastructure has demonstrated strong potential for geophysical studies [3]. Applications such as earthquake detection, volcano monitoring, and environmental hazard surveillance are attracting increasing interest worldwide, particularly because they can leverage already existing fiber networks. In this context, the European project SENSEI (Smart European Networks for Sensing the Environment and Internet Quality) [4] aims to harness this potential by developing next-generation photonic technologies for detecting both natural phenomena, such as earthquakes and volcanic activity, and anthropogenic events, including construction activity and vehicular traffic.

Within this framework, one of our objectives is to develop a coherent optical frequency domain reflectometry (C-OFDR) system [5]. Current systems are limited to approximately 100 km by the coherence length of the laser source. Here, we benefit from the low-frequency-noise laser source provided by the REFIMEVE frequency reference in order to extend the sensing range. In our setup, the output of a low-noise laser is frequency modulated and a fiber under test is studied in a Michelson interferometer configuration. By analyzing the Rayleigh backscattered signal along the fiber, the system enables detailed diagnostics of the fiber under test, including the detection of localized fiber deformations, faulty connectors, attenuation variations, and disturbances induced by environmental vibrations. As a first demonstration, we tested a prototype over a long-range fiber link made of laboratory spools extending up to 335 km. The system successfully identified the position of the optical amplifier and a PC connector placed at the end of the fiber with km-scale spatial resolution. In addition, vibration-induced perturbations were observed and are under study, highlighting the potential of this technique for seismic applications. In future work, we plan to deploy the C-OFDR system on the operational REFIMEVE fiber network to evaluate its performance under real field conditions. This approach positions C-OFDR as a powerful tool for telecommunication infrastructure monitoring and distributed geophysical sensing.
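
The distance mapping behind C-OFDR follows two standard relations (a back-of-the-envelope sketch; the numbers below are assumptions, not the prototype's parameters): a sweep of total span dF resolves dz = c/(2·n·dF), and backscatter from distance z beats at f_b = gamma·2·n·z/c for sweep rate gamma.

```python
c = 299_792_458.0   # speed of light in vacuum, m/s
n = 1.468           # approximate group index of standard single-mode fibre

def resolution(span_hz):
    """Two-point spatial resolution for a frequency sweep of span_hz."""
    return c / (2 * n * span_hz)

def beat_frequency(z_m, sweep_rate_hz_per_s):
    """Beat frequency of backscatter from distance z_m along the fibre."""
    return sweep_rate_hz_per_s * 2 * n * z_m / c

print(resolution(100e3))           # ~1.0e3 m: a 100 kHz span -> km resolution
print(beat_frequency(335e3, 1e9))  # ~3.3 MHz beat for a 335 km fibre end
                                   # at an assumed 1 GHz/s sweep rate
```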

References:

[1] G. Marra et al., Science 361 (2018), https://doi.org/10.1126/science.aat4458

[2] REFIMEVE, https://www.refimeve.fr/en/homepage/

[3] M. B. K. Tønnes, PhD Thesis (2022), https://hal.science/tel-03984045v1

[4] SENSEI, https://senseiproject.eu/

[5] C. Liang et al., IEEE Access 9 (2021), https://doi.org/10.1109/ACCESS.2021.3061250

How to cite: Show, D., Dutta, B., Abdelhak, M., Lopez, O., Hilico, A., Amy-Klein, A., Chardonnet, C., Pottie, P.-E., and Cantin, E.: Long range Coherent-Optical Frequency Domain Reflectometry for large scale distributed sensing, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-13083, https://doi.org/10.5194/egusphere-egu26-13083, 2026.

EGU26-13151 | Orals | SM3.4

Fiber optic cables (DAS) for seismic event detection – An underground case study 

Vincent Brémaud and Colin Madelaine

Distributed Acoustic Sensing (DAS), leveraging existing fiber optic infrastructure, represents a groundbreaking advancement in seismic monitoring. By converting telecommunication cables into dense arrays of virtual sensors, DAS enables continuous spatial coverage and enhanced sensitivity to seismic waves in remote or logistically constrained environments. This capability positions DAS as a complementary or alternative tool to traditional seismic networks, offering cost-effective, low-maintenance solutions for geophysical research and hazard monitoring.

This study focuses on the Premise-2 experiment, conducted at the Low-Noise Underground Laboratory (https://www.lsbb.eu/) in Rustrel, France, a site renowned for its low seismic noise. The experiment integrates active and passive seismic acquisitions, capturing both ambient noise and controlled seismic signals to assess DAS's ability to detect and characterize events. Multiple fiber optic cable types and installation methods (laid on the ground, weighted with sandbags, buried, or structurally attached) are evaluated to determine their impact on signal sensitivity, spatial resolution, and measurement robustness.

This study provides critical insights into optimal DAS deployment configurations for seismological applications, while highlighting the challenges posed by large-scale data acquisition. The research underscores the need for advanced algorithms and specific workflows to fully exploit DAS's potential. To characterize the events, we used a workflow based on automatic P and S arrival picks. We filtered these picks with an associator to retain only detections that could be linked to an event. We then tested different location algorithms to obtain a complete workflow from acquisition to event location.
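
Schematically, the workflow chains picking, association, and location as below; all three helpers are simplified stand-ins for the actual picker, associator, and location algorithms, not real library calls.

```python
import numpy as np

def automatic_pick(trace, thresh=5.0):
    """Toy STA/LTA-style picker: returns (phase, sample) pairs."""
    sta = np.convolve(np.abs(trace), np.ones(50) / 50, mode="same")
    lta = np.convolve(np.abs(trace), np.ones(500) / 500, mode="same") + 1e-12
    onsets = np.where(sta / lta > thresh)[0]
    return [("P", int(onsets[0]))] if len(onsets) else []

def associate(picks, max_spread=200):
    """Toy associator: keep picks only if their times cluster together."""
    times = sorted(t for _, _, t in picks)
    return [picks] if times and times[-1] - times[0] <= max_spread else []

def locate(event_picks):
    """Placeholder locator: here just the median pick time as 'origin'."""
    return {"origin_sample": int(np.median([t for _, _, t in event_picks]))}

def locate_events(das_channels):
    picks = [(ch, ph, t) for ch, tr in enumerate(das_channels)
             for ph, t in automatic_pick(tr)]
    return [locate(ev) for ev in associate(picks)]
```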

How to cite: Brémaud, V. and Madelaine, C.: Fiber optic cables (DAS) for seismic event detection – An underground case study, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-13151, https://doi.org/10.5194/egusphere-egu26-13151, 2026.

EGU26-13235 | ECS | Orals | SM3.4

Distributed Acoustic Sensing at the Engineering Scale: Experimental Insights from the PITOP Test Site 

Olga Nesterova, Luca Schenato, Alexis Constantinou, Thurian Le Dû, Fabio Meneghini, Andrea Travan, Cinzia Bellezza, Gwenola Michaud, Andrea Marzona, Alessandro Brovelli, Silvia Zampato, Giorgio Cassiani, Jacopo Boaga, and Ilaria Barone

The PITOP geophysical test site, operated by the Istituto Nazionale di Oceanografia e di Geofisica Sperimentale (OGS) in north-eastern Italy, provides a unique experimental environment for testing seismic acquisition technologies under realistic field conditions. Covering ~22,000 m², PITOP was established to support the development and validation of geophysical methods and instrumentation in both surface and borehole installations. Here, we evaluate PITOP’s potential for Distributed Acoustic Sensing (DAS) experiments, focusing on small-scale seismic measurements relevant to urban settings and engineering applications. 

Five boreholes with distinct purposes and instrumentation are available at the PITOP site, including a water well (PITOP1), two 400-m-deep wells associated with geosteering research (PITOP2 and PITOP3), a 150-m-deep borehole permanently equipped with optical fibre for DAS measurements (PITOP4), and a recently drilled well dedicated to geoelectrical surveys (PITOP5). The site also hosts a surface-deployed fibre-optic cable, containing both linear and helicoidal fibres, and about 20 3C seismic nodes. Finally, several seismic sources are available: a borehole sparker pulse source, suitable for crosshole VSP configurations, and two surface vibratory sources, the IVI MiniVib T-2500, which can generate sweeps in the 10–550 Hz frequency range, and the ElViS VII vibrator, designed for frequencies between 20 and 220 Hz.

We conducted three dedicated experiments: (i) cross-hole measurements with sources in PITOP3 at depths of 10, 50, 75, and 100 m, and DAS recording in PITOP4; (ii) a vertical seismic profiling (VSP) survey using the MiniVib source close to the well head, with DAS recording in PITOP4; and (iii) recordings of the seismic wavefield generated by P- and S-wave vibratory sources using surface DAS arrays in linear and helicoidal configurations, together with co-located 3C geophones for comparison.

DAS data were acquired with multiple gauge lengths and acquisition settings. The resulting datasets enable a systematic evaluation of acquisition parameter selection and highlight the processing strategies required for different DAS configurations. They provide a valuable basis for assessing optimal DAS acquisition strategies for small-scale seismic applications and for defining processing workflows adapted to diverse source and receiver geometries.

The present study is being carried out within the framework of the USES2 project, which receives funding from the EUROPEAN RESEARCH EXECUTIVE AGENCY (REA) under the Marie Skłodowska-Curie grant agreement No 101072599.

This research has been supported by the Interdepartmental Research Center for Cultural Heritage CIBA (University of Padova) with the World Class Research Infrastructure (WCRI) SYCURI—SYnergic strategies for CUltural heritage at RIsk, funded by the University of Padova.

How to cite: Nesterova, O., Schenato, L., Constantinou, A., Le Dû, T., Meneghini, F., Travan, A., Bellezza, C., Michaud, G., Marzona, A., Brovelli, A., Zampato, S., Cassiani, G., Boaga, J., and Barone, I.: Distributed Acoustic Sensing at the Engineering Scale: Experimental Insights from the PITOP Test Site, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-13235, https://doi.org/10.5194/egusphere-egu26-13235, 2026.

EGU26-13315 | ECS | Orals | SM3.4

Deep Learning-Based Earthquakes Localization at Campi Flegrei via Distributed Acoustic Sensing 

Miriana Corsaro, Léonard Seydoux, Gilda Currenti, Flavio Cannavò, Simone Palazzo, Martina Allegra, Philippe Jousset, Michele Prestifilippo, and Concetto Spampinato

The current phase of unrest of the Campi Flegrei caldera (Italy), one of the most dangerous volcanic complexes in the world, requires increasingly rapid and high-resolution seismic monitoring solutions. In this context, Distributed Acoustic Sensing (DAS) has recently emerged as a highly innovative technology, enabling existing fiber-optic cables to be repurposed into ultra-dense seismic arrays capable of sampling the seismic wavefield with unprecedented spatial resolution.

In this study, we present a new earthquake-localization method that uses automatically identified P- and S-wave arrivals in DAS data. Employing Transformer-based architectures designed to process DAS's high-dimensional strain data, our approach simultaneously estimates key source parameters, including hypocentral location, magnitude, and origin time. A comparative analysis against the official seismic catalogue reveals minimal residuals, validating the model's robustness.
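
For illustration only (the study's architecture is not reproduced here), a minimal Transformer-based regressor over DAS channels could look as follows, mapping a (batch, n_channels, n_samples) strain window to five source parameters (hypocentre coordinates, magnitude, origin-time offset).

```python
import torch
import torch.nn as nn

class DASLocator(nn.Module):
    def __init__(self, n_samples=1024, d_model=128, n_out=5):
        super().__init__()
        self.embed = nn.Linear(n_samples, d_model)  # one token per channel
        layer = nn.TransformerEncoderLayer(d_model, nhead=8, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=4)
        self.head = nn.Linear(d_model, n_out)

    def forward(self, x):                  # x: (batch, n_channels, n_samples)
        tokens = self.embed(x)             # (batch, n_channels, d_model)
        enc = self.encoder(tokens)         # self-attention across channels
        return self.head(enc.mean(dim=1))  # pool over channels -> (batch, 5)

model = DASLocator()
params = model(torch.randn(2, 200, 1024))  # two windows of 200 channels each
```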

The model therefore represents a significant advancement, as it enables reliable earthquake localization in extremely short time frames using exclusively automatically picked data, while simultaneously overcoming the computational bottlenecks typical of traditional processing workflows. As a result, this methodology establishes a new benchmark for real-time monitoring of magmatic and hydrothermal systems, substantially contributing to improved seismic hazard assessment.

How to cite: Corsaro, M., Seydoux, L., Currenti, G., Cannavò, F., Palazzo, S., Allegra, M., Jousset, P., Prestifilippo, M., and Spampinato, C.: Deep Learning-Based Earthquakes Localization at Campi Flegrei via Distributed Acoustic Sensing, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-13315, https://doi.org/10.5194/egusphere-egu26-13315, 2026.

EGU26-13382 | ECS | Posters on site | SM3.4

Towards ambient noise tomography on long telecommunication cables: using DAS for characterisation of the seismo-acoustic soundscape in the Atlantic Ocean and Irish Sea 

Rosa Vergara González, Nicolas Luca Celli, Christopher J. Bean, Marco Ruffini, and Örn Jónsson

The oceans are a noisy place, where ships, waves, storms, currents, earthquakes, and marine wildlife all leave their own seismo-acoustic signatures. Fibre sensing has the potential to let researchers utilise the thousands of sea-bottom telecommunication fibre-optic cables spread across the globe and, with them, record, characterise, and monitor these signals from up close. At present, however, sensing equipment limitations, the lack of established fibre-sensing workflows, and restricted access to cables severely limit the use of this technology at sea.

Here, we present and analyse Distributed Acoustic Sensing (DAS) data newly recorded on long telecom fibre-optic cables offshore of the east and west coasts of Ireland. The availability of these two different datasets allows us to compare different environments and physical phenomena across a large region. The eastern cable covers 118 km from Dublin, Ireland, to Holyhead, Wales, with 36 days of data recorded in spring 2025, while the western one reaches 72 km offshore from Galway, with 60 days of data recorded in autumn 2025. These datasets form part of a much larger compendium, including data from approximately 300 km of onshore fibre-optic cables between both shores. Thanks to the large cable lengths and long recording times, we observe a plethora of short-lived, high-frequency signals such as ships, anthropogenic noise, and local earthquakes, as well as long-wavelength, long-period signals such as ocean storms and microseisms, tides, and teleseismic events.

To characterise observations in these noisy environments, we compare them with nearby land seismic stations and weather records to track storm systems and wave height. We identify and separate the different seismic and acoustic sources observed, resulting in a preliminary catalogue of the dominant signal types along the cables. The results are used to highlight the differences between the two marine environments and to separate marine, seismic, and anthropogenic transient signals from ambient noise. This is key to improving our understanding of ocean processes and to building datasets suitable for deep Earth sensing through Ambient Noise Tomography. While our focus is seismic, characterising marine seismic and acoustic phenomena is important for applications well beyond this field, from telecommunication fibre cable safety to marine biology and oceanography.

How to cite: Vergara González, R., Celli, N. L., Bean, C. J., Ruffini, M., and Jónsson, Ö.: Towards ambient noise tomography on long telecommunication cables: using DAS for characterisation of the seismo-acoustic soundscape in the Atlantic Ocean and Irish Sea, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-13382, https://doi.org/10.5194/egusphere-egu26-13382, 2026.

EGU26-13416 | ECS | Posters on site | SM3.4

Temperature and strain monitoring in Reykjanes geothermal field, Iceland, using quasi-distributed fiber-optic sensing 

Julien Govoorts, Corentin Caudron, Jiaxuan Li, Haiyang Liao, Christophe Caucheteur, Yesim Çubuk-Sabuncu, Halldór Geirsson, Vala Hjörleifsdóttir, Kristín Jónsdóttir, and Loic Peiffer

Since December 2023, and after 800 years of inactivity, recurrent volcanic eruptions have been taking place at the Svartsengi volcanic system, indicating the start of a new volcanic cycle. In contrast, the Reykjanes volcanic system, located to the west of Svartsengi, has remained dormant since the 13th century. The Reykjanes geothermal area, in particular the Gunnuhver geothermal field, is located at the westernmost end of the Reykjanes Peninsula. This geothermal area is associated with the upflow of seawater-derived hydrothermal fluids and is characterized by numerous geothermal features, including steam vents and steam-heated mud pools.

Since October 2022, this geothermal field has been continuously monitored using a variety of technologies to record parameters such as soil temperature, strain, and electrical resistivity. The present study focuses primarily on data gathered since August 2024 using Fiber Bragg Grating (FBG) technology, a point fiber-optic sensing approach. This technique utilizes wavelength-division multiplexing, meaning the fiber can transmit information at distinct wavelengths. Consequently, given that each FBG possesses its own wavelength, the fiber is transformed into a cost-effective and versatile quasi-distributed sensor.
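
The sensing principle rests on the standard FBG sensitivity relation, sketched below with typical silica-fibre values at 1550 nm (generic textbook numbers, not the deployed system's calibration): the relative Bragg wavelength shift is (1 - p_e)·eps for strain plus (alpha + xi)·dT for temperature.

```python
lambda_b = 1550.0e-9   # Bragg wavelength (m), assumed telecom-band grating
p_e = 0.22             # photo-elastic coefficient of silica (typical)
alpha = 0.55e-6        # thermal expansion coefficient of silica (1/K)
xi = 6.7e-6            # thermo-optic coefficient of silica (1/K)

def bragg_shift(eps, dT):
    """Bragg wavelength shift (m) for strain eps and temperature change dT."""
    return lambda_b * ((1 - p_e) * eps + (alpha + xi) * dT)

print(bragg_shift(1e-6, 0.0))  # ~1.2e-12 m: about 1.2 pm per microstrain
print(bragg_shift(0.0, 1.0))   # ~1.1e-11 m: about 11 pm per kelvin
```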

Over the course of a year, the FBG interrogator deployed on-site has measured wavelength changes at a sampling frequency ranging from 0.4 Hz to 1 Hz. These changes were recorded from 24 temperature probes and 8 strain sensors, all buried in the ground throughout the geothermal field. Most of the temperature sensors were installed in areas of the soil with no geothermal surface manifestation. These sensors recorded temperature changes primarily driven by variations in atmospheric temperature. In contrast, the remaining sensors were located directly in altered areas or close to steam vents. These sensors exhibit clear cooling patterns due to precipitation but do not show temperature changes that can be attributed to the eruption cycle. Additionally, the FBG temperature sensors allow the identification of fiber sections that are coupled to air temperature fluctuations along a telecom fiber deployed a few hundred meters north and monitored by a Distributed Acoustic Sensing (DAS) interrogator.

In addition to the temperature probes, the strain sensors have recorded signals ranging from periodic dynamic strain changes attributed to industrial processes, to static strain changes assigned to crustal deformation. On April 1, 2025, a volcanic eruption occurred in the Svartsengi volcanic system, resulting in strain variations observed 15 kilometers away from the eruption site using FBG and low-frequency components of DAS recordings. These variations were also observed in strain measurements obtained from permanent network GNSS stations. This experiment demonstrates the capacity and reliability of the FBG technology for monitoring temperature changes and deformation signals in an active geothermal environment.

How to cite: Govoorts, J., Caudron, C., Li, J., Liao, H., Caucheteur, C., Çubuk-Sabuncu, Y., Geirsson, H., Hjörleifsdóttir, V., Jónsdóttir, K., and Peiffer, L.: Temperature and strain monitoring in Reykjanes geothermal field, Iceland, using quasi-distributed fiber-optic sensing, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-13416, https://doi.org/10.5194/egusphere-egu26-13416, 2026.

EGU26-13921 | ECS | Orals | SM3.4

Seismic Characterisation of an Arctic Glacier 

Tora Haugen Myklebust, Martin Landrø, Robin André Rørstadbotnen, and Calder Robinson

In recent years, Distributed Acoustic Sensing (DAS) has emerged as a cost-effective seismic monitoring tool for cryosphere research. Compared to conventional geophone arrays, the DAS system is compact, easy to transport, and can be rapidly deployed over large distances in glaciated environments.

Previous studies have demonstrated that DAS is a useful tool for ice-sheet imaging and for monitoring glacier dynamics, for example using borehole DAS in conjunction with surface explosives (e.g., Booth et al., 2022; Fichtner et al., 2023) or passive recordings with surface DAS (e.g., Walter et al., 2020; Gräff et al., 2025). Significant progress has also been made in applying surface DAS to active marine subsurface imaging (e.g., Pedersen et al., 2022; Raknes et al., 2025). We extend this approach to active englacial and subglacial imaging on Slakbreen, Svalbard.

During a multi-geophysical field campaign in March 2025, we acquired seismic data using surface explosives along an approximately 2 km fibre co-located with a vertical-component geophone array. We process different reflected modes (PP and PS) recorded on the fibre and benchmark the imaging results against the equivalent PP-image from the geophone array. We evaluate differences in wavefield sensitivity across the three datasets and we will present how these can be used to characterise the state of the cryosphere and deeper sedimentary successions.

Despite the relative immaturity of DAS for glacier imaging and current limitations of the processing workflow, our results clearly establish surface DAS as a viable monitoring tool for seismic imaging of the cryosphere and as a potential enabler of large-scale seismic monitoring of glaciers and the subsurface.

References:

Booth, A. D., P. Christoffersen, A. Pretorius, J. Chapman, B. Hubbard, E. C. Smith, S. de Ridder, A. Nowacki, B. P. Lipovsky, and M. Denolle, 2022, Characterising sediment thickness beneath a Greenlandic outlet glacier using distributed acoustic sensing: preliminary observations and progress towards an efficient machine learning approach: Annals of Glaciology, 63(87-89):79–82.

Fichtner, A., C. Hofstede, L. Gebraad, A. Zunino, D. Zigone, and O. Eisen, 2023, Borehole fibre-optic seismology inside the Northeast Greenland Ice Stream: Geophysical Journal International, 235(3):2430–2441.

Gräff, D., B. P. Lipovsky, A. Vieli, A. Dachauer, R. Jackson, D. Farinotti, J. Schmale, J.-P. Ampuero, E. Berg, A. Dannowski, et al., 2025, Calving-driven fjord dynamics resolved by seafloor fibre sensing: Nature, 644(8076):404–412.

Pedersen, A., H. Westerdahl, M. Thompson, C. Sagary, and J. Brenne, 2022, A North Sea case study: Does DAS have potential for permanent reservoir monitoring? In Proceedings of the 83rd EAGE Annual Conference & Exhibition, pages 1–5. European Association of Geoscientists & Engineers.

Raknes, E. B., B. Foseide, and G. Jansson, 2025, Acquisition and imaging of ocean-bottom fiber-optic distributed acoustic sensing data using a full-shot carpet from a conventional 3d survey: Geophysics, 90(5):P99–P112.

Walter, F., D. Gräff, F. Lindner, P. Paitz, M. Köpfli, M. Chmiel, and A. Fichtner, 2020, Distributed acoustic sensing of microseismic sources and wave propagation in glaciated terrain: Nature Communications, 11(1):2436.

How to cite: Myklebust, T. H., Landrø, M., Rørstadbotnen, R. A., and Robinson, C.: Seismic Characterisation of an Arctic Glacier, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-13921, https://doi.org/10.5194/egusphere-egu26-13921, 2026.

EGU26-14230 | ECS | Orals | SM3.4

Unveiling type of fiber and coupling conditions effects on geophysical DAS measurements, results from underground experiments 

Vanessa Carrillo-Barra, Diego Mercerat, Vincent Brémaud, Anthony Sladen, Olivier Sèbe, Amaury Vallage, and Jean-Paul Ampuero

Optical fiber measurements have been demonstrated to be useful in assessing geophysical near-surface parameters and in detecting seismological events in newly accessible regions (e.g. cities, ocean floor, highways) by leveraging the existing fiber-optic infrastructure. In particular, laser interferometry performed with DAS systems (Distributed Acoustic Sensing) allows measuring the cable axial strain related to passing seismo-acoustic waves, at any point along the fiber and over tens of kilometers of cable.

However, compared to traditional seismic sensors, the instrumental response of DAS remains unclear, and there is in particular a critical need to better understand how the measurements are influenced by the nature of the fiber optic cable and its coupling to the ground or medium under study. To explore this question, we present results from two active seismic campaigns carried out in the low-noise underground tunnel of the LSBB (Laboratoire Souterrain à Bas Bruit) in southeastern France.

We recorded multiple active sources (TNT detonations and hammer shots) with 10 km and 2 km long underground optical fiber set-ups, as well as with conventional seismic sensors. Across both campaigns, we tested different optical fiber cable designs and different coupling conditions (sealed, weighted with sandbags, freely laid) installed in parallel. This experimental setup provides a unique opportunity to examine in detail and quantify the variations in the strain signals recovered from DAS data.

Preliminary observations reveal significant discrepancies in the recorded data depending on the coupling conditions. The characteristics of the deployed source result in a signal that is primarily concentrated in the high-frequency range, for which the sealed fiber does not necessarily exhibit a significantly improved response. Additionally, the acoustic wave generated by the hammer-shot echo, propagating through the air, is strongly amplified in all cables covered by sandbags. We propose that the sandbags increase the interaction area between that signal and the cables, thereby enhancing reverberation.

Furthermore, we observe systematic differences in the maximum amplitudes recorded by the different cables tested, with the telecom cable consistently exhibiting lower amplitudes than the other, specialized cables, suggesting a lower sensitivity. However, this reduction is relatively modest and, combined with the substantially lower cost of telecom cables, indicates that they remain a cost-efficient alternative for certain experiments. Additional observations and detailed analyses from this study will be presented.

Keywords: Coupling, fiber optics, DAS measurements, strain rate, active seismic, LSBB.

How to cite: Carrillo-Barra, V., Mercerat, D., Brémaud, V., Sladen, A., Sèbe, O., Vallage, A., and Ampuero, J.-P.: Unveiling type of fiber and coupling conditions effects on geophysical DAS measurements, results from underground experiments, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-14230, https://doi.org/10.5194/egusphere-egu26-14230, 2026.

EGU26-15142 | ECS | Orals | SM3.4

Toward Global-Scale Submarine Fiber Sensing: Early Results from Multispan DAS at the OOI Regional Cabled Array 

Zoe Krauss, Bradley Lipovsky, Mikael Mazur, William Wilcock, Nicolas Fontaine, Roland Ryf, Alex Rose, William Dientsfrey, Shima Abadi, Marine Denolle, and Renate Hartog

A recently developed multispan distributed acoustic sensing (multispan-DAS) technique from Nokia Bell Labs enables strain measurements along submarine fiber-optic cables across multiple repeater-separated spans. By leveraging the high-loss loopback couplers within optical repeaters, this technique overcomes the long-standing limitation of conventional DAS to the first span of a repeated cable, typically < 100 km offshore. Dense, continuous arrays of seafloor strain sensors can now extend to hundreds or thousands of kilometers. This technique has been used to successfully record the 2025 M8.8 Kamchatka earthquake and tsunami at teleseismic range with a spatial resolution of ~100 m across 4400 km of a repeated submarine cable.

In November 2025, the multispan-DAS system from Nokia Bell Labs was deployed for three months on both repeated submarine cables of the Ocean Observatories Initiative Regional Cabled Array (OOI RCA) offshore Oregon. The deployment traverses the Cascadia subduction zone forearc and extends approximately 500 km offshore to Axial Seamount. During this period, the first span of the southern cable was simultaneously interrogated using a multiplexed conventional DAS unit, while data continued to stream from co-located cabled seismometers, hydrophones, and other oceanographic instruments on the OOI RCA.

The multispan-DAS system recorded a regional earthquake beyond the first repeater of both cables during testing as well as the ambient seafloor seismic wavefield, demonstrating sensitivity to a broad range of seismic, oceanographic, and acoustic signals. These observations provide a unique opportunity to directly compare multispan-DAS measurements with conventional DAS and established seafloor instrumentation across a large spatial extent. The resulting dataset will be publicly released following documentation and quality control. We will present preliminary results characterizing the noise floor, sensitivity, and signal fidelity of multispan-DAS relative to co-located sensors, and examine the consistency of observed seismic and oceanographic signals across measurement modalities. These results will highlight the potential of multispan-DAS for applications including routine earthquake monitoring, earthquake early warning, and broader seafloor observation, and represent an important step toward establishing this technique as a new tool for the seismological and oceanographic communities.

How to cite: Krauss, Z., Lipovsky, B., Mazur, M., Wilcock, W., Fontaine, N., Ryf, R., Rose, A., Dientsfrey, W., Abadi, S., Denolle, M., and Hartog, R.: Toward Global-Scale Submarine Fiber Sensing: Early Results from Multispan DAS at the OOI Regional Cabled Array, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-15142, https://doi.org/10.5194/egusphere-egu26-15142, 2026.

EGU26-15227 | Posters on site | SM3.4

Enhancing Earthquake Location in the Central Apennines (Italy): A Hybrid Approach Combining Arrivals from Line-Sensor Telecom Fiber Interferometry and Traditional Point-sensors 

Diana Latorre, Cecilia Clivati, André Herrero, Anthony Lomax, Raffaele Di Stefano, Simone Donadello, Aladino Govoni, Maurizio Vassallo, and Lucia Margheriti

The integration of existing telecommunication fiber-optic infrastructure into seismic monitoring networks offers a transformative opportunity to densify observations in seismically active regions. We present the results of a multi-year monitoring experiment (2021–2026) utilizing a 39-km telecom fiber link from the Italian telecommunication company Open Fiber between Ascoli Piceno and Teramo in the Central Apennines, Italy. The system employs an ultra-stable laser to measure seismically induced deformation of the fiber, operating on a dedicated wavelength in coexistence with commercial data traffic.

A significant challenge in utilizing fiber-optic data for earthquake location is the transition from traditional point-sensor geometry to distributed sensing. To address this, we implemented a hybrid localization approach using a modified version of the NonLinLoc (NLL) algorithm. We move beyond traditional discrete measurements (point sensors) by treating the cable as a continuous "line sensor." In keeping with the NLL algorithm, the most effective strategy is to translate both point and line geometries into a unified framework of 3D travel-time maps. Once the sensors are translated into these maps, their combined use for location becomes independent of the sensor type, allowing for a seamless merging of traditional seismic station data and fiber-optic picks.
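To make the unified point/line framework concrete, here is a deliberately simplified sketch, not the authors' NonLinLoc implementation: a uniform velocity stands in for precomputed 3D travel-time maps and an L2 misfit stands in for NLL's probabilistic likelihood, but stations and fiber channels are treated identically once each has a travel time per grid node. All names and values are illustrative.

```python
import numpy as np

VP = 6.0  # assumed uniform P-wave velocity, km/s

def travel_times(grid_xyz, receiver_xyz, v=VP):
    """Straight-ray travel times (s) from every grid node to one receiver."""
    return np.linalg.norm(grid_xyz - receiver_xyz[None, :], axis=1) / v

def locate(grid_xyz, receivers_xyz, picks, origin_trials):
    """Grid-search location; receivers_xyz stacks stations AND fiber channels."""
    tt = np.stack([travel_times(grid_xyz, r) for r in receivers_xyz], axis=1)
    best = (np.inf, None, None)
    for t0 in origin_trials:
        misfit = np.sum((picks[None, :] - (t0 + tt)) ** 2, axis=1)
        i = int(np.argmin(misfit))
        if misfit[i] < best[0]:
            best = (misfit[i], grid_xyz[i], t0)
    return best

# Toy geometry: 3 stations plus 5 channels along a straight cable (km).
stations = np.array([[0.0, 0.0, 0.0], [30.0, 5.0, 0.0], [10.0, 25.0, 0.0]])
fiber = np.column_stack([np.linspace(5, 25, 5), np.full(5, 10.0), np.zeros(5)])
receivers = np.vstack([stations, fiber])
true_src, true_t0 = np.array([15.0, 12.0, 8.0]), 1.0
picks = true_t0 + np.linalg.norm(receivers - true_src, axis=1) / VP

gx, gy, gz = np.meshgrid(np.linspace(0, 30, 31), np.linspace(0, 30, 31),
                         np.linspace(0, 15, 16), indexing="ij")
grid = np.column_stack([gx.ravel(), gy.ravel(), gz.ravel()])
_, xyz, t0 = locate(grid, receivers, picks, np.linspace(0.0, 2.0, 41))
print(xyz, t0)  # recovers ~[15, 12, 8] km and t0 ~ 1.0 s on this grid
```

Adding a fiber channel is then no different from adding a station, which is the point of the travel-time-map abstraction described above.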

We applied this methodology to the real seismic catalog recorded from the fiber's installation in mid 2021 until January 2026 in the Ascoli-Teramo area, a region where the Italian seismic network is relatively sparse. Specifically, we analyzed signals from: 1) several small seismic sequences occurring at short distances (up to approximately 20 km) from the fiber cable, including the Civitella del Tronto (TE) sequence that followed a Mw 3.9 event (September 22, 2022); and 2) more distant earthquakes (ranging from approximately 20 to 50 km from the fiber) with local magnitudes exceeding ML 2.5, distributed along the Central Apennines axis. For events where the fiber signal allowed for the correct identification of P- and S-wave arrival times, we applied the NLL algorithm using the integrated network. In this work, we present several of these examples and associated tests to discuss how the inclusion of fiber-derived arrival times can provide further hypocentral constraints. This study aims to highlight the scalability of fiber interferometry combined with non-linear inversion as a robust tool for seismic surveillance in populated and high-risk tectonic environments.

How to cite: Latorre, D., Clivati, C., Herrero, A., Lomax, A., Di Stefano, R., Donadello, S., Govoni, A., Vassallo, M., and Margheriti, L.: Enhancing Earthquake Location in the Central Apennines (Italy): A Hybrid Approach Combining Arrivals from Line-Sensor Telecom Fiber Interferometry and Traditional Point-sensors, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-15227, https://doi.org/10.5194/egusphere-egu26-15227, 2026.

EGU26-16522 | ECS | Posters on site | SM3.4

Detecting Microseismic Events Using Cross-Fault Borehole DAS 

Chih-Chieh Tseng, Hao Kuo-Chen, Li-Yu Kan, Sheng-Yan Pan, Wei-Fang Sun, Chin-Shang Ku, and Ching-Chou Fu

Microseismic events account for the majority of seismicity; however, sparse station spacing hinders the detection of such small events. In recent decades, distributed acoustic sensing (DAS) has shown its power to provide denser spatial sampling in an array sense, resolving weak signals that are often missed by conventional seismometers. In eastern Taiwan, the Chihshang fault plays a key role in accommodating deformation along the Longitudinal Valley fault system, where frequent small earthquakes and fault creep occur. In this study, we develop a new workflow for microseismic event detection by integrating borehole DAS data with the deep-learning-based automatic phase-picking model PhaseNet. An event is declared when more than 75% of channels record P-wave picks and more than 30% record S-wave picks within a 1-s time window. We analyzed three months of DAS data from March to July 2025. As a result, we identified approximately twice as many events as those reported in a deep-learning-based earthquake catalog constructed using only surface seismic stations. These results suggest that borehole DAS provides an effective complementary constraint for detecting earthquake-generated wave trains. This processing workflow can significantly improve the detection capability for microseismic events, leading to higher seismic catalog completeness and a finer image of fault structure near the Chihshang region.
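A minimal sketch of the channel-voting rule just described, under assumed inputs: p_picks and s_picks hold one PhaseNet pick time (in seconds) per channel, NaN where no pick was made. For simplicity, P and S votes are counted in the same 1-s window here; the actual workflow may treat them differently.

```python
import numpy as np

def declare_events(p_picks, s_picks, window=1.0, p_frac=0.75, s_frac=0.30):
    """Return start times of windows satisfying the P/S channel-vote rule."""
    n = len(p_picks)
    t_all = np.concatenate([p_picks[~np.isnan(p_picks)],
                            s_picks[~np.isnan(s_picks)]])
    events = []
    for t0 in np.unique(np.floor(t_all / window) * window):
        n_p = np.sum((p_picks >= t0) & (p_picks < t0 + window))
        n_s = np.sum((s_picks >= t0) & (s_picks < t0 + window))
        if n_p > p_frac * n and n_s > s_frac * n:
            events.append(t0)
    return events

p = np.array([10.02, 10.05, np.nan, 10.03, 10.04])
s = np.array([10.60, np.nan, np.nan, 10.55, np.nan])
print(declare_events(p, s))  # -> [10.0]: 4/5 P picks and 2/5 S picks agree
```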

How to cite: Tseng, C.-C., Kuo-Chen, H., Kan, L.-Y., Pan, S.-Y., Sun, W.-F., Ku, C.-S., and Fu, C.-C.: Detecting Microseismic Events Using Cross-Fault Borehole DAS, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-16522, https://doi.org/10.5194/egusphere-egu26-16522, 2026.

EGU26-16913 | ECS | Posters on site | SM3.4

Cross-validating Distributed Acoustic Sensing and Seismic Records for Shallow Ground Motion and Near-Surface Properties 

Marco Pascal Roth, Xiang Chen, Gian Maria Bocchini, and Rebecca M Harrington

Distributed Acoustic Sensing (DAS) offers dense spatial sampling of ground motion and has the potential to support detailed seismic monitoring and constrain shallow velocity structure. In this study, we analyze ground motion from two shallow tectonic earthquakes in the Roerdalen region (Netherlands–Germany border), with local magnitudes ML 2.2 (2025-09-09) and ML 1.9 (2025-09-15) and hypocentral depths of ~15 km, recorded by broadband seismometers and a fiber-optic interrogator, to quantify the differences in sensitivity and magnitude estimates from each type of instrumentation. The DAS recordings consist of ground strain sampled at 250 Hz on a 30 km telecommunications dark fiber with a channel spacing of 5 m and a gauge length of 50 m. Seismometer recordings consist of ground velocity sampled at 100 Hz on a Trillium Compact 20 s seismometer with a flat frequency response up to ~100 Hz. The two sensor types recorded the earthquakes at minimum epicentral distances of ~20 and ~10 km, respectively. We will present results showing the differences in frequency sensitivity, conversions to ground displacement, and estimated magnitudes, as well as an interpretation of differences based on the shallow ground velocity.

We first convert the DAS recordings, initially measured in strain, to ground displacement using a semblance-based approach, and likewise convert the conventional seismic recordings, initially recorded in velocity. We make a quantitative comparison of waveform characteristics, including amplitude-frequency dependence and its spatial variability, for point-wise seismic sensor measurements vs. DAS measurements. We will present an interpretation of the results in the context of the geological setting to identify spatial variations that cannot be resolved by the sparse seismic network alone. As DAS measurements reveal significant lateral variability in ground-motion amplitudes that suggests a strong influence of near-surface conditions (density) and/or local coupling effects, we will also quantify the relative influence of each using a comparison of strain and converted ground displacement. In addition, we explore approaches to estimate earthquake magnitude from DAS data by relating observed strain amplitudes to ground-motion parameters derived from the co-located seismometer. Preliminary results suggest that DAS-based observations capture the relative scaling between the two events and show promise for magnitude estimation when calibrated against conventional seismic sensors. Our findings demonstrate the value of DAS for high-resolution observations of near-surface properties and their influence on earthquake waveforms. They also highlight the potential of DAS to complement existing seismic networks for monitoring small-magnitude earthquakes.
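For orientation, a hedged sketch of the conversion step: the study estimates apparent phase velocity with a semblance analysis, whereas here c_app is simply prescribed, and the plane-wave relation v = -c_app * strain (particle velocity projected on the fiber) is used before integrating to displacement. Sign conventions vary between interrogators, so the minus sign is itself an assumption.

```python
import numpy as np

def strain_to_displacement(strain, dt, c_app):
    """strain: (n_channels, n_samples) axial strain -> displacement (m)."""
    velocity = -c_app * strain                         # plane-wave relation
    velocity -= velocity.mean(axis=1, keepdims=True)   # crude detrend
    return np.cumsum(velocity, axis=1) * dt            # time integration

dt, c_app = 1.0 / 250.0, 3000.0   # 250 Hz sampling, 3 km/s apparent velocity
strain = 1e-9 * np.random.default_rng(1).standard_normal((10, 2500))
disp = strain_to_displacement(strain, dt, c_app)
```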

How to cite: Roth, M. P., Chen, X., Bocchini, G. M., and Harrington, R. M.: Cross-validating Distributed Acoustic Sensing and Seismic Records for Shallow Ground Motion and Near-Surface Properties, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-16913, https://doi.org/10.5194/egusphere-egu26-16913, 2026.

EGU26-17223 | ECS | Orals | SM3.4

Reimagining Seismic Array Processing with Fibre-Optic DAS: The NORFOX Array 

Antoine Turquet, Andreas Wuestefeld, Alan Baird, Kamran Iranpour, and Ravn Rydtun

NORFOX is a purpose-built fibre-optic Distributed Acoustic Sensing (DAS) installation located in southeastern Norway, approximately 150 km north of Oslo. Beyond its primary function of monitoring earthquakes and explosions, the system captures a broad range of other signals, including aircraft, thunder, and atmospheric phenomena. A key advantage of NORFOX is its overlap with the co-located NORES seismometer array, which enables direct calibration of DAS measurements against conventional seismic recordings and supports method development under well-constrained ground-truth conditions. In this contribution, we introduce the NORFOX infrastructure and array layout, discuss key design choices, and summarize practical strengths and limitations using representative examples.

NORFOX is additionally equipped with all-sky cameras operated by Norsk Meteor Nettverk for meteor monitoring, which also capture nearby lightning activity. Lightning locations provide independent timing and spatial context that help interpret coincident acoustic signatures observed on the fibre. Together with weather information, noise-floor characterization, and optical monitoring, these observations provide a benchmark dataset for calibrating both existing and future DAS installations.

We also present in-house approaches to noise reduction, signal interpretation, data-volume management, and edge computing. Furthermore, we show and interpret signals from nearby quarry blasts, regional earthquakes, thunderstorms, and aircraft. Finally, we demonstrate and evaluate DAS array-processing methodologies for earthquake and explosion monitoring at NORFOX. Overall, dedicated research fibre arrays such as NORFOX provide a controlled environment to develop, benchmark, and calibrate DAS-based monitoring workflows in combination with co-located seismic instrumentation.

How to cite: Turquet, A., Wuestefeld, A., Baird, A., Iranpour, K., and Rydtun, R.: Reimagining Seismic Array Processing with Fibre-Optic DAS: The NORFOX Array, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-17223, https://doi.org/10.5194/egusphere-egu26-17223, 2026.

EGU26-17496 | ECS | Orals | SM3.4

Privacy Concerns of DAS: Eavesdropping using Neural Network Transcription 

Jack Lee Smith, Karen Lythgoe, Andrew Curtis, Harry Whitelam, Dominic Seager, Jessica Johnson, and Mohammad Belal

Distributed acoustic sensing (DAS) has transformed geophysical, environmental, and infrastructure monitoring. However, the increasing bandwidth and sensitivity of modern interrogators now extend into the audio range, introducing a material privacy risk. Here we demonstrate, through in-situ experiments on live fibre deployments, that human speech, music, and other acoustic signals can be recovered under certain acquisition conditions.

We show that intelligible speech can be accurately recovered and automatically transcribed using neural networks. Experiments were conducted on both linear and spooled fibre geometries, deployed as part of an ongoing geophysical survey. We find that coiled layouts, which are common in access networks (e.g., slack loops or storage spools), exhibit enhanced sensitivity to incident acoustic waves relative to linear layouts. Modelling indicates this arises from increased broadside sensitivity and reduced destructive interference for longer-wavelength acoustic fields over the gauge length. We systematically assess how acquisition parameters, such as source-fibre offset, influence the signal-to-noise ratio, spectral fidelity, and speech intelligibility of recorded audio. We further show that neural-network-based denoising strategies improve the intelligibility and fidelity of recorded audio, thereby exacerbating privacy concerns.

These findings demonstrate that, under appropriate interrogation, existing fibre infrastructure, including fibre-to-the-premises links, smart-city infrastructure, and research cables, can function as a pervasive, passive wide-area acoustic receiver, creating a pathway for inadvertent or malicious eavesdropping. We discuss practical mitigation strategies spanning survey design, interrogation configuration, and data governance, and argue that incorporating privacy-by-design into deployment and processing is crucial to leverage the unique benefits of DAS while managing emerging ethical and legal risks.

How to cite: Smith, J. L., Lythgoe, K., Curtis, A., Whitelam, H., Seager, D., Johnson, J., and Belal, M.: Privacy Concerns of DAS: Eavesdropping using Neural Network Transcription, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-17496, https://doi.org/10.5194/egusphere-egu26-17496, 2026.

EGU26-17601 | Posters on site | SM3.4

Ambient signals analysis and cable coupling characterisation from a DAS experiment offshore South Brittany 

Florian Le Pape, Stephan Ker, Shane Murphy, Philippe Schnurle, Mikael Evain, Pascal Pelleau, Alexis Constantinou, and Patrick Jousset

As fibre-sensing measurements on submarine fibre optic cables become more widely used in geophysical studies, new challenges arise that demand a deeper understanding of the collected data. In particular, characterisation of cable coupling to the seafloor as well as the response of local sediment under the cables is needed for a better quantification of external physical phenomena by fibre-sensing measurements.

FiberSCOPE is a research project aiming to implement an intelligent seabed monitoring system for studies in seismology, oceanography and the positioning of acoustic manmade sources (ships, AUVs, etc.) using existing submarine fiber-optic cables. One of the main objectives of the project is to define tools for remote evaluation of fibre optic cable coupling with the seabed using both Brillouin Optical Time Domain Reflectometry (BOTDR) and Distributed Acoustic Sensing (DAS) measurements of ambient noise.

Within the project’s framework, passive and active seismic experiments were performed during March-April 2025 offshore south Brittany. The experiment included acquiring DAS measurements on the electro-optic cable connecting mainland France to Groix island, combined with the deployment of 10 seismic nodes near the cable. Preliminary results show that although ocean waves dominate the DAS signals, ocean-wave-induced microseism events can be extracted as they fluctuate over the 18 days of passive acquisition. Interestingly, despite the short distance covered by the offshore portion of the cable, spatial variations of those events are also observed and appear consistent between cable and node measurements. Finally, both ocean-wave and microseism signals are used to further quantify the cable coupling with the seafloor and the cable response connected to changes in seafloor structure.

How to cite: Le Pape, F., Ker, S., Murphy, S., Schnurle, P., Evain, M., Pelleau, P., Constantinou, A., and Jousset, P.: Ambient signals analysis and cable coupling characterisation from a DAS experiment offshore South Brittany, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-17601, https://doi.org/10.5194/egusphere-egu26-17601, 2026.

EGU26-18270 | ECS | Posters on site | SM3.4

Assessing the Seismic Sensitivity on a Submarine Optical Fiber Link between Malta and Catania (Sicily, Italy) 

Daniele Caruana, Matthew Agius, André Xuereb, Cecilia Clivati, Simone Donadello, Kristian Grixti, and Irena Schulten

Submarine regions remain sparsely instrumented, limiting the spatial coverage of seismic monitoring in offshore environments. Recent studies have shown that optical fibers, including those actively used for telecommunications, can detect ground motion through laser interferometry. We present an ongoing evaluation of the seismic sensitivity of a 260 km optical fiber link between Malta and Catania, predominantly submerged in the Ionian Sea and continuously carrying internet traffic.

The optical-fiber recordings were analysed for signals corresponding to the arrival times of ~1500 earthquakes listed in the INGV catalogue between January 2023 and March 2025. The waveforms were manually inspected for seismic arrivals and compared to seismic data recorded on nearby land stations on Malta and Sicily. Earthquakes ranging from magnitude 1.4 to 7.9 and originating from distances of 3 to 16,000 km were successfully observed. Each event was assigned a category according to signal clarity and confidence, ranging from clearly visible arrivals (category A) to non-detectable signals (category E). Preliminary results indicate that <10% of events fall into category A, 10-15% into category B, 20-25% into category C, 20-25% into category D, and >30% into category E, providing an initial characterisation of the optical-fiber cable’s sensitivity. While a majority of observations fall within the lower-quality categories (D-E), at least 35% of the analysed events remain robustly identifiable, highlighting the contribution of the submarine fiber to existing land-based seismic networks and extending observational coverage in submarine regions. The sensitivity of the fiber strongly depends on the earthquake magnitude-distance relationship, as expected. We compare our results with previously reported measurements on terrestrial fibers (Donadello et al., 2024), and show that the Malta-Catania submarine cable can be a reliable new seismic tool for a submarine environment, although it records fewer high-confidence events than onshore systems.

Noise in the fiber exhibits correlations with wind and with daytime anthropogenic activity. This reduces the signal-to-noise ratio and limits the detectability of earthquakes with M<2. Ongoing data acquisition will further refine sensitivity estimates and improve the characterisation of the fiber’s seismic performance.

This study is part of the Horizon Europe–funded SENSEI project, which aims to transform fibre-optic communication networks into distributed sensors for detecting environmental and geophysical signals, improving monitoring and early warning across Europe (Project ID 101189545).

How to cite: Caruana, D., Agius, M., Xuereb, A., Clivati, C., Donadello, S., Grixti, K., and Schulten, I.: Assessing the Seismic Sensitivity on a Submarine Optical Fiber Link between Malta and Catania (Sicily, Italy), EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-18270, https://doi.org/10.5194/egusphere-egu26-18270, 2026.

EGU26-19501 | ECS | Posters on site | SM3.4

 Investigating subsea cable sensing for monitoring of marine life, detection of earthquakes and tsunamis with Research and Education network infrastructure 

Shima Ebrahimi, Layla Loffredo, Alexander van den Hil, and Richa Malhotra

Recent advances in fibre-optic sensing enable subsea telecommunication cables to function as large-scale, distributed environmental sensors. Techniques such as Distributed Acoustic Sensing (DAS), State of Polarisation (SOP), and interferometry transform optical fibres into continuous arrays capable of detecting seismic, acoustic, and environmental signals, offering a complementary, future-proof approach to sparsely deployed subsea instruments. This study, conducted by SURF, the Dutch National Research and Education Network (NREN), assesses the feasibility of leveraging existing and future subsea fibre-optic network infrastructure for scientific sensing within the research ecosystem. The analysis is based on an extensive data collection effort, including 55 semi-structured interviews with international experts across geoscience, marine science, networking, and technology domains, as well as a targeted survey of research institutions, which received 20 responses from 42 invited experts. Results indicate that dry-plant sensing techniques are sufficiently mature for near-term applications, with DAS enabling kilometre-scale seismic and acoustic monitoring, while SOP and interferometry support long-range sensing over thousands of kilometres. Wet-plant approaches, including SMART cables and Fiber Bragg Grating sensors, provide high-precision measurements at extreme depths but remain limited to new cable deployments due to cost and coordination requirements. Strong alignment is observed with current needs in seismology and geophysics, particularly for offshore seismic monitoring and subsurface deformation studies, while applications in oceanography and marine biology remain exploratory. Data volume, standardisation, and real-time processing emerge as key challenges. Research networking organisations play a critical role in enabling scalable, network-centric earth and ocean observation.

How to cite: Ebrahimi, S., Loffredo, L., van den Hil, A., and Malhotra, R.:  Investigating subsea cable sensing for monitoring of marine life, detection of earthquakes and tsunamis with Research and Education network infrastructure, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-19501, https://doi.org/10.5194/egusphere-egu26-19501, 2026.

EGU26-20683 | Orals | SM3.4

Distributed acoustic fibre sensing for large scientific infrastructures: ocean microseism at the European XFEL 

Celine Hadziioannou, Erik Genthe, Svea Kreutzer, Holger Schlarb, Markus Hoffmann, Oliver Gerberding, and Katharina-Sophie Isleif and the WAVE initiative

The WAVE seismic network is a dense, multi-instrument monitoring system deployed on a scientific campus in Hamburg, Germany. It combines seismometers, geophones, and a 19 km distributed acoustic sensing fiber loop installed in existing telecommunication infrastructure. The network covers large-scale research facilities including the European X-ray Free-Electron Laser (EuXFEL) and particle accelerators at DESY. Its primary goal is to characterise natural and anthropogenic ground vibrations and to quantify how these signals couple into ultra-precise measurement infrastructures that are limited by environmental noise. Beyond local applications, WAVE serves as a testbed for fibre-optic sensing concepts relevant to fundamental physics, including seismic and strain monitoring for gravitational wave detection.

The EuXFEL is a femtosecond X-ray light source designed for ultrafast imaging and spectroscopy. Its performance depends critically on precise timing and synchronisation of the electron bunches along the linear accelerator. Measurements of bunch arrival times reveal significant noise contributions in the 0.05–0.5 Hz frequency band, with peak-to-peak timing jitter of up to 25 femtoseconds. Using distributed acoustic sensing data, we demonstrate that this jitter is largely explained by secondary ocean-generated microseism, which is identified as a significant limiting factor for stable, high-precision XFEL operation in the sub-Hz regime. 
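For readers outside seismology (classical background, the Longuet-Higgins mechanism, rather than a finding of this study): secondary microseisms are excited by opposing ocean wave trains and appear at roughly twice the ocean-wave frequency,

f_{\mathrm{seismic}} \approx 2\, f_{\mathrm{ocean}},

so the 0.05–0.5 Hz band above corresponds to ocean swell with periods of roughly 4–40 s.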

To assess the potential for prediction and mitigation, we investigate whether ocean wave activity in the North Atlantic can be used to anticipate microseismic signals observed at the EuXFEL site. Output from the WAVEWATCH III ocean wave model is used to generate synthetic Rayleigh wave spectrograms with the WMSAN framework. These are compared to seismic observations at the EuXFEL injector. By subdividing the North Atlantic into source regions, we evaluate their relative contributions to the observed seismic wavefield. While absolute amplitude prediction remains challenging, the modelling reproduces key spectral characteristics and temporal variability.

Our results demonstrate that combining dense fibre-optic sensing with physics-based ocean wave modelling provides a framework to characterise microseismic noise and assess its limiting impact on high-precision experiments. This approach supports noise mitigation efforts at high-precision accelerator facilities and is directly relevant to future ground-based gravitational wave detectors.

How to cite: Hadziioannou, C., Genthe, E., Kreutzer, S., Schlarb, H., Hoffmann, M., Gerberding, O., and Isleif, K.-S. and the WAVE initiative: Distributed acoustic fibre sensing for large scientific infrastructures: ocean microseism at the European XFEL, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-20683, https://doi.org/10.5194/egusphere-egu26-20683, 2026.

EGU26-21683 | Posters on site | SM3.4

Leveraging Railway Fiber-Optic Networks with DAS: Multi-Scale Opportunities 

Pascal Edme, Daniel Bowden, Frederick Massin, Anne Obermann, Sanket Bajad, John Clinton, and James Fern

Distributed Acoustic Sensing (DAS) enables the acquisition of seismic data with unrivalled spatio-temporal resolution over very large distances. Railway fiber-optic networks, originally deployed for telecommunications, offer cost-effective opportunities to monitor and characterize the subsurface at multiple scales. Here, we present a project conducted with the Swiss Federal Railways (SBB) involving the interrogation of dark fibers running along two perpendicular railway tracks, each approximately 40 km long. Data were acquired over three months using a dual-channel Sintela Onyx interrogator, with variable acquisition setups (spatial sampling, gauge length, and sampling frequency) tailored to different scientific objectives described below.

The primary objective was to assess the feasibility of using pre-existing telecommunications fibers for structural track-bed monitoring, specifically shallow subsurface Vs characterization through inversion of Rayleigh-wave dispersion curves (MASW). This requires high spatial sampling and short gauge length (3 m and 6 m, respectively) to capture short wavelengths. Several ambient noise interferometry strategies were tested, including stacking (1) all available time windows with various preprocessing schemes, (2) only time windows exhibiting strong directional wavefields, and (3) a coherent-source subsampling approach based on a Symmetric Variational Autoencoder to identify time windows contributing the most useful seismic energy. Unsurprisingly, trains constitute the most energetic and reliable seismic sources, from which dense Vs profiles can be derived, demonstrating the effectiveness of both the processing and inversion workflows.

Beyond shallow characterization, the experiment also yielded valuable data to complement dense nodal arrays deployed near Lavey-les-Bains, a site of significant geothermal interest and complex geological structure. The main objectives in this context are to (1) help characterize the subsurface over the first kilometers, (2) investigate its relationship to geothermal circulation, (3) evaluate the joint use of dense nodal and DAS data for imaging, and (4) establish a high-quality, open-access dataset to support the development of next-generation passive imaging methodologies.

Finally, at an even larger scale, the experiment provided the opportunity to explore how DAS data can be leveraged within the operational Swiss Seismological Service (SED) network and to assess whether DAS can augment standard seismicity catalogues. Lower-resolution data (100 m spatial sampling, 200 Hz sampling frequency) were streamed and converted in real time into standard seismic formats (miniSEED and StationXML), demonstrating the feasibility of integrating DAS data into SeisComP for both automatic and manual processing.
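As an illustration of that real-time conversion step (a hedged sketch, not the SED's actual converter, assuming ObsPy is available; the network/station/channel codes are made up for the example), one decimated DAS channel can be wrapped into miniSEED so SeisComP-style tooling can ingest it:

```python
import numpy as np
from obspy import Trace, Stream, UTCDateTime

# One minute of synthetic data standing in for a decimated DAS channel.
data = np.random.default_rng(2).standard_normal(200 * 60).astype("float32")
tr = Trace(data=data)
tr.stats.network = "XX"                  # hypothetical network code
tr.stats.station = "DAS01"               # one fiber channel mapped to a 'station'
tr.stats.channel = "HSF"                 # assumed code for strain-rate data
tr.stats.sampling_rate = 200.0
tr.stats.starttime = UTCDateTime("2025-01-01T00:00:00")
Stream([tr]).write("das_channel.mseed", format="MSEED")
```

Mapping fiber channels onto conventional station/channel codes is what lets standard StationXML metadata and SeisComP pickers operate on DAS data unchanged.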

We will present the dataset along with key results relevant to the three purposes outlined above.

We acknowledge Allianz Fahrbahn (grant agreement No. 100 072 202) for enabling this study.

How to cite: Edme, P., Bowden, D., Massin, F., Obermann, A., Bajad, S., Clinton, J., and Fern, J.: Leveraging Railway Fiber-Optic Networks with DAS: Multi-Scale Opportunities, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-21683, https://doi.org/10.5194/egusphere-egu26-21683, 2026.

EGU26-247 | Posters on site | HS3.4

Addition of Process-Based Stream Temperature Modeling Capabilities to MODFLOW 6 

Eric Morway, Katie Fogg, Alden Provost, Christian Langevin, Joseph Hughes, and Martijn Russcher

MODFLOW is a well-known and widely used groundwater flow simulator. Characteristics that have historically defined MODFLOW remain in place: it is open-source, freely available, well-documented, and intuitive. A complete rewrite of MODFLOW in 2017 has facilitated the adoption of several new model types embedded directly into the MODFLOW framework. In addition to simulating groundwater flow, MODFLOW 6 now also includes solute transport, particle tracking, and a new heat-transport model called the Groundwater Energy (GWE) transport model. Many other enhancements are actively being developed. As with all model types available within the MODFLOW 6 hydrologic simulator, the GWE model leverages the design concept commonly referred to as packages – modules that represent specific features of the hydrologic system being modeled. For example, the Streamflow Routing (SFR) package can be activated to simulate flow in streams. If desired, users can also simulate heat transport within a stream network by activating the Streamflow Energy (SFE) transport package. The SFE package simulates advective heat transport within the stream network while also accounting for advective and conductive heat exchange with the underlying groundwater system. Although the initial release of GWE offered basic heat-transport functionality in stream networks through the SFE package, detailed representation of heat exchange between stream reaches and the atmosphere was not included. However, recent SFE development efforts are focused on adding functionality to represent heat exchange with the atmosphere. The new processes by which heat may be exchanged with the atmosphere are short- and long-wave radiation and sensible and latent heat fluxes. When finished, the new process-based stream temperature modeling capabilities will work with the other MODFLOW features, including the application programming interface (API), parallel simulation, the input data model (IDM), and support within the popular FloPy Python library.
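For reference, a common textbook decomposition of stream-atmosphere heat exchange (the exact SFE formulation may differ in detail) writes the net flux as

Q_{net} = Q_{sw} + Q_{lw} + Q_h + Q_e,

where Q_sw is net shortwave radiation, Q_lw is net longwave radiation, Q_h is the sensible heat flux, and Q_e is the latent heat flux, each typically parameterized from meteorological forcing.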

How to cite: Morway, E., Fogg, K., Provost, A., Langevin, C., Hughes, J., and Russcher, M.: Addition of Process-Based Stream Temperature Modeling Capabilities to MODFLOW 6, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-247, https://doi.org/10.5194/egusphere-egu26-247, 2026.

EGU26-1167 | ECS | Posters on site | HS3.4

Enhancing Streamflow Simulations Through Input Data Denoising 

Injila Hamid and Vinayakam Jothiprakash

Hydrological models are vital for understanding water resources and their responses to environmental and climatic changes, but their accuracy depends strongly on input data quality. This study evaluates how noise reduction in meteorological inputs influences the performance of the SWAT hydrological model for the lower Columbia River basin. Wavelet Transform (WT) was applied for partial denoising, while Singular Spectrum Analysis (SSA) was used for both partial and full noise removal. SSA allows extraction of trend, periodic, and noise components individually from time series data. Results indicate that partial denoising using WT significantly improves model performance, increasing the correlation coefficient (r) and Nash–Sutcliffe Efficiency (NSE) by 2 to 5%, Kling-Gupta Efficiency (KGE) by 16%, and reducing RSR by 4%, along with a notable reduction in PBIAS (from −4.7 to +1.3). The partially denoised WT model achieved r = 0.91, NSE = 0.81, PBIAS = 1.30, KGE = 0.88, and RSR = 0.45, outperforming both the base and fully denoised models. The comparative analysis shows that completely removing noise offers limited benefits and may suppress natural variability, while partial denoising provides an optimal balance between data reliability and model precision. These findings highlight the importance of appropriate input-data preprocessing in improving hydrological model performance and reducing uncertainty in water resource assessments.
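To illustrate what "partial denoising" of an input series can look like in practice (a minimal sketch assuming the PyWavelets package; the wavelet, decomposition level, and threshold scaling are assumptions for the example, not the study's exact settings):

```python
import numpy as np
import pywt

def partial_denoise(x, wavelet="db4", level=3, keep=0.7):
    """Soft-threshold detail coefficients; keep the approximation untouched."""
    coeffs = pywt.wavedec(x, wavelet, level=level)
    out = [coeffs[0]]
    for d in coeffs[1:]:
        thr = keep * np.median(np.abs(d)) / 0.6745   # scaled MAD threshold
        out.append(pywt.threshold(d, thr, mode="soft"))
    return pywt.waverec(out, wavelet)[: len(x)]

# ten years of synthetic daily forcing standing in for a meteorological input
series = np.random.default_rng(3).gamma(2.0, 2.0, size=3650)
clean = partial_denoise(series)
```

The "keep" factor controls how aggressively detail scales are attenuated; full denoising would correspond to discarding the detail coefficients entirely, which, as the results above indicate, can suppress natural variability.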

How to cite: Hamid, I. and Jothiprakash, V.: Enhancing Streamflow Simulations Through Input Data Denoising, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-1167, https://doi.org/10.5194/egusphere-egu26-1167, 2026.

EGU26-1351 | ECS | Orals | HS3.4

Geostatistical active learning for expanding monitoring networks for environmental decision making 

Felix Henkel, Jonathan Frank, Thomas Suesse, and Alexander Brenning

The expansion and optimisation of environmental monitoring networks requires the efficient use of limited resources to improve spatial predictions to ensure the protection of human health and ecosystems.

Network densification is a spatial sampling problem that is often addressed by pointwise prediction-uncertainty approaches, which ignore (1) the impact of a new site on its neighbourhood and (2) the binary decision task motivating the monitoring. Active learning (AL) is a machine learning technique that iteratively selects new locations based on the current maximum uncertainty in the available training data. We therefore recast network densification as an AL task and propose model-agnostic acquisition criteria, including a decision-aligned focal logit criterion that prioritises neighbourhoods whose exceedance probabilities lie near regulatory thresholds. A look-ahead criterion based on the expected reduction in prediction standard error (SE) is also examined. In a groundwater nitrate concentration case study, the focal logit criterion consistently selected more informative sites than traditional dispersion- or prediction-SE-based criteria, yielding up to 58% greater gains in exceedance-mapping accuracy (Cohen’s κ). Focal logit and SE criteria outperformed pointwise counterparts by ~45% on average, while the look-ahead criterion performed well but at much higher computational cost.
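One plausible reading of such a decision-aligned acquisition score (an illustrative sketch, not the authors' exact criterion): prioritise candidate sites whose neighbourhoods have predicted exceedance probabilities closest to the decision boundary, i.e. logits closest to zero.

```python
import numpy as np

def focal_logit_score(p_exceed, neighbours):
    """p_exceed: (n_sites,) exceedance probabilities; neighbours: list of
    index arrays giving each candidate's neighbourhood."""
    p = np.clip(p_exceed, 1e-6, 1 - 1e-6)
    logit = np.log(p / (1 - p))
    # high score = neighbourhood sits near the regulatory decision boundary
    return np.array([-np.mean(np.abs(logit[idx])) for idx in neighbours])

p = np.array([0.05, 0.45, 0.52, 0.90, 0.30])
nb = [np.array([0, 1]), np.array([1, 2]), np.array([2, 3]),
      np.array([3, 4]), np.array([4, 0])]
next_site = int(np.argmax(focal_logit_score(p, nb)))  # picks neighbourhood of 1-2
```

Averaging over a neighbourhood rather than a single point is what distinguishes this from the pointwise uncertainty criteria the abstract contrasts it with.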

The proposed framework is simple, generalisable to other environmental pollutants (such as air pollutants), and supports a transparent, decision-oriented monitoring design.

How to cite: Henkel, F., Frank, J., Suesse, T., and Brenning, A.: Geostatistical active learning for expanding monitoring networks for environmental decision making, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-1351, https://doi.org/10.5194/egusphere-egu26-1351, 2026.

EGU26-1406 | HS3.4

Bias-corrected pollution mapping with non-stationary geostatistics and spatial machine learning for environmental decision making: The case of groundwater nitrate 

Jonathan Frank, Thomas Suesse, S. Jiang, and Alexander Brenning

Decisions concerning the management of natural resources are often based on binary criteria that determine whether a specific environmental target is met or exceeded. A common example is the designation of “polluted” areas, where mitigation measures must be implemented once concentrations surpass a regulatory threshold. In practice, maps of such exceedances are commonly derived from regionalized concentration estimates. However, most conventional spatial interpolation and prediction procedures introduce systematic bias in the estimated extent of polluted areas.

To overcome this issue, we apply a bias-corrected mapping procedure that is compatible with any geostatistical or machine learning method capable of providing valid probability estimates. For the case study, we mainly focus on a trans-Gaussian regression-kriging (TRGK) framework, selected for its interpretability and transparent decomposition of predictions. To assess the potential added value of nonparametric approaches, we additionally compare TRGK with quantile regression forest (QRF) in a sub-region.

The TRGK model follows a structured, non-stationary design: (i) raw concentrations are transformed to log10 scale; (ii) a nationwide global linear model captures broad-scale relationships; (iii) major hydrogeological districts serve as units for local linear refinements to account for non-stationarity; (iv) residuals are transformed using a Gaussian anamorphosis; and (v) the transformed residuals are interpolated via ordinary kriging, from which probability estimates are derived. This setup improves flexibility while maintaining interpretability and coherent uncertainty quantification.
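A condensed sketch of steps (i)-(v), with simplifications flagged loudly: a single global linear model replaces the district-wise refinements, and a plain Gaussian residual assumption replaces the Gaussian anamorphosis. It assumes scikit-learn and PyKrige are available; all variable names and values are illustrative.

```python
import numpy as np
from scipy.stats import norm
from sklearn.linear_model import LinearRegression
from pykrige.ok import OrdinaryKriging

def trgk_exceedance(X, x, y, conc, Xq, xq, yq, threshold):
    z = np.log10(conc)                                  # (i) log10 transform
    lm = LinearRegression().fit(X, z)                   # (ii) trend model
    resid = z - lm.predict(X)                           # (iii)/(iv) simplified
    ok = OrdinaryKriging(x, y, resid, variogram_model="spherical")
    r_hat, r_var = ok.execute("points", xq, yq)         # (v) residual kriging
    z_hat = lm.predict(Xq) + np.asarray(r_hat)
    z_sd = np.sqrt(np.maximum(np.asarray(r_var), 1e-12))
    # probability that the true log10 concentration exceeds log10(threshold)
    return 1.0 - norm.cdf((np.log10(threshold) - z_hat) / z_sd)

rng = np.random.default_rng(8)
x, y = rng.uniform(0, 100, 40), rng.uniform(0, 100, 40)
X = np.column_stack([x, y])                 # trivial covariates for the demo
conc = 10 ** (1.0 + 0.01 * x + rng.normal(0, 0.2, 40))
p_exceed = trgk_exceedance(X, x, y, conc, X[:5], x[:5], y[:5], threshold=25.0)
```

The bias-correction step described next then amounts to replacing the naive 0.5 probability cut-off with a calibrated threshold on p_exceed.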

Bias correction is performed by estimating the total exceedance area implied by the data and determining a calibrated probability threshold that ensures an unbiased delineation of the polluted area. In this study, we jointly evaluate a threshold exceedance criterion and a temporal trend criterion.

Groundwater nitrate mapping at national scale represents a challenging test case due to strong non-normality, spatial heterogeneity, and pronounced non-stationarity. The approach nonetheless performs robustly. Linear model components exhibit R² values between 0.15 and 0.62, while semivariogram practical ranges vary from 0.3 to 22.3 km. In the sub-region comparison, QRF showed a small discrimination advantage over TRGK (AUC 0.86 vs. 0.82) but relied more heavily on calibration (underestimation without calibration: 94.9% vs. 5.1%).

Overall, the results demonstrate that the bias-corrected probability-based framework provides a flexible, robust and, when coupled with geostatistics, transparent solution for large-scale pollution mapping.

How to cite: Frank, J., Suesse, T., Jiang, S., and Brenning, A.: Bias-corrected pollution mapping with non-stationary geostatistics and spatial machine learning for environmental decision making: The case of groundwater nitrate, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-1406, https://doi.org/10.5194/egusphere-egu26-1406, 2026.

EGU26-1449 | HS3.4

New Proposal for maximum hydrological events fitting showing the ‘separation phenomenon’ with flexible TCEV Distribution 

J. Valdes-Abellan, L. Ta, and C. Yu

Hydrological extreme records in many regions of the world may include observations of different genesis and levels of extremeness, forming a characteristic “separation phenomenon” that limits the effectiveness of traditional distributions such as the Gumbel and log-Pearson Type III models; for such mixed extreme populations, the Two-Component Extreme Value (TCEV) distribution is better suited. However, conventional fitting approaches tend to emphasize the abundant ordinary data because of the scarcity of right-tail observations, which results in inaccurate predictions of high quantiles. Accurate representation of the upper tail (i.e., the high-value ranges of the cumulative distribution function, CDF) is nevertheless essential for flood risk evaluation and the design of hydraulic structures.

To address this issue, this study introduces a new TCEV fitting approach (SR-MWS) aimed at improving right-tail performance. In the new proposal, the dataset is first approximated using a piecewise regression with two linear segments, and the slope ratio between the two parts (R = S1/S2) is used to assess whether TCEV modeling is appropriate (if R > 1.5, the dataset is regarded as suitable for TCEV fitting). Next, three weighting strategies (linear, quadratic, and exponential) are applied sequentially to obtain the final TCEV parameters. A partitioned scoring framework is then used to select the most suitable weighting scheme, emphasizing the mid-to-upper CDF range F(x) ∈ [0.6, 1.0], which corresponds to return periods from about 2.5 years to more than 200 years, while also considering overall fit quality.

Our results show that the proposed method yields more accurate estimates for extreme values than conventional techniques and performs consistently for both peak-flow and precipitation datasets. Beyond hydrological applications, it provides an automated and robust tool for modeling extreme events and supporting risk assessment in fields characterized by mixed-population data with a pronounced dog-leg structure.
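A hedged sketch of the screening step (not the authors' code): fit two straight lines to the ordered annual maxima plotted against the Gumbel reduced variate, pick the breakpoint minimising total squared error, and form the slope ratio. The plotting position, the Gumbel-paper setting, and the assignment of which segment is "S1" are assumptions made for this illustration.

```python
import numpy as np

def slope_ratio(x):
    """Two-segment fit on Gumbel paper; returns R = S1/S2 (S1 = lower segment)."""
    xs = np.sort(x)
    n = len(xs)
    p = (np.arange(1, n + 1) - 0.44) / (n + 0.12)      # Gringorten positions
    u = -np.log(-np.log(p))                            # Gumbel reduced variate
    best = (np.inf, None)
    for k in range(3, n - 3):                          # candidate breakpoints
        s1, b1 = np.polyfit(u[:k], xs[:k], 1)
        s2, b2 = np.polyfit(u[k:], xs[k:], 1)
        sse = (np.sum((xs[:k] - (s1 * u[:k] + b1)) ** 2)
               + np.sum((xs[k:] - (s2 * u[k:] + b2)) ** 2))
        if sse < best[0]:
            best = (sse, s1 / s2)
    return best[1]

x = np.random.default_rng(4).gumbel(100.0, 20.0, 60)   # synthetic annual maxima
R = slope_ratio(x)
suitable_for_tcev = R > 1.5                            # screening rule from the text
```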

How to cite: Valdes-Abellan, J., Ta, L., and Yu, C.: New Proposal for maximum hydrological events fitting showing the ‘separation phenomenon’ with flexible TCEV Distribution , EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-1449, https://doi.org/10.5194/egusphere-egu26-1449, 2026.

EGU26-1813 | Orals | HS3.4

Residual error modelling for hourly streamflow predictions 

Cristina Prieto, Dmitri Kavetski, Fabrizio Fenicia, James Kirchner, David McInerney, Mark Thyer, and César Álvarez

Water plays a critical role in societal stability through both its excess and scarcity. Extreme hydrological events can cause substantial human and economic losses, while water scarcity affects essential services such as drinking water supply, food production, and hydropower generation. Reliable streamflow predictions are therefore fundamental for environmental assessments, flood risk management, and Integrated Water Resources Management (IWRM).

Hydrological models are central tools for understanding catchment behaviour and generating predictions to support water-resources assessment, planning, and management. However, their predictive performance strongly depends on the temporal resolution at which they are applied.

At hourly time scales, hydrological processes and associated uncertainties become markedly more complex, particularly in small and mesoscale catchments. Flood peaks may last only a few hours, so daily streamflow predictions can substantially underestimate peak magnitudes; antecedent wetness conditions can evolve rapidly; and the dominant processes controlling short-term streamflow dynamics differ from those governing longer-term behaviour. For example, over longer time scales, predictions are primarily constrained by mass balance, whereas short-term predictions depend more strongly on dynamics and flow routing.

In addition to classical sources of uncertainty related to data, model structure, and parameters, hourly streamflow predictions often exhibit bias, heteroscedasticity, temporal autocorrelation, and non-stationarity.

Despite their importance, hourly streamflow prediction and uncertainty characterisation have received comparatively less attention than daily-scale studies.

In this work, we use a conceptual hydrological model to generate deterministic hourly streamflow predictions and quantify predictive uncertainty using a residual error modelling framework. Case-study catchments include hydrologically diverse basins in Europe and the United States. Bias, heteroscedasticity, and temporal dependence in model residuals are addressed using Box–Cox transformations and autoregressive and moving average (ARMA) models.

Results indicate that a logarithmic transformation combined with an autoregressive model of order three (AR(3)) provides the most consistent performance across catchments. This work advances streamflow prediction by developing statistically rigorous methods for post-processing the residuals of conceptual hydrological models at the hourly time scale, supporting more reliable hourly streamflow predictions for integrated water resources management and decision-making.
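A minimal sketch of such a residual-error post-processor (assuming statsmodels is available; the fixed log offset stands in for the Box-Cox machinery, and all settings and synthetic data are illustrative, not the study's configuration):

```python
import numpy as np
from statsmodels.tsa.arima.model import ARIMA

rng = np.random.default_rng(5)
q_obs = np.exp(rng.normal(2.0, 0.5, 2000))           # synthetic 'observed' flow
q_sim = q_obs * np.exp(rng.normal(0.0, 0.2, 2000))   # synthetic model output

offset = 0.01
eta = np.log(q_obs + offset) - np.log(q_sim + offset)  # log-space residuals
ar3 = ARIMA(eta, order=(3, 0, 0)).fit()                # AR(3) residual model

# simulate residual realisations and map them back to flow space
n_rep = 100
sims = np.stack([ar3.simulate(nsimulations=len(eta)) for _ in range(n_rep)])
q_pred = (q_sim + offset)[None, :] * np.exp(sims) - offset
lo, hi = np.percentile(q_pred, [5, 95], axis=0)        # 90% predictive band
```

The autoregressive terms are what capture the strong hour-to-hour persistence of residuals that a daily-scale error model can usually ignore.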

How to cite: Prieto, C., Kavetski, D., Fenicia, F., Kirchner, J., McInerney, D., Thyer, M., and Álvarez, C.: Residual error modelling for hourly streamflow predictions, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-1813, https://doi.org/10.5194/egusphere-egu26-1813, 2026.

EGU26-3193 | ECS | Orals | HS3.4

Designing Sampling Strategies for the Efficient Estimation of Parameterized Spatial Covariance Models 

Olivia L. Walbert, Frederik J. Simons, Arthur P. Guillaumin, and Sofia C. Olhede

Spatial data in the Earth and environmental sciences acquired by instrument collection or simulation are constrained to finite, discrete, (ir)regular grids whose geometry is delineated by a boundary within which missingness, either random or structured, may exist. We model (ir)regularly sampled Cartesian spatial data as realizations of discrete two- and three-dimensional random fields whose covariance structure we estimate parametrically with a spectral-domain maximum-likelihood estimation strategy using the debiased Whittle likelihood, which efficiently counters the effects of aliasing and spectral leakage that arise from finite sampling and boundary effects. We work with the general, flexible Matérn class of covariance functions, which characterizes the shape of a field through three parameters that quantify its amplitude, smoothness, and correlation length. We quantify parameter covariance analytically and asymptotically based on the parametric model and sampling grid alone, agnostic of observed data. Our uncertainty quantification allows us to study how sampling geometry imparts uncertainty on a covariance model and provides a path for optimizing the design of a sampling grid to reduce error for an anticipated model. We formulate several approaches for interrogating our model residuals to interpret where real Earth data depart from the null hypotheses of Gaussianity, stationarity, and isotropy. We explore select case studies that demonstrate the broad applicability of our models across Earth science disciplines and develop software in MATLAB and Python for implementation by domain scientists, in hydrology, and elsewhere.
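For reference, one common parameterization of the Matérn class mentioned above (range conventions differ between implementations) is

C(r) = \sigma^2 \, \frac{2^{1-\nu}}{\Gamma(\nu)} \left(\frac{r}{\rho}\right)^{\nu} K_{\nu}\!\left(\frac{r}{\rho}\right),

where \sigma^2 controls the amplitude, \nu the smoothness, \rho the correlation length, and K_{\nu} is the modified Bessel function of the second kind.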

How to cite: Walbert, O. L., Simons, F. J., Guillaumin, A. P., and Olhede, S. C.: Designing Sampling Strategies for the Efficient Estimation of Parameterized Spatial Covariance Models, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-3193, https://doi.org/10.5194/egusphere-egu26-3193, 2026.

EGU26-4505 | HS3.4

Identifying Scale-dependent Spatial and Temporal Patterns in Earth Science Data: Lacunarity-based Techniques 

A. Roy and T. Mukerji

Spatial and temporal datasets that comprise distributions of events along a transect or timeline, together with their magnitudes, can display scale-dependent changes in persistence or anti-persistence that may contain signatures of underlying physical processes. Lacunarity is a technique originally developed for multiscale analysis of data; it characterizes the distribution of spaces or gaps in a pattern as a function of scale. In this study, we demonstrate how lacunarity may be modified to reveal scale-dependent changes in 1-dimensional data related to fractures, sedimentary layering and rainfall.

To address whether fractures found along a 1-dimensional transect (scanline) occur in clusters, we compare the lacunarity of given fracture-spacing data to a theoretical random lacunarity curve. Further, we introduce the concept of the 1st derivative of log-transformed lacunarity and demonstrate that this function can find the inter-cluster spacing and possible fractal behaviour over certain scales. We also demonstrate how the same technique may be applied to a time series, e.g., rainfall data, to test whether such events occur in clusters over certain time scales.

Next, the "event magnitudes" (e.g., fracture aperture) were added to each event data point (e.g., fracture), yielding a 1-dimensional non-binary dataset, and it was tested whether the dataset shows scale-dependent changes in terms of anti-persistence and persistence. The concept of the lacunarity ratio, LR, is introduced, which is the lacunarity of a given dataset normalized to the lacunarity of its random counterpart. This randomization, however, differs from the one used in the previous technique: in the case of our fracture dataset, for example, the random sequence is generated by leaving the locations of fractures unaltered and randomly reallocating the magnitudes along the dataset. It was demonstrated that LR can successfully delineate scale-dependent changes in terms of anti-persistence and persistence.

In addition to the fracture data already mentioned (spacing and apertures from NE Mexico), on which the technique was developed, it was applied to two further types of data: a set of varved sediments from the Marca Shale and a hundred-year rainfall record from Knoxville, TN, USA. While the fracture data showed anti-persistence at small scales (within clusters) and random behavior at large scales, the rainfall data and varved sediments both appear to be persistent at small scales, becoming random at larger scales. The striking similarity between the spatial sedimentary data and the time-dependent rainfall data is unsurprising, because in rock records the former is often considered a proxy for the latter. In general, such differences in scale-dependent behavior, from anti-persistent to random, persistent to random, or otherwise, may be related to differences in the physicochemical properties and processes contributing to multiscale datasets.
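For concreteness, a sketch of gliding-box lacunarity for a 1-D binary sequence in the style of Allain and Cloitre (an illustration of the general technique, not the authors' implementation): for each box size, slide a window along the series, record the box "mass", and form the ratio of the second moment to the squared first moment.

```python
import numpy as np

def lacunarity(seq, sizes):
    """Gliding-box lacunarity of a 1-D sequence for each box size in sizes."""
    seq = np.asarray(seq, float)
    out = []
    for r in sizes:
        masses = np.convolve(seq, np.ones(r), mode="valid")  # all gliding boxes
        out.append(np.mean(masses ** 2) / np.mean(masses) ** 2)
    return np.array(out)

rng = np.random.default_rng(6)
fractures = (rng.random(2048) < 0.05).astype(float)     # synthetic binary scanline
sizes = np.array([2, 4, 8, 16, 32, 64, 128])
lac = lacunarity(fractures, sizes)
lac_random = lacunarity(rng.permutation(fractures), sizes)  # shuffled reference
ratio = lac / lac_random                                 # LR-style comparison
slope = np.gradient(np.log(lac), np.log(sizes))          # 1st derivative of log-lacunarity
```

Comparing lac against its randomized counterpart, and inspecting the log-log slope, mirrors the random-reference and derivative-based diagnostics described in the text.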

How to cite: Roy, A. and Mukerji, T.: Identifying Scale-dependent Spatial and Temporal Patterns in Earth Science Data: Lacunarity-based Techniques, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-4505, https://doi.org/10.5194/egusphere-egu26-4505, 2026.

EGU26-5236 | Orals | HS3.4

Which grid points are statistically significant? Revisiting false discovery rate correction in geospatial data 

Michael Schutte, Leonardo Olivetti, and Gabriele Messori

Scientific publications in the geosciences routinely assess statistical significance in spatially distributed environmental and geophysical data. When statistical significance is indicated, it is most often assessed independently at each grid point, while formal adjustment for multiple testing is rarely applied. However, applying multiple testing corrections, such as the global false discovery rate (FDR) approach, is not always straightforward, as environmental and geophysical data are often spatially correlated.
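To make the baseline concrete, the sketch below applies the Benjamini-Hochberg FDR procedure to a field of per-gridpoint p-values (assuming statsmodels is available; the p-value field is a synthetic placeholder). Note that strong spatial correlation complicates the interpretation of exactly this kind of correction, which is the regime examined here.

```python
import numpy as np
from statsmodels.stats.multitest import multipletests

rng = np.random.default_rng(7)
pvals = rng.uniform(size=(90, 180)).ravel()   # placeholder pure-null p-value field

naive = pvals < 0.05                          # per-gridpoint testing, no correction
rejected, p_adj, _, _ = multipletests(pvals, alpha=0.05, method="fdr_bh")
print(naive.sum(), rejected.sum())            # naive flags ~5% false positives;
                                              # FDR flags essentially none
```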

In our work, we highlight how neglecting multiple testing correction can substantially inflate the number of false positives. We further show that commonly used FDR implementations can yield counterintuitive and potentially misleading results when applied to strongly spatially correlated data.

To illustrate the latter point, we provide an example based on near-surface air temperature composites following sudden stratospheric warmings. We first show that when anomalies are spatially coherent, restricting the spatial domain can increase the FDR-adjusted significance threshold. As a result, the same underlying field may display a larger share of statistically significant grid points solely due to domain selection. We analyze the origin of this behavior from a rank-based perspective and discuss its implications for spatial inference and uncertainty quantification in environmental sciences.

Based on these insights, we propose practical recommendations for robust and transparent significance assessment, such as spatially aggregated or spatially aware alternatives. Our results highlight both the need to account for multiple testing and the potential issues with naïve application and interpretation of FDR correction. While illustrated using atmospheric data, the findings are directly relevant to hydrology and other environmental sciences where statistical significance is assessed across spatial fields.

How to cite: Schutte, M., Olivetti, L., and Messori, G.: Which grid points are statistically significant? Revisiting false discovery rate correction in geospatial data, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-5236, https://doi.org/10.5194/egusphere-egu26-5236, 2026.

EGU26-7544 | ECS | Posters on site | HS3.4

Toward Stable Groundwater–Surface Water Coupling in Landscape Evolution Models 

Farshid Alizadeh, Raphael Bunel, Nicolas Lecoq, and Yoann Copard

Integrated landscape-evolution models require a groundwater component that is computationally efficient, remains stable over multidecadal simulations, and couples strongly with surface hydraulics and sediment transport. In CLiDE, which is built on CAESAR–Lisflood, the explicit forward-Euler groundwater update is simple, but as grid resolution or hydraulic diffusivity increases, it becomes highly restrictive due to the diffusion-type Courant–Friedrichs–Lewy (CFL) stability constraint. We present a redesign of CLiDE’s groundwater module that provides two complementary pathways: a behavior-preserving optimized explicit solver and a fully implicit formulation based on backward-Euler time integration. The implicit approach uses Picard iteration to address the nonlinearity of unconfined transmissivity and solves the resulting sparse symmetric positive-definite systems with a preconditioned conjugate-gradient solver. We benchmark both solvers over 25 years of fully coupled hydro-geomorphic experiments in the 104 km² Orgeval catchment in north-central France, using hourly and daily groundwater coupling intervals. The implicit solver achieves a catchment-scale water mass balance within 0.1% while remaining unconditionally stable at daily time steps and producing solutions comparable to the hourly implicit solution; groundwater head diagnostics are typically within 0.01 m of each other. The consistency in outlet hydrographs, inundation patterns, and long-term sediment-export behavior indicates that, in this case, the daily implicit coupling interval can be selected based on process time scales rather than numerical stability. Moreover, the optimized explicit solver accelerates the legacy scheme by a factor of 1.3 to 1.6 through refinements to specific algorithms, with no change in numerical outputs. Collectively, these advances enhance CLiDE’s capability for fully coupled, long-duration simulations and offer a choice between efficiency-oriented explicit updates and robustness-oriented implicit integration.
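For intuition on why the implicit pathway removes the time-step restriction: an explicit diffusion update is stable only for roughly \Delta t \le S\,\Delta x^2 / (2T), which shrinks quadratically under grid refinement, whereas backward Euler has no such bound. Below is a hedged 1-D sketch (not CLiDE's code) of one backward-Euler step with Picard-frozen transmissivity and a conjugate-gradient solve; all parameter values are illustrative.

```python
import numpy as np
from scipy.sparse import diags
from scipy.sparse.linalg import cg

def implicit_step(h, dt, dx, K, S, recharge, n_picard=10, tol=1e-8):
    """One backward-Euler step for an unconfined 1-D aquifer (no-flow ends)."""
    h_new = h.copy()
    for _ in range(n_picard):
        T = K * 0.5 * (h_new[:-1] + h_new[1:])      # face transmissivities, frozen
        w = T * dt / (S * dx ** 2)
        main = 1.0 + np.r_[w, 0.0] + np.r_[0.0, w]  # symmetric positive definite
        A = diags([-w, main, -w], offsets=[-1, 0, 1], format="csr")
        b = h + dt * recharge / S
        h_next, info = cg(A, b, x0=h_new)           # conjugate-gradient solve
        if np.max(np.abs(h_next - h_new)) < tol:    # Picard convergence check
            return h_next
        h_new = h_next
    return h_new

h0 = np.full(200, 10.0); h0[90:110] = 12.0          # initial groundwater mound
h1 = implicit_step(h0, dt=86400.0, dx=50.0, K=1e-4, S=0.1,
                   recharge=1e-8 * np.ones(200))    # one daily step, stable
```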

How to cite: Alizadeh, F., Bunel, R., Lecoq, N., and Copard, Y.: Toward Stable Groundwater–Surface Water Coupling in Landscape Evolution Models, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-7544, https://doi.org/10.5194/egusphere-egu26-7544, 2026.

EGU26-7835 | ECS | Orals | HS3.4

Reliable Predictive Resolution in Geospatial Modelling 

Meng Lu and Jiong Wang

High-resolution geospatial prediction and satellite image downscaling are increasingly enabled by advances in machine learning and the availability of fine-scale covariates. However, predicted maps are often delivered on arbitrary grids that are not justified by the sampling density of the observations. While uncertainty can be quantified at unobserved locations, the spatial scales over which predictions are actually supported by the data and the modelling process are typically not characterized. Beyond computational and storage costs, the consequences are critical: over-interpretation, the modelling of noise, and, most importantly, an apparent predictive resolution of spatial products that can mislead downstream applications and potentially affect scientific conclusions. An example is the use of predicted air pollution maps in health cohort studies to assess exposure–response relationships. This raises a fundamental but under-addressed question: what is the finest spatial resolution at which predictions are meaningfully supported by the data (and model)?

We investigate how to meaningfully determine the predictive resolution of regression models by linking sampling density and model parameters in the frequency domain through spectral analysis. Two challenges are (1) identifying the sampling density in the multi-dimensional feature space, where sampling typically becomes irregular, and (2) relating frequency in the feature space to spatial resolution. Using simulated and real-world geospatial datasets, we show that some arbitrarily selected output resolutions in the existing literature exceed the data-supported predictive resolution and can induce unnoticed biases or change-of-support issues in downstream analyses.
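
A crude spatial-domain analogue of this question, not the authors' spectral method, is to bound the supported resolution by the sampling density via a Nyquist-style argument; the coordinates below are synthetic.

    import numpy as np
    from scipy.spatial import cKDTree

    rng = np.random.default_rng(0)
    xy = rng.uniform(0, 100_000, size=(200, 2))   # observation coordinates [m]

    d, _ = cKDTree(xy).query(xy, k=2)             # column 1: nearest other point
    spacing = d[:, 1].mean()

    # sampling at a mean spacing L resolves wavelengths of roughly 2L or more
    # (Nyquist); prediction grids much finer than L mostly render model noise
    print(f"mean spacing {spacing:.0f} m -> resolvable wavelengths >= "
          f"{2 * spacing:.0f} m; grids finer than ~{spacing:.0f} m add little")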

How to cite: Lu, M. and Wang, J.: Reliable Predictive Resolution in Geospatial Modelling, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-7835, https://doi.org/10.5194/egusphere-egu26-7835, 2026.

Stochastic rainfall models are probabilistic tools able to simulate synthetic rainfall datasets with statistical properties that resemble those from observations, which makes them particularly suitable to assess the uncertainty of rainfall estimates and to conduct sensitivity analysis of hydro-meteorological modeling chains. When the focus of the modeling is on spatial and temporal patterns, models based on space-time Gaussian random fields (GRFs) are often used because they enable modeling rainfall at any point of the space-time domain from sparse and heterogeneous data (typically observations from a rain gauge network).

In this presentation I will explore how a new model of space-time, multivariate and non-stationary GRFs can be leveraged to improve stochastic rainfall modeling. A parametric transform function is combined with the GRF to account for rainfall intermittency and the skewed marginal distribution, which results in a so-called trans-Gaussian (or meta-Gaussian) model. Among the many applications enabled by this flexible trans-Gaussian model, I will examine how spatial non-stationarity can capture orographic effects, and how multivariate modeling can be used to embed rainfall in a stochastic weather generator covering five variables (rainfall, temperature, wind, solar radiation and humidity).
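
A minimal sketch of the trans-Gaussian construction, with illustrative parameters rather than those of the presented model: a stationary GRF is censored at a threshold to create intermittency and power-transformed to skew the positive intensities.

    import numpy as np

    rng = np.random.default_rng(2)
    x = np.linspace(0, 50, 60)                          # 1-D transect [km]
    C = np.exp(-np.abs(x[:, None] - x[None, :]) / 10.0) # exponential covariance
    z = np.linalg.cholesky(C + 1e-10 * np.eye(x.size)) @ rng.standard_normal(x.size)

    z0, beta, scale = 0.5, 1.8, 2.0                     # transform parameters (assumed)
    rain = np.where(z > z0, scale * (z - z0) ** beta, 0.0)  # zero-inflated, skewed field
    print(f"wet fraction {np.mean(rain > 0):.2f}, "
          f"mean wet intensity {rain[rain > 0].mean():.2f} mm/h")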

How to cite: Benoit, L.: Stochastic rainfall modeling using spatio-temporal, multivariate and nonstationary trans-Gaussian random fields, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-7961, https://doi.org/10.5194/egusphere-egu26-7961, 2026.

EGU26-9994 | ECS | Posters on site | HS3.4

Probabilistic mapping of groundwater nitrate pollution using a Bayesian Gaussian process model 

Kassandra Jensch, Márk Somogyvári, and Tobias Krüger

Nitrate groundwater pollution threatens the quality of drinking water and is directly linked to intensive fertiliser inputs on agricultural fields. To reduce pollution from agricultural sources, areas with, or at risk of, elevated nitrate concentrations must be designated as Nitrate Vulnerable Zones (NVZs) under the European Nitrates Directive. In Germany, as elsewhere in Europe, the designation of NVZs follows a binary classification scheme that does not account for uncertainties in the underlying data and interpolation method. We present an alternative geostatistical framework that explicitly introduces uncertainties into the established designation framework, enabling a more accurate assessment of nitrate groundwater pollution. Using a Bayesian Gaussian process model, nitrate concentrations in groundwater were predicted across the federal state of Brandenburg, Germany, where nitrate pollution is an acute problem. Our model specifically incorporates measurement errors as well as systematic biases from different observation types. The model allows for the calculation of exceedance probabilities, which provide a continuous representation of nitrate pollution risk across space relative to the legal nitrate limit of 50 mg/L. We show that the majority of agricultural land in the study area has at least a 50% probability of exceeding this limit. Additionally, measurement errors were identified as the main source of uncertainty in estimated nitrate concentrations, leading to relatively wide posterior predictive distributions. The results indicate that areas with high exceedance probability extend beyond currently designated NVZs. Unlike the established designation workflow, the proposed approach accounts for the complex reality and uncertainty of nitrate pollution in groundwater and can be readily extended to other countries in the EU and beyond. This enables a more robust and transparent designation of NVZs, and demonstrates the value of explicitly incorporating uncertainty into environmental modelling in high-profile policy settings.
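
A minimal sketch of the exceedance-probability step under a Gaussian posterior predictive distribution; the means and standard deviations below are illustrative, not the Brandenburg results.

    import numpy as np
    from scipy.stats import norm

    limit = 50.0                                  # legal nitrate limit [mg/L]
    mu = np.array([32.0, 48.0, 61.0])             # posterior predictive mean [mg/L]
    sd = np.array([12.0, 15.0, 20.0])             # posterior predictive std [mg/L]

    p_exceed = norm.sf(limit, loc=mu, scale=sd)   # P(NO3 > 50 mg/L) per grid cell
    print(np.round(p_exceed, 2))                  # cells with p >= 0.5 flag NVZ risk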

How to cite: Jensch, K., Somogyvári, M., and Krüger, T.: Probabilistic mapping of groundwater nitrate pollution using a Bayesian Gaussian process model, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-9994, https://doi.org/10.5194/egusphere-egu26-9994, 2026.

EGU26-12514 | ECS | Orals | HS3.4

How environmental conditions influence satellite detection of rainfall events 

Chun Zhou, Li Zhou, Luca Brocca, and Dui Huang

Precipitation serves as a critical link between climate and hydrology, with variability shaped by environmental factors that regulate satellite detection under complex conditions. The physical response mechanisms under varying temperature, soil moisture, and pressure remain insufficiently assessed. Using global gauge precipitation and ERA5-Land reanalysis data, we identified HIT, MISS, and FALSE events for two satellite products (GSMaP and IMERG) and examined their differential responses to key environmental variables. We demonstrate that HIT events tend to occur under intermediate environmental conditions, with both products sharing similar responses but GSMaP exhibiting slightly smoother temperature signals and IMERG stronger soil-moisture-related variability. MISS events, linked to colder, wetter backgrounds, are associated with larger spread, while FALSE events arise mainly in warm, dry regimes with low soil moisture and show more fluctuations in IMERG. Environmental factors modulate detection, with warmer and wetter conditions favoring HIT and suppressing FALSE events, while pressure plays a weaker, secondary role. These findings support satellite-based global hydrology and climate-resilience assessment.
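
A minimal sketch of the underlying event classification: paired gauge and satellite series are split into HIT, MISS and FALSE events with a wet/dry threshold; the 0.1 mm/h threshold and the synthetic series are assumptions.

    import numpy as np

    def classify(gauge, sat, wet=0.1):
        g, s = gauge >= wet, sat >= wet
        return {"HIT": int(np.sum(g & s)),      # both wet
                "MISS": int(np.sum(g & ~s)),    # gauge wet, satellite dry
                "FALSE": int(np.sum(~g & s))}   # satellite wet, gauge dry

    rng = np.random.default_rng(3)
    gauge = rng.gamma(0.3, 2.0, 1000) * (rng.uniform(size=1000) < 0.3)
    sat = gauge * rng.lognormal(0.0, 0.5, 1000) + rng.exponential(0.02, 1000)
    print(classify(gauge, sat))  # counts can then be binned by temperature, soil moisture, ...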

How to cite: Zhou, C., Zhou, L., Brocca, L., and Huang, D.: How environmental conditions influence satellite detection of rainfall events, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-12514, https://doi.org/10.5194/egusphere-egu26-12514, 2026.

EGU26-13518 | Orals | HS3.4

Trend or persistence: what are we really detecting in annual low-flow time series? 

Gregor Laaha and Johannes Laimighofer

Trends in annual low-flow time series are central to water resources and drought management, yet estimates are strongly affected by serial persistence, and dependence can make persistence appear as trend. We compare nonparametric and parametric methods under short-term autocorrelation and long-term persistence (LTP) and evaluate their reliability with European streamflow data and simulation-based experiments.

For short-term autocorrelation, modified Mann–Kendall approaches with block-bootstrap-based significance correction (BBSMK) and simultaneous bias-corrected prewhitening yield robust results; alternative variants inflate significance and produce implausible findings. Parametric ARIMAX models indicate that, when analyses are based on the water year, only a small share of series require higher autoregressive orders, whereas calendar-year aggregation induces more complex correlation structures and, in turn, unreliable (too low) significance rates.
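
A minimal sketch of a block-bootstrap-corrected Mann–Kendall test in the spirit of BBSMK (not the exact implementation): the observed statistic is compared against a null distribution from moving-block resampling, which preserves short-term autocorrelation while destroying any trend.

    import numpy as np

    def mk_stat(x):
        # Mann-Kendall S statistic
        n = x.size
        return sum(np.sign(x[j] - x[i]) for i in range(n - 1) for j in range(i + 1, n))

    def bbs_mk_pvalue(x, block=5, n_boot=500, seed=0):
        # two-sided p-value from a moving-block-bootstrap null distribution
        rng = np.random.default_rng(seed)
        s_obs, n = mk_stat(x), x.size
        s_null = np.empty(n_boot)
        for b in range(n_boot):
            starts = rng.integers(0, n - block + 1, n // block + 1)
            idx = np.concatenate([np.arange(s, s + block) for s in starts])[:n]
            s_null[b] = mk_stat(x[idx])
        return float(np.mean(np.abs(s_null) >= abs(s_obs)))

    rng = np.random.default_rng(1)
    ar1 = np.zeros(60)                        # AR(1) low-flow-like series, no trend
    for t in range(1, 60):
        ar1[t] = 0.5 * ar1[t - 1] + rng.normal()
    print(bbs_mk_pvalue(ar1))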

Under long-term dependence, the nonparametric Mann–Kendall–LTP approach markedly lowers the fraction of significant trends, while FARIMAX models (external trend + LTP) produce similar rates to BBSMK. Yet AIC-based selection typically replaces LTP with short-term autocorrelation, indicating that what appears as persistence is often explainable by short-range dependence.

We finally assess misclassification in parametric and nonparametric trend models under LTP using nature-based simulations across record lengths. Calibrated to stream-gauge records, the simulations test whether series with deterministic trends and short-term autocorrelation—but without true LTP—are misclassified as LTP, and how such misclassification biases trend estimates. Across four scenarios (high/low LTP × significant/non-significant trend), LTP misclassification and trend-detection errors are elevated: with a trend present, short-term autocorrelation is often mistaken for LTP, biasing estimates and reducing power. At hydrologically typical record lengths, errors remain substantial, declining only for extremely long series (1,000–10,000 years); misclassification of short-term correlation as LTP persists even then.

Overall, under common record lengths and dependence structures, deterministic trends are often misinterpreted as long-term persistence—and, conversely, genuine persistence can be mistaken for trend. Therefore, LTP-based trend analyses should be interpreted with caution; typical hydrological records are too short for reliable LTP inference.

How to cite: Laaha, G. and Laimighofer, J.: Trend or persistence: what are we really detecting in annual low-flow time series?, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-13518, https://doi.org/10.5194/egusphere-egu26-13518, 2026.

This study investigates multilevel flood susceptibility mapping at the national scale in North Macedonia, utilizing 328 historical flood events and 14 conditioning factors derived from a digital elevation model, simplified lithology, and computed direct runoff. The methodology integrates fuzzy set theory (Fuzzy), the analytic hierarchy process (AHP), weighted linear combination (WLC), and random forest (RF) approaches. The two-stage process employs distinct sets of conditioning factors in sequential flood susceptibility mapping: first, generating Fuzzy/AHP/WLC predictions and pseudo-absence data, and second, producing five RF predictions by varying pseudo-absences and binary cutoffs. Validation results indicate that the very high susceptibility class (0.8–1.0) of the Fuzzy/AHP/WLC model predicted 46.6% of flood pixels within 31.6% of the total area. In comparison, the very high susceptibility classes of the RF models predicted 88.5%, 78.3%, 60.6%, 48.5%, and 28.3% of flood pixels within 54.7%, 42.2%, 30.5%, 27.0%, and 25.1% of the total area, respectively. The RF models achieved area under the curve (AUC) values exceeding 0.850, with a maximum of 0.966. Furthermore, a standard deviation map derived from the RF models identified regions of high and low uncertainty, highlighting areas for potential methodological improvement and targeted sampling. The results demonstrate the promise of the multilevel approach for mapping flood susceptibility and motivate further research into its use in future studies and real-world applications.
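
A minimal sketch of the second-stage step, with synthetic data standing in for the real conditioning factors: flood presences and pseudo-absences train a random forest whose AUC is then evaluated.

    import numpy as np
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.metrics import roc_auc_score
    from sklearn.model_selection import train_test_split

    rng = np.random.default_rng(7)
    X_pres = rng.normal(0.8, 0.4, size=(328, 14))  # conditioning factors at flood sites
    X_abs = rng.normal(0.0, 0.4, size=(328, 14))   # pseudo-absences from low-susceptibility zones
    X = np.vstack([X_pres, X_abs])
    y = np.r_[np.ones(328), np.zeros(328)]

    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)
    rf = RandomForestClassifier(n_estimators=500, random_state=0).fit(X_tr, y_tr)
    print(f"AUC = {roc_auc_score(y_te, rf.predict_proba(X_te)[:, 1]):.3f}")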

How to cite: Gorsevski, P. and Milevski, I.: Multilevel flood susceptibility mapping by fuzzy sets, analytical hierarchy process, weighted linear combination and random forest, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-14278, https://doi.org/10.5194/egusphere-egu26-14278, 2026.

Spatial statistics provides a principled framework for analyzing environmental variables that exhibit spatial dependence, enabling inference and prediction in systems governed by heterogeneous processes. In many hydrogeological applications, the most informative perspective emerges from fusing complementary datasets, for example, sparse groundwater observations and spatially exhaustive remote sensing products. This data fusion is rarely straightforward because data sources often differ in sampling design, uncertainty, and, crucially, spatial support (the area or footprint represented by a measurement). When observations collected at one support are used to predict at another, the change-of-support problem can induce biased variances and degraded predictions if scale effects are ignored.

Here, we integrate groundwater levels from a monitoring network with multi-resolution remote sensing covariates to improve groundwater depth mapping while explicitly accounting for support differences. The study targets groundwater level prediction in Southeast Brazil, where relief compartments and land-use patterns generate strong spatial heterogeneity in recharge and water consumption. We combine in situ groundwater table depths observed at 56 monitoring locations with (i) geomorphological information derived from the 30 m TanDEM‑X dataset and (ii) land-surface water consumption represented by 10 m evapotranspiration estimates from SAFER (Simple Algorithm for Evapotranspiration Retrieving). These covariates encode terrain-driven controls and land-use effects that are not fully captured by point measurements alone. Spatial dependence within and across variables is modeled using the Linear Model of Coregionalization (LMC), enabling coherent estimation of direct and cross-variograms. To ensure consistency across supports, we address support homogenization by regularizing point-support variances and cross-structures to a common block support defined on the prediction grid. This regularized LMC is then used within a collocated block cokriging (CBCK) framework, which applies collocated block covariates to enhance block-scale groundwater predictions.

Model performance demonstrates substantial gains from explicitly treating the change of support and incorporating multi-resolution covariates. CBCK yields reliable groundwater depth predictions with a root mean squared error (RMSE) of 0.41 m, markedly outperforming ordinary block kriging (OBK) estimates (RMSE = 2.89 m) and improving upon prior CBCK implementations that relied on coarser (500 m) evapotranspiration inputs (RMSE = 0.49 m). Beyond accuracy improvements, the resulting maps better reflect the coupling between land-use water demand, terrain-driven controls, and groundwater levels, supporting groundwater management decisions relevant to agronomic planning and ecosystem sustainability. The proposed methodology is transferable to other aquifer systems and can be adapted to alternative remote sensing products and field measurements to explore climate, land use, and hydrogeology interactions across spatial scales.
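
A minimal sketch of the support-regularization idea: the point-support covariance is averaged over pairs of points discretizing two blocks to obtain the block-support covariance used by block (co)kriging; the exponential model and block geometry are illustrative.

    import numpy as np

    def cov_point(h, sill=1.0, rng_=300.0):
        # point-support exponential covariance model
        return sill * np.exp(-h / rng_)

    def cov_block(c1, c2, size=90.0, nd=4):
        # average point covariance between two square blocks centred at c1, c2
        g = (np.arange(nd) + 0.5) / nd * size - size / 2
        pts = np.array([(x, y) for x in g for y in g])
        p1, p2 = pts + c1, pts + c2
        h = np.linalg.norm(p1[:, None, :] - p2[None, :, :], axis=-1)
        return cov_point(h).mean()

    # the block variance is smaller than the point-support sill: the
    # regularization effect that must be honoured when mixing supports
    print(cov_point(0.0), cov_block(np.zeros(2), np.zeros(2)))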

How to cite: Lilla Manzione, R. and de Oliveira Ferreira Silva, C.: Multi-source data fusion to enhance groundwater levels prediction: merging monitoring networks and orbital remote sensing datasets, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-22108, https://doi.org/10.5194/egusphere-egu26-22108, 2026.

EGU26-671 | ECS | Orals | AS3.38

High-resolution measurement-based methane quantification from beef cattle feedlots to improve agricultural GHG inventories 

Sushree Sangita Dash, Trevor W. Coates, and Chandra A. Madramootoo

Methane (CH4) emissions from livestock production remain one of the largest and most uncertain components of national greenhouse gas inventories, largely because direct measurements at operational facilities are limited. This measurement gap constrains the accuracy of agricultural CH4 estimates and the development of effective mitigation strategies. Strengthening the empirical basis for these inventories is therefore essential. Emerging close-range tools, such as uncrewed aerial vehicle (UAV) plume-sampling systems, can enhance monitoring, reporting, and verification (MRV) by providing high-resolution, facility-level observations.

To evaluate this approach, we conducted a five-day field campaign at a commercial cattle feedlot in southern Alberta, Canada, housing approximately 28,000 cattle. UAV plume sampling was deployed alongside continuous CH4 measurements from an open-path laser (OPL) to estimate CH4 emission rates downwind of the facility. For both techniques, emission rates were derived using inverse dispersion modeling, enabling a direct comparison of performance and an assessment of the extent to which UAV-based sampling can complement established ground-based flux measurements.
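
Schematically, the inverse-dispersion step divides the observed downwind enhancement by a model-simulated dispersion factor; the numbers below are illustrative, not campaign results.

    n_head = 28_000            # cattle on feed
    d_c = 1.1e-5               # downwind CH4 enhancement [g m-3] (assumed)
    c_over_q = 1.2e-7          # simulated dispersion factor [(g m-3) / (g s-1)] (assumed)
    q = d_c / c_over_q         # facility emission rate [g s-1]
    print(f"{q * 86400 / n_head:.0f} g head-1 day-1")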

Uncrewed aerial vehicle-derived CH4 emission rates varied from 149 to 392 g head⁻¹ day⁻¹ (mean ± SE: 280 ± 22), in near-perfect agreement with OPL-derived emissions of 152–438 g head⁻¹ day⁻¹ (280 ± 22). Daily mean emissions differed by only 0.08% during overlapping sampling periods, and statistical distributions were highly consistent across methods. Hour-to-hour variability reflected transient atmospheric dynamics and associated changes in plume dispersion, rather than methodological bias. UAV flights also revealed spatial plume gradients not captured by the fixed OPL geometry, and consistent hourly emission estimates were found when UAV flights collected at least four usable plume samples per hour. Performance declined under very low-wind or highly turbulent conditions, clarifying key operational constraints for future deployments.

Overall, these findings demonstrate that UAV-based plume sampling can provide CH4 emission estimates consistent with established ground-based systems, providing a validated pathway for quantifying emissions from commercial feedlots. The approach aligns with the Integrated Global Greenhouse Gas Information System (IG3IS) good-practice principles and provides empirical information that can improve IPCC Tier 2 emission factors for open-lot beef operations.

How to cite: Dash, S. S., Coates, T. W., and Madramootoo, C. A.: High-resolution measurement-based methane quantification from beef cattle feedlots to improve agricultural GHG inventories, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-671, https://doi.org/10.5194/egusphere-egu26-671, 2026.

EGU26-2527 | ECS | Orals | AS3.38

Investigating Regional Halocarbon Emissions: The Seoul Tracer Release Experiment 

Michelle Jessy Müller, Martin K. Vollmer, Stephan Henne, Jaegeun Yun, Haklim Choi, Sunyoung Park, Lukas Emmenegger, and Stefan Reimann

Hydrofluorocarbons (HFCs) are used as refrigerants, propellants or insulating foams. They do not deplete the ozone layer like their predecessors, the (hydro)chlorofluorocarbons ((H)CFCs). However, HFCs are still potent greenhouse gases and are regulated under the Kyoto Protocol (1997) and, more recently, the Kigali Amendment to the Montreal Protocol, which targets reductions in HFC production and consumption over the coming decades [1, 2]. Observing halogenated substances in the atmosphere provides an independent means to verify compliance with these international treaties. From these observations, regional and global emission estimates can be obtained by combining them with atmospheric modelling or by using a reference tracer with known emissions [3, 4]. Due to rapid industrialization and high demand for refrigeration and air conditioning, the eastern Asian region contributes significantly to global HFC emissions. It is therefore crucial to understand the emission patterns in this region to assess global compliance.

We have conducted a large-scale controlled-release tracer experiment to estimate regional halocarbon emissions of the greater Seoul metropolitan area (South Korea). Ethyl fluoride (HFC-161) [5] and hexafluorobutene (HFO-1336mzzE), which are virtually absent in the background atmosphere, were released at one location in the City of Seoul. Release times were selected to align with favorable meteorological conditions that allowed air masses to reach the AGAGE station Gosan (Jeju Island, 490 km south of Seoul), which is equipped with an instrument for in-situ halocarbon measurements. Located intermediately along the path of air mass transport, sites at the Global Atmosphere Watch (GAW) Observatory Anmyeondo and Mokpo National University (138 km and 320 km from Seoul, respectively) were used for additional flask sampling. The atmospheric transport model FLEXPART [6] was used to forecast the tracer plume's trajectory and dispersion, and the release and sampling times were adjusted accordingly.

During two releases in November 2024 and April 2025, both tracers were detected at the flask sampling sites Anmyeondo GAW Observatory and Mokpo National University, as well as at Gosan station. The measurements show a strong correlation of our tracer substances with various HFCs. Preliminary emission estimates for the greater Seoul metropolitan area are derived using the tracer ratio method, and its limitations are discussed. Finally, a comparison to a full regional inversion, based on the continuous observations at Gosan, is conducted.
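
The tracer ratio method itself reduces to scaling the known tracer release rate by the slope of the correlated enhancements; a minimal sketch with illustrative numbers (molar masses for HFC-161 and, as an assumed target species, HFC-134a):

    import numpy as np

    q_tracer = 2.0                            # known tracer release rate [kg/h] (assumed)
    d_tracer = np.array([12.0, 20.0, 8.0])    # tracer enhancements above background [ppt]
    d_hfc = np.array([30.0, 55.0, 19.0])      # co-located target HFC enhancements [ppt]
    m_tracer, m_hfc = 48.06, 102.03           # molar masses [g/mol]: HFC-161, HFC-134a

    ratio = np.polyfit(d_tracer, d_hfc, 1)[0] # slope of the correlated enhancements
    q_hfc = q_tracer * ratio * m_hfc / m_tracer  # mass-based emission estimate [kg/h]
    print(f"estimated HFC emission: {q_hfc:.1f} kg/h")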

References

[1] Kyoto Protocol to the United Nations Framework Convention on Climate Change. adopted on December 11th, 1997; Kyoto, 1998, 1-22.

[2] Kigali Amendment to the Montreal Protocol on Substances that Deplete the Ozone Layer. adopted on October 15th, 2016; United Nations, Kigali.

[3] Matt Rigby, Sunyoung Park, Takuya Saito, Luke M. Western, Alison L. Redington, et al., Nature, 2019, 569 (7757), 546-550.

[4] Peter G. Simmonds, Matthew Rigby, Alistair J. Manning, Sunyoung Park, Kieran M. Stanley, et al., Atmospheric Chemistry and Physics 2020, 20 (12), 7271-7290.

[5] Dominique Rust, Martin K. Vollmer, Stephan Henne, Arnoud Frumau, Pim van den Bulk, et al., Nature, 2024, 633, 96-100.

[6] Ignacio Pisso, Espen Sollum, Henrik Grythe, Nina I. Kristiansen, Massimo Cassiani, et al., Geoscientific Model Development, 2019, 12 (12), 4955-4997.

How to cite: Müller, M. J., Vollmer, M. K., Henne, S., Yun, J., Choi, H., Park, S., Emmenegger, L., and Reimann, S.: Investigating Regional Halocarbon Emissions: The Seoul Tracer Release Experiment, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-2527, https://doi.org/10.5194/egusphere-egu26-2527, 2026.

EGU26-2643 | Orals | AS3.38

Unveiling Carbon Sequestration Dynamics in Bamboo Forests, China: An Observation-Based Approach Using Atmospheric Tracers 

Shuangxi Fang, Oksana Tarasova, Yanxia Li, Jocelyn Turnbull, Yi Lin, Gordon Brailsford, and Sara Mikaloff-Fletcher

Bamboo, a perennial grass species, exhibits rapid growth rates surpassing many native trees, offering substantial potential for atmospheric carbon capture and subsequent sequestration into durable products. Despite this promise, the carbon sequestration capacity of bamboo forests and its variability under different land management practices and environmental conditions remain underexplored. This study examines carbon sequestration in a representative bamboo forest in Anji, eastern China, employing a novel observation-based approach that utilizes measurements of multiple atmospheric tracers (CO₂, CO, and ¹⁴C-CO₂) to attribute fluxes accurately. The study also includes regular biomass inventories to enable comparison of CO2 fluxes between the two approaches. Departing from conventional inventory-based estimates of carbon emissions and uptakes, the observation-based method yields detailed insights into individual carbon-cycle processes within bamboo ecosystems and identifies the most effective tracers for quantifying regional CO₂ fluxes. Leveraging high-resolution atmospheric CO₂ observations, coupled with advanced modeling systems and analytical tools, including machine learning techniques to reconstruct and correct prior Net Ecosystem Exchange (NEE) fluxes for the bamboo forest, we derive carbon fluxes while accounting for variations in management strategies and environmental factors. These findings enhance our understanding of bamboo's role in global carbon mitigation, informing sustainable forestry practices and climate policy. This work highlights the transformative potential of tracer-based methodologies for precise, scalable carbon flux assessments in managed ecosystems.

The study is supported by the Quadrature Climate Foundation (Grant No. 01-21-000133).

How to cite: Fang, S., Tarasova, O., Li, Y., Turnbull, J., Lin, Y., Brailsford, G., and Mikaloff-Fletcher, S.: Unveiling Carbon Sequestration Dynamics in Bamboo Forests, China: An Observation-Based Approach Using Atmospheric Tracers, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-2643, https://doi.org/10.5194/egusphere-egu26-2643, 2026.

EGU26-2748 | Orals | AS3.38

Design, operation, and insights from Zurich’s mid- and low-cost ICOS Cities CO2 sensor network 

Lukas Emmenegger, Luce Creman, Andrea Fischer, Stuart K. Grange, Christoph Hüglin, Pascal Rubli, and Dominik Brunner

Zurich aims for net-zero direct greenhouse gas emissions by 2040, a target supported by 75 % of voters. Progress is tracked through a detailed CO2 inventory covering energy, transport, industry, and waste. Under the European ICOS Cities project, a monitoring program was launched using two approaches: (i) a network of mid- and low-cost CO2 sensors combined with atmospheric inverse modeling, and (ii) CO2 flux measurements from an eddy-covariance system on a city-center high-rise building, paired with footprint modeling.

Here, we focus on the mid-cost (ZiCOS-M) and low-cost (ZiCOS-L) NDIR (nondispersive infrared) CO2 networks, both of which have been operational since 2022, i.e. for at least three years.

ZiCOS-M consists of 26 monitoring sites, 21 in the city and 5 outside the urban area. Daily calibrations with two reference gas cylinders were performed, together with corrections for the sensors’ spectroscopic response to water vapour. Against parallel measurements with a high-precision reference gas analyser over periods of two weeks or more, the hourly mean root mean squared error (RMSE) was 0.98 ppm (0.46–1.5 ppm) and the mean bias ranged between −0.72 and 0.66 ppm. CO2 concentrations in the city were highly variable, with site means ranging from 434 to 460 ppm, and Zurich’s mean urban CO2 increment was 15.4 ppm above the regional background.

ZiCOS-L consists of 56 sites with paired sensors. The sensors require in-field training for model calibration before deployment and further post-processing steps to account for drift and outliers. After data processing, the hourly RMSE was 13.6±1.4 ppm, and the mean bias 0.75±1.67 ppm when validated against parallel reference measurements from ZiCOS-M. CO2 concentrations were highly variable with site means in Zurich ranging from 438 to 465 ppm, reflecting mainly the influence of sources in the nearby surroundings. Vegetation (mainly grassland) amplified the morning concentration on average in summer by up to 20 ppm due to ecosystem respiration, while heavy traffic increased the morning rush hour concentration by 15 ppm. Despite its lower measurement accuracy, the ZiCOS-L network enables the study of concentration dynamics at a spatial and temporal scale that is not accessible by any other means.

The ZiCOS-M data was extensively used to derive top-down CO2 emissions. Similar modelling activities are currently ongoing with the ZiCOS-L data, and both are compared to emissions derived from the eddy covariance system and to the city's emission inventory.


Grange SK, … Emmenegger L, The ZiCOS-M CO2 sensor network: measurement performance and CO2 variability across Zurich. https://doi.org/10.5194/acp-25-2781-2025.

Creman L, … Bernet L, The Zurich Low-cost CO2 sensor network (ZiCOS-L): data processing, performance assessment and analysis of spatial and temporal CO2 dynamics. https://doi.org/10.5194/egusphere-2025-3425

Brunner D, … Emmenegger L, Building-resolving simulations of anthropogenic and biospheric CO2 in the city of Zurich with GRAMM/GRAL. https://doi.org/10.5194/acp-25-14279-2025.

Hilland R, … Christen A, Sectoral attribution of greenhouse gas and pollutant emissions using multi-species eddy covariance on a tall tower in Zurich, Switzerland. https://doi.org/10.5194/acp-25-14279-2025.

Ponomarev N, … Brunner D, Estimation of CO2 fluxes in the cities of Zurich and Paris using the ICON-ART CTDAS inverse modelling framework. https://doi.org/10.5194/egusphere-2025-3668.

How to cite: Emmenegger, L., Creman, L., Fischer, A., Grange, S. K., Hüglin, C., Rubli, P., and Brunner, D.: Design, operation, and insights from Zurich’s mid- and low-cost ICOS Cities CO2 sensor network, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-2748, https://doi.org/10.5194/egusphere-egu26-2748, 2026.

EGU26-2817 | ECS | Orals | AS3.38

Concurrent data assimilation of methane concentrations and fluxes  

Niklas Becker, Niels Heinrich Keil, Valentin Bruch, and Andrea Kaiser-Weiss

We use atmospheric inverse modelling to provide observation-based estimates of methane emissions at the national scale in Europe. We apply the numerical weather prediction model ICON-ART to obtain an ensemble of methane concentrations by varying the meteorology, the lateral boundary conditions and the emission fields. Comparing against ground-based observations from the ICOS network, we employ a 4D LETKF to assimilate concentrations and emissions concurrently. We create the emission ensemble in two ways: by perturbing the underlying emission field with a Gaussian random field, or by separating it into regions and economic sectors and scaling these. We compare the two approaches and the resulting emission estimates to national greenhouse gas inventories and synthesis-inversion results, with a focus on Germany. First results are presented for 2021, and we identify a considerable mismatch with the reported emissions in central Europe.
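
A schematic ensemble Kalman analysis step on a joint concentration-flux state (the operational system uses a localized 4D LETKF with a deterministic anomaly update; the dimensions and data below are illustrative):

    import numpy as np

    rng = np.random.default_rng(0)
    n_ens, n_conc, n_flux, n_obs = 40, 80, 20, 10
    X = rng.normal(size=(n_conc + n_flux, n_ens))  # joint state: concentrations + flux scalings
    H = np.zeros((n_obs, n_conc + n_flux))         # observe the first n_obs concentrations
    H[np.arange(n_obs), np.arange(n_obs)] = 1.0
    y = rng.normal(size=n_obs)                     # station observations
    R = 0.25 * np.eye(n_obs)                       # observation-error covariance

    Xp = X - X.mean(axis=1, keepdims=True)         # ensemble anomalies
    Yp = H @ Xp                                    # anomalies in observation space
    K = Xp @ Yp.T @ np.linalg.inv(Yp @ Yp.T + (n_ens - 1) * R)  # ensemble Kalman gain
    xa = X.mean(axis=1) + K @ (y - H @ X.mean(axis=1))          # analysis mean
    # the flux rows of K couple fluxes to observed concentrations through the
    # ensemble covariance; localization and the anomaly update are omitted here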

How to cite: Becker, N., Keil, N. H., Bruch, V., and Kaiser-Weiss, A.: Concurrent data assimilation of methane concentrations and fluxes , EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-2817, https://doi.org/10.5194/egusphere-egu26-2817, 2026.

EGU26-3366 | Posters on site | AS3.38

Characteristics of CO2 and CH4 from different emission sources using mobile measurements and stable carbon isotope analysis 

Hyeongseok Choi, Jongbyeok Jun, Sunran Lee, Sumin Kim, and Yongjoo Choi

Achieving effective greenhouse gas (GHG) mitigation policy requires accurate quantification of the contribution from each emission source based on in-situ measurements. In this study, we investigated the spatial distribution of CO2 and CH4 emitted from different sources by conducting mobile measurements with a GLA331-GGA analyzer (ABB–LGR Inc.) mounted on a vehicle. We conducted seven mobile measurements in spring (N = 3), summer (N = 2), and fall (N = 2) over the Seoul Metropolitan Area (SMA) in 2025. By comparing the correlation between the two GHGs across emission sources, we selected representative sites including livestock facilities (cattle and swine barns), industrial complexes, urban areas, wastewater treatment plants, LNG power plants, and rural areas. Background GHG concentrations were defined as the daily 5th percentile for each measurement day, and correlations between enhancements (ΔCO2 and ΔCH4) were assessed. Along with the real-time measurements, stable carbon isotope samples were collected to provide complementary constraints on concentration variability and the end-member contributions of each emission source. For the isotope measurements, two ambient air samples were collected per site using canisters (Entech, Simi Valley, CA, USA) and analyzed with a Picarro G2131-i for δ13C–CO2 (δ13C) and a Picarro G2132-i for δ13C–CH4 (δ13CH4). Strong co-variability between the two GHGs was observed at several emission sources and seasons, including springtime cattle barns (R = 0.75), LNG power plants (R = 0.83), industrial complexes (R = 0.74), and swine barns (R = 0.64); summertime cattle barns (R = 0.66) and LNG power plants (R = 0.67); and fall industrial complexes (R = 0.70) and cattle barns (R = 0.97). These correlations suggest that CO2 and CH4 were likely emitted concurrently from shared sources or similar emission activities in the SMA region. The observed δ13C values ranged from −8.2‰ to −12.5‰, while δ13CH4 ranged from −47.2‰ to −48.6‰. Seasonal mean δ13C values were −11.2‰ in spring, −9.2‰ in summer, and −10.1‰ in fall, consistent with a summertime influence from enhanced biospheric respiration, with the most depleted values occurring in spring. In contrast, δ13CH4 exhibited relatively small seasonal variability, as indicated by the coefficient of variation (sd/mean; 0.004 in spring, 0.013 in summer, and 0.012 in fall), but still provided useful constraints on source attribution. In addition, a Bayesian isotope mixing model (the ‘simmr’ package in R) was applied to quantify relative source contributions, indicating that coal combustion contributed most strongly to δ13C, whereas wastewater treatment and natural gas were the dominant contributors to δ13CH4.
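
As a simplified illustration of the attribution logic (the study itself uses the Bayesian 'simmr' mixing model), a two-end-member isotope mass balance with assumed end-member values:

    def mix_fraction(d_mix, d_a, d_b):
        # fraction of source A in a two-source linear mixture of delta values
        return (d_mix - d_b) / (d_a - d_b)

    # illustrative end-members for a CH4 enhancement: microbial (waste,
    # livestock) ~ -60 permil vs thermogenic (natural gas) ~ -40 permil
    f_microbial = mix_fraction(d_mix=-48.0, d_a=-60.0, d_b=-40.0)
    print(f"microbial share of the enhancement: {f_microbial:.0%}")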

How to cite: Choi, H., Jun, J., Lee, S., Kim, S., and Choi, Y.: Characteristics of CO2 and CH4 from different emission sources using mobile measurements and stable carbon isotope analysis, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-3366, https://doi.org/10.5194/egusphere-egu26-3366, 2026.

EGU26-3437 | ECS | Orals | AS3.38

Development of an Ensemble-Based Data-Assimilation System for CO2 Fluxes Using ICON-ART 

Jakob Böttcher, Niklas Becker, Andrea Kaiser-Weiss, and Maya Harms

Observation-based quantification of surface CO2 fluxes relies on the consistent integration of atmospheric observations with numerical transport models. We present the development and demonstration of an ensemble-based data assimilation system that couples atmospheric CO2 observations to the ICON-ART modeling framework using a Local Ensemble Transform Kalman Filter (LETKF).


Starting from a flux estimate provided by CarbonTracker Europe High-Resolution, we build a dynamic model with hourly resolution, focusing on fluxes in Europe for 2021. We then create an ensemble of perturbed prior fluxes within assumed uncertainties, using prescribed spatial and temporal correlation structures. We simulate the transport of these ensemble members in ICON-ART in limited-area mode, while varying the meteorological conditions to represent meteorological uncertainties. Subsequently, we use the LETKF to update the state vector of concentrations and CO2 fluxes daily, resulting in a posterior estimate of surface CO2 fluxes over Europe.

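A minimal sketch of the ensemble generation: each member multiplies the prior flux field by a spatially correlated Gaussian random field sampled from a Cholesky factor; the grid size, correlation length and spread are illustrative.

    import numpy as np

    rng = np.random.default_rng(0)
    n = 30                                             # flux grid is n x n cells
    xx, yy = np.meshgrid(np.arange(n), np.arange(n))
    pts = np.column_stack([xx.ravel(), yy.ravel()])
    h = np.linalg.norm(pts[:, None] - pts[None, :], axis=-1)
    C = np.exp(-h / 8.0)                               # exponential spatial correlation
    L = np.linalg.cholesky(C + 1e-8 * np.eye(n * n))

    prior = np.ones(n * n)                             # flattened prior flux field
    members = [prior * (1.0 + 0.3 * (L @ rng.standard_normal(n * n)))
               for _ in range(40)]                     # 40 perturbed prior members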

This work provides the foundation for an ICON-ART-based CO2 flux assimilation system and establishes a technical basis for future extensions toward longer assimilation periods, refined error modeling, and the assimilation of anthropogenic emission signals.

How to cite: Böttcher, J., Becker, N., Kaiser-Weiss, A., and Harms, M.: Development of an Ensemble-Based Data-Assimilation System for CO2 Fluxes Using ICON-ART, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-3437, https://doi.org/10.5194/egusphere-egu26-3437, 2026.

In this research, we propose a simple and effective method for gas analysis in the semiconductor and display industries. To this end, a residual gas analyzer (RGA) was adopted, focusing on two high global warming potential (GWP) gases commonly used in industrial applications, CF4 and NF3. The experiment was conducted in four key steps: identifying gas species using optical emission spectroscopy (OES), calibrating the RGA with a quadrupole mass spectrometer (QMS), constructing a five-point calibration graph to correlate RGA and Fourier-transform infrared spectroscopy (FT-IR) data, and estimating the concentration of unknown samples using the calibration graph. The results under plasma-on conditions demonstrated good correlation and accuracy, confirming the reliability of our approach: the method effectively captured the relationship between RGA intensity and gas concentration, providing valuable insight into concentration trends. Our approach thus serves as a useful tool for estimating gas concentrations and understanding the correlation between RGA intensity and gas composition.
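
A minimal sketch of the calibration step: a linear fit of RGA intensity against FT-IR reference concentrations, inverted for an unknown sample; all numbers are illustrative.

    import numpy as np

    rga = np.array([0.8, 1.9, 4.1, 8.0, 16.2])   # RGA intensity [arb. u.] (assumed)
    ftir = np.array([50, 100, 200, 400, 800.0])  # FT-IR concentration [ppm] (assumed)

    slope, intercept = np.polyfit(rga, ftir, 1)  # five-point linear calibration
    unknown = 6.3                                # RGA reading of an unknown sample
    print(f"estimated CF4 concentration: {slope * unknown + intercept:.0f} ppm")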


Reference

[1] B. G. Jeong, S. H. Park, D. H. Goh, and B. J. Lee, Metrology 5 (2025) 60

How to cite: Jeong, B. G.: Real-Time Monitoring and Quantification of Fluorinated Greenhouse Gases in Semiconductor/Display Manufacturing, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-4566, https://doi.org/10.5194/egusphere-egu26-4566, 2026.

The semiconductor and display industries are significant sources of fluorinated greenhouse gas (F-GHG) emissions in the electronics sector, making accurate emission estimation essential for addressing climate change. The Republic of Korea, a leading country in these industries, requires a precise evaluation of their environmental impact given its global competitiveness. Currently, the Republic of Korea relies on the default emission factors of the 2006 IPCC guidelines for estimating F-GHG emissions. However, this approach does not account for the latest mitigation technologies implemented in the country, resulting in a conservative overestimation of actual F-GHG emissions. To address this issue, this study conducted direct measurements of F-GHG emissions from semiconductor manufacturing processes in facilities equipped with advanced mitigation technologies. Employing state-of-the-art measurement methods, the study evaluated the gas use rate (Ui) and the by-product gas generation rate (Bi) and compared the results with the default emission factors provided by the 2006 and 2019 IPCC guidelines. Moreover, based on the derived country-specific emission factors (Tier 3b), GHG emissions were estimated and compared with tier-based methodologies using the 2006 and 2019 IPCC default factors (Tier 2a, 2b, 2c and 3a). The findings highlight the need to develop country-specific emission factors and contribute to the establishment of precise, data-driven policies for reducing GHG emissions in the Republic of Korea's electronics industry. Furthermore, this research serves as a valuable reference for other countries aiming to refine their emission estimates with country-specific data and technological advancements, ultimately contributing to global efforts towards carbon neutrality.
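
Schematically, Tier-2-style electronics emissions scale gas consumption by the unused fraction and by abatement, as sketched below; all parameter values (and the CF4 GWP100 of roughly 7,380 from IPCC AR6) are illustrative assumptions, not the measured Korea-specific factors.

    fc = 1000.0        # CF4 fed into the process [kg/yr] (assumed)
    u = 0.6            # use rate Ui: fraction destroyed or transformed in-process
    a, d = 0.9, 0.95   # fraction routed to abatement, destruction efficiency
    gwp_cf4 = 7380.0   # approximate AR6 GWP100 of CF4

    e_cf4 = fc * (1 - u) * (1 - a * d)   # pass-through emissions [kg/yr]
    print(f"{e_cf4:.0f} kg CF4/yr = {e_cf4 * gwp_cf4 / 1000:.0f} t CO2-eq/yr")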

How to cite: Inkwon, J. and Bong-Jae, L.: Comparative Analysis of F-GHGs Emission Estimates between IPCC Default Factors and Measurement-based Korea-specific Emission Factors in Semiconductor Manufacturing, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-4570, https://doi.org/10.5194/egusphere-egu26-4570, 2026.

EGU26-5198 | ECS | Orals | AS3.38

Monitoring urban atmospheric CO2 plumes from space: sensitivity to urban physics and scale effects over Paris 

Alohotsy Rafalimanana, Thomas Lauvaux, Charbel Abdallah, Mali Chariot, Michel Ramonet, Josselin Doc, Olivier Laurent, Morgan Lopez, Anja Raznjevic, Maarten Krol, Leena Järvi, Leslie David, Olivier Sanchez, Andreas Christen, Dana Looschelders, Laura Bignotti, Benjamin Loubet, Sue Grimmond, and William Morrison

Quantifying urban CO2 emissions from space can be approached using different methodologies, including direct plume-based analyses, but combining satellite observations with atmospheric transport models requires the ability to realistically reproduce fine-scale spatial gradients over cities. Using the Grand Paris area as a testbed, we investigate the sensitivity of simulated near-surface CO2 concentrations to urban physics parameterization and horizontal resolution within the WRF-Chem modeling framework coupled to a high-resolution fossil fuel emission inventory. At mesoscale resolution (900 m), a hierarchy of urban representations ranging from simulations without urban physics to multi-layer urban canopy models is evaluated, showing that the Building Energy Model (BEM) provides the most physically consistent simulation of surface energy fluxes, boundary-layer development, and near-surface CO2 variability. Building on this configuration, we compare mesoscale simulations with Large-Eddy Simulation (LES) runs at 300 m and 100 m resolution. Model results are evaluated against dense urban CO2 observations from the high-precision Picarro network, a complementary mid-cost sensor network from ICOS Cities, and surface sensible and latent heat flux observations from the ICOS ETC Level-2 flux data product. An extensive urban observation network, including wind lidars and ceilometers from the Urbisphere project, provides an exceptional constraint for the evaluation of boundary-layer structure and vertical mixing at fine scales. The LES simulations substantially enhance the representation of spatial heterogeneity and localized CO2 enhancements associated with major emission sources, which are smoothed or underestimated at mesoscale resolution. However, increased resolution also amplifies sensitivity to local wind fields and emission inventory uncertainties. These results highlight that both urban physics and model resolution critically shape the ability of transport models to reproduce observed urban CO2 gradients.

How to cite: Rafalimanana, A., Lauvaux, T., Abdallah, C., Chariot, M., Ramonet, M., Doc, J., Laurent, O., Lopez, M., Raznjevic, A., Krol, M., Järvi, L., David, L., Sanchez, O., Christen, A., Looschelders, D., Bignotti, L., Loubet, B., Grimmond, S., and Morrison, W.: Monitoring urban atmospheric CO2 plumes from space: sensitivity to urban physics and scale effects over Paris, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-5198, https://doi.org/10.5194/egusphere-egu26-5198, 2026.

EGU26-5426 | ECS | Orals | AS3.38

Quantifying Agricultural Methane Emissions Using Satellite Observations 

Mengyao Liu, Ronald van der A, Michiel van Weele, Elefttherios Ioannidis, Ruoqi Liu, Zichong Chen, and Jieying Ding

Methane (CH₄) is the second most important greenhouse gas after CO₂, and its emissions from the agricultural sector, particularly rice paddies and dairy farms, remain highly uncertain and challenging to quantify. While recent advancements in satellite technology, such as high spatial resolution instruments, have enabled the detection of methane sources from global to facility scales, agricultural emissions still pose challenges. These emissions are typically diffuse and area-like, making them less detectable by targeted satellites like GHGSat and EMIT, which are better suited for isolated point sources such as oil/gas facilities or landfills. Additionally, agricultural emissions exhibit significant spatiotemporal variability driven by climate conditions, water management practices in rice paddies, and differences in farm types.

Within ESA's AGATE project, we apply an improved divergence method to estimate monthly methane emissions from TROPOspheric Monitoring Instrument (TROPOMI) satellite observations at a 0.1° grid resolution. We focus on major agricultural regions, including the Po Valley in Italy as well as India and Bangladesh, over the period 2019–2022. To better isolate agricultural emissions, we separate area-like sources (e.g., rice paddies) from isolated point sources. The locations of identified large emitters are cross-validated using bottom-up emission inventories and targeted satellite observations (e.g., EMIT, Carbon Mapper) to minimize the influence of non-agricultural sources. Furthermore, to better understand the seasonality of methane emissions, we analyze correlations between methane emission variations and auxiliary datasets, such as rice paddy maps and satellite-derived ammonia emissions.
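
A minimal sketch of the divergence method on a synthetic grid: emissions are estimated as the divergence of the horizontal flux (wind times column enhancement); the sink term and the corrections of the improved method are omitted.

    import numpy as np

    dx = 11_000.0                                 # ~0.1 deg in metres at mid-latitudes
    rng = np.random.default_rng(10)
    omega = rng.gamma(2.0, 1.0, size=(50, 50))    # CH4 column enhancement [kg/m2]
    u = np.full_like(omega, 4.0)                  # zonal wind [m/s]
    v = np.full_like(omega, 1.0)                  # meridional wind [m/s]

    fx, fy = u * omega, v * omega                 # horizontal flux components
    div = np.gradient(fx, dx, axis=1) + np.gradient(fy, dx, axis=0)
    emis = div                                    # source field [kg/m2/s]; averaging
                                                  # many overpasses gives monthly maps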

How to cite: Liu, M., van der A, R., van Weele, M., Ioannidis, E., Liu, R., Chen, Z., and Ding, J.: Quantifying Agricultural Methane Emissions Using Satellite Observations, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-5426, https://doi.org/10.5194/egusphere-egu26-5426, 2026.

EGU26-5912 | ECS | Posters on site | AS3.38

Investigating Germany’s progress in decoupling air pollution emissions from economic activity using satellite-based measurements of NO₂. 

Erika Remy, Rosina Engert, Laurenz Werner, and Michael Bittner

In efforts to mitigate the effects of global climate change, several prominent policies and guidelines emphasizing the importance of sustainable growth have been introduced in recent years. Examples include the 2019 European Green Deal and the subsequent Clean Industrial Deal of 2025. A key aspect of these goals is the reduction of air pollutant emissions, particularly from fossil fuel combustion, without sacrificing economic growth. The Green Deal commits to an EU-wide emission reduction of at least 55% by 2030 compared to 1990 levels. Remote sensing offers many advantages for tracking progress towards the reduction of pollutant emissions; in particular, its global coverage allows the analysis of regions that lack sufficient ground-based measurement networks. This study presents a method that uses spectral analysis of tropospheric NO2 column density together with the gross domestic product (GDP) to track and compare the progress of the German federal states towards decoupling emissions from economic growth. Most studies evaluating economic decoupling focus on CO2 or CO2 equivalents; studies investigating other key combustion products are currently lacking. This study focuses on NO2 as a proxy for emissions related to economic activity: NO2 originates primarily from anthropogenic combustion sources and has a short tropospheric lifetime, making it suitable to represent localized fossil fuel emissions. The NO2 measurements used in this study are obtained from the Ozone Monitoring Instrument (OMI), launched aboard the NASA Aura satellite in 2004. The application of spectral analysis techniques, such as wavelet analysis, gives additional insight into the temporal variability of NO2 and thus into the decoupling path of each region. Decoupling between GDP and NO2 variability is observed for all regions of Germany in the period between the two most recent global economic recessions (the 2008 financial crisis and the COVID-19 pandemic). Similar decreasing trends are observed for both the yearly average tropospheric column density and the calculated yearly variability. The variability obtained from the wavelet analysis shows greater sensitivity to changes in NO2 emissions than the absolute tropospheric column density. Regional differences, such as the main economic sectors and the types of emission regulations in place, are discussed to contextualize differences in the decoupling processes between the federal states. Overall, NO2 variability is found to be a sensitive and effective indicator for tracking and comparing decoupling progress across administrative regions.
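
A minimal sketch of the variability diagnostic: a Morlet continuous wavelet transform (using PyWavelets) of a synthetic monthly NO2 series, with yearly variability taken as the mean wavelet power near the annual scale.

    import numpy as np
    import pywt

    t = np.arange(240)                            # 20 years of monthly values
    no2 = (8 - 0.01 * t                           # slow decline in the mean
           + 2 * np.sin(2 * np.pi * t / 12) * (1 - t / 400)   # weakening annual cycle
           + np.random.default_rng(11).normal(0, 0.3, t.size))

    scales = np.arange(2, 64)
    coefs, freqs = pywt.cwt(no2, scales, 'morl', sampling_period=1.0)  # freq in 1/month
    annual = np.abs(freqs - 1 / 12).argmin()      # scale closest to the annual cycle
    power = np.abs(coefs[annual]) ** 2            # annual-cycle power over time
    yearly_var = power.reshape(20, 12).mean(1)    # one variability value per year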

How to cite: Remy, E., Engert, R., Werner, L., and Bittner, M.: Investigating Germany’s progress in decoupling air pollution emissions from economic activity using satellite-based measurements of NO₂., EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-5912, https://doi.org/10.5194/egusphere-egu26-5912, 2026.

EGU26-7798 | ECS | Posters on site | AS3.38

Low latency and high resolution GHG emission estimates to support monitoring and modelling activities in Spain 

Oliver Legarreta, Paula Castesana, Ivan Lombardich, Carles Tena, Carmen Piñero-Megías, Artur Viñas, Johanna Gehlen, Luca Rizza, Carlos Pérez García-Pando, and Marc Guevara Vilardel

Reliable and timely information on greenhouse gas (GHG) emissions is essential for evaluating mitigation policies and supporting data assimilation and verification modelling frameworks. In this contribution, we present the sPanisH EmissioN mOnitoring systeM for grEeNhouse gAses (PHENOMENA), a low-latency GHG modelling framework developed within the RESPIRE-CLIMATE Spanish national project, which received formal endorsement from the WMO-IG3IS initiative.

PHENOMENA provides harmonised daily, high spatial resolution (up to 1 km × 1 km) CO2 and CH4 emissions for the main combustion-related sectors, including electricity generation, manufacturing industry (cement and iron and steel), residential and commercial combustion, road transport, shipping and aviation. The system estimates CO2 and CH4 emissions by combining low-latency activity data with fuel- and process-dependent emission factors through bottom-up and downscaling approaches. The data collected and pre-processed include hourly near-real-time traffic counts from the national road network, hourly electricity production data reported by individual power plants, daily Copernicus ERA5-Land surface temperature, monthly industrial production statistics and AIS (Automatic Identification System) data, among others.
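
For one sector, the bottom-up step reduces to activity times emission factor; a minimal sketch with hypothetical counts, fleet shares and factors (none of these values come from PHENOMENA):

    import numpy as np

    counts = np.random.default_rng(12).poisson(800, size=24)  # vehicles/h at one counter
    km_per_link = 1.2                                         # represented road length [km]
    ef = {"petrol": 0.170, "diesel": 0.160}                   # kg CO2 per vehicle-km (assumed)
    fleet = {"petrol": 0.55, "diesel": 0.45}                  # assumed fleet shares

    ef_mix = sum(ef[k] * fleet[k] for k in ef)                # fleet-weighted emission factor
    daily_co2 = counts.sum() * km_per_link * ef_mix           # kg CO2/day for this link
    print(f"{daily_co2:.0f} kg CO2/day")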

PHENOMENA produces multiple GHG emission products, including high resolution maps of daily emissions per sector, as well as daily summaries of emissions aggregated at different regional levels and for the main Spanish metropolitan regions. The emissions computed with PHENOMENA allows representing the intra-weekly and seasonal variability of GHG emissions as well as changes in their spatial patterns, which can be linked to specific policy, socioeconomic, and weather impacts.

The results produced with PHENOMENA are compared to official GHG emission inventories as well as to other state-of-the-art low latency GHG emission datasets, such as the ones produced by the CAMS Carbon Monitor initiative. Overall, these developments demonstrate the capability of PHENOMENA to deliver consistent, multisector and near-real-time GHG emission estimates, supporting national monitoring, policy evaluation and future verification and data-assimilation efforts.

How to cite: Legarreta, O., Castesana, P., Lombardich, I., Tena, C., Piñero-Megías, C., Viñas, A., Gehlen, J., Rizza, L., Pérez García-Pando, C., and Guevara Vilardel, M.: Low latency and high resolution GHG emission estimates to support monitoring and modelling activities in Spain, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-7798, https://doi.org/10.5194/egusphere-egu26-7798, 2026.

EGU26-7927 | Orals | AS3.38

Using point source imaging satellite observations to guide landfill methane model improvements at the national and sub-national scale 

Tia Scarpelli, Daniel Cusworth, Jinsol Kim, Kelly O'Neill, Riley Duren, and Katherine Howell

As national and sub-national governments, companies, and communities plan methane mitigation action, there is a need for robust emissions tracking systems, especially for major sectors like waste where countries have made commitments to reduce emissions. Landfills are a major source of methane emissions in many jurisdictions spread across the world, so there is a need in the waste sector for monitoring frameworks that are applicable at scale but also provide facility-level insights to guide decision making. 


Given the complexity of landfill emissions, both in their variability and in their underlying causes, models are a common tool for planning and tracking landfill methane mitigation, but past studies show potential biases in models and inventories compared to observations. In this work, we bring together process-level insights from bottom-up models and top-down observations from the Tanager-1 satellite by (1) improving the accuracy and consistency of satellite-derived annual average emission rates and (2) developing methodologies for reconciling the two unique datasets. The goal of this work is to use satellite methane observations to identify improved bottom-up model parameters, focusing on the modeling frameworks used by national and sub-national jurisdictions.


As a point source imaging satellite, Tanager-1 is well suited for tracking emissions at landfills, as it provides facility-scale methane emissions data; however, existing algorithms and workflows for creating these data have been validated primarily against controlled-release experiments, which mimic environments closer to the oil and gas sector than to landfills. We identify methods that are robust and best suited to landfills by performing sensitivity tests on our quantification methods, testing algorithms and parameters, and identifying causes of bias unique to landfill environments (e.g., albedo, topography). The next step is translating our Tanager-1 observations into annual averages: we present a new methodology for temporally averaging satellite observations that accounts for null detects through scene-specific probability-of-detection limits. We then compare our annual average satellite-based emission estimates to the bottom-up models typically used by jurisdictions for official reporting (e.g., IPCC, LandGEM, US GHGRP), focusing on countries with sufficient spatiotemporal Tanager-1 coverage, and use statistical methods to adjust bottom-up model parameters so that model estimates reconcile with observed emissions, allowing region-specific parameter adjustments that account for climatic and meteorological factors. Finally, we discuss the implications of our initial results for official national reporting and compare them to inverse modeling results.
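
A simplified illustration of the null-detect problem (the presented method uses probability-of-detection statistics rather than simple bounds): treating non-detections as censored below scene-specific limits brackets the annual mean.

    import numpy as np

    detected = np.array([2100.0, 3400.0, 1500.0])      # quantified plumes [kg/h] (assumed)
    limits = np.array([900.0, 1200.0, 700.0, 1500.0])  # detection limits of null-detect scenes

    n = detected.size + limits.size
    lower = detected.sum() / n                         # null detects counted as 0 kg/h
    upper = (detected.sum() + limits.sum()) / n        # null detects counted at the limit
    print(f"annual mean bounded within [{lower:.0f}, {upper:.0f}] kg/h")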

How to cite: Scarpelli, T., Cusworth, D., Kim, J., O'Neill, K., Duren, R., and Howell, K.: Using point source imaging satellite observations to guide landfill methane model improvements at the national and sub-national scale, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-7927, https://doi.org/10.5194/egusphere-egu26-7927, 2026.

EGU26-8459 | Posters on site | AS3.38

Validating environmental reporting of carbon emissions 

Lee Stokes, Aleksandra Przydrozna, and Valerie Livina

ESG (Environmental, Social, Governance) reporting is essential for industry, as it helps secure investment for companies' development. While Scope 1 covers direct emissions and Scope 2 indirect emissions, most industrial players report Scope 2 emissions from the use of energy (electricity and gas): carbon emissions released at the power stations that burn fossil fuels (oil, coal, gas, biomass, etc.), see [1].

The conventional way to report a company's Scope 2 carbon emissions is to obtain electricity meter readings and multiply them by the average carbon intensity of the grid that supplies the electricity. In the UK, such carbon factors were previously published annually by the Department for Environment, Food, and Rural Affairs (Defra) and more recently by the Department for Energy Security and Net Zero (DESNZ). These average annual factors are approximate, whereas the actual fuel mix of the electrical grid varies within minutes, depending on the operating power generators.
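
A minimal sketch of the two accounting variants: the same metered consumption multiplied by a flat annual factor versus time-matched half-hourly grid intensities; all series are synthetic, and the gap grows with the covariance between demand and grid intensity.

    import numpy as np

    rng = np.random.default_rng(2)
    t = np.arange(17_520)                              # half-hours in one year
    season = np.cos(2 * np.pi * t / 17_520)            # peaks in winter (January start)
    kwh = 50 + 10 * season + rng.normal(0, 3, t.size)  # demand higher in winter
    intensity = 0.21 + 0.06 * season + rng.normal(0, 0.03, t.size)  # kgCO2e/kWh

    reported = kwh.sum() * intensity.mean()            # flat annual-average factor
    actual = (kwh * intensity).sum()                   # time-matched accounting
    print(f"flat-factor {reported / 1e3:.1f} t, time-matched {actual / 1e3:.1f} t, "
          f"gap {(actual / reported - 1) * 100:+.1f}%")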

In some cases, the annual carbon intensity may underestimate the actual intensity of the grid. This usually happens in Europe in winter, when a large number of gas-fuelled generators are active to provide sufficient heating while wind conditions are placid, yielding little renewable energy. In other cases, when there is plenty of wind-generated energy and less gas-generated energy (for example, on a windy summer day), the average carbon factor may overestimate the actual carbon intensity of the grid.

In several case studies, we demonstrate that such discrepancies may reach 10-15% of the total carbon emissions as presented in quarterly or annual ESG reports. The results suggest that the current way of reporting carbon emissions should be revised so that the actual state of the dynamic energy grid is taken into account, improving ESG reporting. This will in turn affect companies' ESG standing and potential investment, which is crucial for European business as well as for the correct accounting of the impact of European carbon emissions [2].

References

[1] Livina et al., International Journal of Metrology and Quality Engineering, in revision.

[2] Livina et al., in preparation.

How to cite: Stokes, L., Przydrozna, A., and Livina, V.: Validating environmental reporting of carbon emissions, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-8459, https://doi.org/10.5194/egusphere-egu26-8459, 2026.

EGU26-8591 | AS3.38

Monitoring Environmentally Friendly Agriculture for Methane Emission Reduction: A High-Resolution Multi-Sensor Remote Sensing Protocol on Google Earth Engine

K. Shoyama, C. Hirai, and H. Den

Reducing methane (CH4) emissions through environmentally friendly agriculture, such as Alternate Wetting and Drying (AWD), is a critical strategy for climate change mitigation in rice production. To effectively implement and evaluate these mitigation measures, it is essential to monitor agricultural practices and environmental variables at a high spatial resolution. This study develops a standardized data-processing protocol, which leverages Google Earth Engine (GEE) to generate the high-resolution remote sensing features necessary for quantifying CH4 emissions.

The protocol integrates multi-sensor satellite data to capture the spatio-temporal dynamics of sustainable rice farming. Central to this protocol is the use of Sentinel-1 Synthetic Aperture Radar (SAR) data to classify water management regimes, specifically distinguishing between continuous flooding (CF) and AWD at the pixel level. Additionally, Sentinel-2 optical imagery is processed to extract key vegetation indices (e.g., NDVI, GRVI) to monitor crop growth. To address environmental factors, coarse-resolution soil moisture data from SMAP is downscaled to a finer resolution by incorporating Sentinel-2 and Digital Elevation Model (DEM) data.
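
For the optical indices named above, the computation itself is simple; the following sketch shows NDVI and GRVI from Sentinel-2 reflectance arrays (band assignments per the standard Sentinel-2 layout; an illustration, not the protocol's GEE code).

import numpy as np

def ndvi(nir, red):
    # Sentinel-2: B8 = near infrared, B4 = red
    nir, red = np.asarray(nir, float), np.asarray(red, float)
    return (nir - red) / (nir + red)

def grvi(green, red):
    # Sentinel-2: B3 = green, B4 = red
    green, red = np.asarray(green, float), np.asarray(red, float)
    return (green - red) / (green + red)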

By synthesizing these multi-sensor inputs, the protocol provides the necessary foundation for mapping methane emission hotspots and assessing the impact of environmentally friendly management practices. This high-resolution approach supports the design of region-specific mitigation strategies and the advancement of climate-smart agriculture.

As for future research plans, we will apply the constructed model with the field-measured validation data to the extensive rice paddies in southern Ibaraki Prefecture in Japan to estimate methane emissions on a pixel-by-pixel basis and create hotspot maps. This enables the upscaling of a single-point observation model to a broader area while reflecting regional characteristics. This methodology is expected to serve as a powerful tool for examining highly effective methane reduction measures (such as utilization under the J-Credit system) based on each region's agricultural practices and environmental conditions.

How to cite: Shoyama, K., Hirai, C., and Den, H.: Monitoring Environmentally Friendly Agriculture for Methane Emission Reduction: A High-Resolution Multi-Sensor Remote Sensing Protocol on Google Earth Engine, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-8591, https://doi.org/10.5194/egusphere-egu26-8591, 2026.

EGU26-9177 | Posters on site | AS3.38

Transport model error in inverse modelling: Developments within the ITMS project 

Christoph Gerbig, Michal Galkowski, Frank-Thomas Koch, Lena Danyeli, Fabian Maier, Saqr Munassar, Yang Xu, and Christian Rödenbeck

Inverse modelling of CO2 and CH4 using atmospheric in-situ data relies on simulations of atmospheric transport that are derived from models used in numerical weather prediction. The relevant time scales for inversions range from hours to decades, far beyond the time scales of a few weeks for which NWP models are designed. The strong diurnal and seasonal variations in surface-to-atmosphere fluxes of CO2 covary with atmospheric mixing in the boundary layer, as both are driven by solar radiation. In this way, slight seasonal or diurnal biases in the representation of mixing can be amplified. In addition, different atmospheric models differ in their vertical mixing through turbulence and moist convection, and thus in their representation of vertical tracer gradients, which results in strong differences in flux estimates from inverse modelling. These facts have been known for several decades, but progress in addressing these issues has been slow. Within the atmospheric network of ICOS (Integrated Carbon Observation System), additional meteorological observations are available that provide information on atmospheric mixing heights. Also, IAGOS (In-service Aircraft for a Global Observing System) provides information on vertical gradients, which can be related to mixing through turbulence and convection.

ITMS, the Integrated Greenhouse gas Monitoring System for Germany, is implemented in multiple development phases: a first phase developing a demonstrator system, a second phase developing a first-generation system, and a third and final phase transferring the system to operations. With each phase lasting about four years, the project provides a medium-term framework that also allows addressing longer-standing problems such as transport uncertainty. Within ITMS, the CarboScope Regional inversion system (CSR) is used as a reference system for CO2 and CH4 inversions, but also as a testbed for model developments. The presentation will provide an overview of recent results obtained within ITMS. This includes evaluating vertical mixing using additional meteorological profile data or mixing-height information, using additional tracers such as radon in inversions, and confronting vertical profiles from airborne observations with their model equivalents.

How to cite: Gerbig, C., Galkowski, M., Koch, F.-T., Danyeli, L., Maier, F., Munassar, S., Xu, Y., and Rödenbeck, C.: Transport model error in inverse modelling: Developments within the ITMS project, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-9177, https://doi.org/10.5194/egusphere-egu26-9177, 2026.

EGU26-9574 | ECS | Orals | AS3.38

A global coal mine methane tracker to highlight inventory gaps and target mitigation 

Rebekah Horner, Sabina Assan, and Adomas Liepa

Methane (CH4) is a key short-lived climate forcer, yet robust monitoring of its anthropogenic sources remains limited by inconsistent national reporting and incomplete inventories, especially for coal mining. Global anthropogenic CH4 emissions are about 369 million tonnes per year, of which coal mine methane (CMM) contributes roughly 40 million tonnes per year, comparable to emissions from the gas sector. In 2023, only 15% of coal production was covered by annual CMM emissions reported in national greenhouse gas inventories, which limits the scientific basis for monitoring and verifying progress towards the Global Methane Pledge and the Paris Climate Agreement.

We present Ember’s Coal Mine Methane Data Tracker as a new open, global, evidence based dataset for understanding CMM emissions, reporting quality and methane targets. The Data Tracker compiles and harmonises national greenhouse gas inventory submissions to the United Nations Framework Convention on Climate Change (UNFCCC). It integrates these data with historic coal production statistics from the US Energy Information Administration (EIA), International Energy Agency (IEA) coal production forecasts and independent emission estimates (IEA Methane Tracker, Global Energy Monitor (GEM) Global Coal Mine Tracker).

To reconstruct national emissions from 1990 onwards, we calculate country and year specific CH4 emission intensities wherever both reported emissions and coal production exist. Emission intensity is defined as CH4 emissions (in kilotonnes) per million tonnes of coal produced. This approach also enables consistent comparison of reported emissions across countries and over time.

We fill gaps in the intensity time series using values from neighbouring years, so that each country has a continuous record, and then multiply these completed intensity series by observed production to estimate unreported emissions. Ember's gap-filled series indicates that global active CMM emissions exceeded 34 million tonnes in 2023, whereas official UNFCCC inventories reported only 4.62 million tonnes, less than 14% of the inferred total. For 2024, the latest compilation of submissions implies 34.5 million tonnes of CMM, with underreporting of up to 21.2 million tonnes when compared with independent datasets.
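
A minimal pandas sketch of this reconstruction step is given below; the column names are invented, and neighbouring-year gap filling is implemented here as a forward fill followed by a backward fill within each country, which is one possible reading of the method.

import pandas as pd

def reconstruct_cmm(df: pd.DataFrame) -> pd.DataFrame:
    """df columns (invented names): country, year, ch4_kt (reported CMM),
    coal_mt (coal production). Returns gap-filled intensities (kt CH4 per
    Mt coal) and estimated emissions."""
    df = df.sort_values(["country", "year"]).copy()
    df["intensity_kt_per_mt"] = df["ch4_kt"] / df["coal_mt"]
    df["intensity_kt_per_mt"] = (
        df.groupby("country")["intensity_kt_per_mt"]
          .transform(lambda s: s.ffill().bfill())
    )
    df["ch4_est_kt"] = df["intensity_kt_per_mt"] * df["coal_mt"]
    return df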

We introduce a quantitative confidence score from 0 to 6 for each country's reported CMM emissions, combining recency of UNFCCC reporting, consistency with independent estimates from both top-down and bottom-up approaches, and methodological robustness. Applied to major producers, this score shows that most large coal-producing countries fall in the low-to-moderate confidence range, with only a small number, such as Poland (score 5), achieving higher confidence in their reported CMM inventories.
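
As a sketch of how such a 0-6 score could be composed (the exact point allocation used by the Data Tracker is not spelled out here, so the weighting below is an assumption):

def confidence_score(years_since_report: int,
                     consistent_top_down: bool,
                     consistent_bottom_up: bool,
                     measurement_based_method: bool) -> int:
    """Illustrative 0-6 score: recency (0-2), consistency with independent
    top-down and bottom-up estimates (0-2), methodological robustness (0-2)."""
    score = 2 if years_since_report <= 2 else (1 if years_since_report <= 5 else 0)
    score += int(consistent_top_down) + int(consistent_bottom_up)
    score += 2 if measurement_based_method else 0
    return score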

By providing a transparent, harmonised framework for CMM monitoring, we demonstrate that systematic underreporting pervades national inventories. This gap is driven by widespread reliance on low-tier IPCC methods, with 86% of reported CMM emissions relying on emission factors rather than direct measurement. Our quantitative confidence score highlights this reliance, showing that low-scoring countries correlate directly with significant underestimation. This evidence underscores the need for transparent, measurement-based Monitoring, Reporting and Verification (MRV) frameworks to establish the rigorous CH4 accounting required by global climate commitments.

How to cite: Horner, R., Assan, S., and Liepa, A.: A global coal mine methane tracker to highlight inventory gaps and target mitigation, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-9574, https://doi.org/10.5194/egusphere-egu26-9574, 2026.

EGU26-10112 | ECS | Posters on site | AS3.38

Urban greenhouse gas monitoring across the Barcelona Metropolitan Area 

Vanessa Monteiro, Gara Villalba Mendez, Qing Luo, and Roger Curcoll Masanes

An urban greenhouse gas (GHG) monitoring network has been established in the Barcelona Metropolitan Area to support the evaluation of GHG mitigation strategies. The network currently consists of five measurement sites equipped with high-precision Picarro analysers providing continuous observations of carbon dioxide (CO2) and methane (CH4). These measurements, in combination with atmospheric modelling, will be used to investigate spatial and temporal variability in urban GHG concentrations.

The five sites (Fabra, ICM, ICTA, IDAEA, and UPC-Agropolis) were strategically selected to represent a range of urban and peri-urban environments, including a natural forest, an urban coastal site, a traffic-influenced highway location on the outskirts of the city, an urban park embedded within a densely built area, and a peri-urban agricultural region. This configuration enables the assessment of how different land-use types and emission sources influence observed GHG mole fractions across the metropolitan area.

Hourly averaged CO2 mole fractions show pronounced differences between sites. Lower values are observed at the forested Fabra site, while the ICTA site, located near a major highway, exhibits the highest mole fractions and the largest variability. These spatial contrasts are consistent with results from previous multi-site measurement campaigns in Barcelona, which indicated that densely urbanized, impermeable landscapes are associated with enhanced CO2 concentrations compared to greener areas, particularly during morning hours dominated by traffic emissions.

Maintaining a continuous urban monitoring network is essential for capturing both spatial and temporal variability in GHG concentrations and for improving our understanding of urban atmospheric processes. Such observations are also critical for constraining and validating atmospheric models and for quantifying changes in emissions over time. Here, we present recent observations from the Barcelona Metropolitan Area GHG network and illustrate their application to the study of greenhouse gas variability in complex urban environments.

How to cite: Monteiro, V., Villalba Mendez, G., Luo, Q., and Curcoll Masanes, R.: Urban greenhouse gas monitoring across the Barcelona Metropolitan Area, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-10112, https://doi.org/10.5194/egusphere-egu26-10112, 2026.

EGU26-10768 | ECS | Posters on site | AS3.38

Forward modelling of SF6 with ICON-ART 

Maya Harms, Katharina Meixner, Tanja Schuck, Thomas Wagenhäuser, Sascha Alber, Kieran Stanley, Andreas Engel, Valentin Bruch, Thomas Rösch, Martin Steil, and Andrea Kaiser-Weiss

Sulfur hexafluoride (SF6) is a highly potent greenhouse gas (GHG). Despite its high global warming potential (GWP), it continues to be produced and used in Germany. Reported emission estimates can be used to calculate expected concentrations at measurement sites. Within the PARIS (Process Attribution of Regional Emissions) project, we used the operational numerical weather prediction model ICON (ICOsahedral Nonhydrostatic) and its extension module for aerosols and trace gases (ART) as an Eulerian forward model to calculate the expected mixing-ratio response to Germany's largest point source of SF6. We compared the modelled concentration peaks that occur when the modelled plume crosses the measurement site of the Taunus Observatory (TOB) with the respective observed signals (requiring background subtraction). The four-year period 2020-2023 was covered, and the uncertainty of the meteorological transport was estimated using a 20-member ensemble in our limited-area model for Europe, run with a horizontal grid resolution of 6.5 km and 74 vertical levels. The model predicts well when peaks occur, but we found that most observed peaks at TOB are considerably higher than the modelled ones, suggesting that the prior emission estimates were too low.
This indicates that the independent, observation-based emission estimate from our ICON-ART-based system is in the range of double-digit tonnes, considerably higher than the self-reported SF6 emission estimate for this point source, even when model uncertainties are taken into account.
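
The background subtraction and peak comparison can be illustrated as follows (a sketch with an assumed rolling-quantile baseline; the actual PARIS processing may differ):

import numpy as np
import pandas as pd

def peak_enhancements(series: pd.Series, window="7D", q=0.05) -> pd.Series:
    """series: SF6 mole fractions with a datetime index. The background is
    estimated as a rolling low quantile (an assumption) and subtracted."""
    background = series.rolling(window).quantile(q)
    return (series - background).clip(lower=0.0)

def emission_scale_factor(obs_peaks, model_peaks) -> float:
    """Median ratio of observed to modelled peak enhancements; a ratio > 1
    suggests the prior point-source emission was too low."""
    return float(np.median(np.asarray(obs_peaks) / np.asarray(model_peaks)))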

How to cite: Harms, M., Meixner, K., Schuck, T., Wagenhäuser, T., Alber, S., Stanley, K., Engel, A., Bruch, V., Rösch, T., Steil, M., and Kaiser-Weiss, A.: Forward modelling of SF6 with ICON-ART, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-10768, https://doi.org/10.5194/egusphere-egu26-10768, 2026.

EGU26-11447 | ECS | Posters on site | AS3.38

Satellite-Based Estimation of Nitrous Oxide Concentration and Emission in a Large Estuary 

Wenjie Fan and Zhihao Xu

Estuaries are nitrous oxide (N2O) emission hotspots and play an important role in the global N2O budget. However, the large spatiotemporal variability of emissions in complex estuary environments is challenging for large-scale monitoring and budget quantification. This study retrieved water environmental variables associated with N2O cycling from satellite imagery and developed a machine learning model to estimate N2O concentrations. The model was applied to China's Pearl River Estuary to assess spatiotemporal N2O dynamics as well as annual total diffusive emissions between 2003 and 2022. Results showed significant spatiotemporal variability in N2O concentrations and emissions. The annual total diffusive emission ranged from 0.76 to 1.09 Gg (0.95 Gg on average) over the past two decades. Results also showed significant seasonal variability, with the highest contribution in spring (31 ± 3%) and the lowest in autumn (21 ± 1%). Meanwhile, emissions peaked at river outlets and decreased outward. Spatial hotspots contributed 43% of the total emission while covering 20% of the total area. Finally, SHapley Additive exPlanations (SHAP) analysis showed that temperature and salinity, followed by dissolved inorganic nitrogen, were the key input features influencing estuarine N2O estimates. This study demonstrates the potential of remote sensing for estimating estuarine N2O emissions.
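
A minimal sketch of such a model-plus-SHAP workflow is shown below, with synthetic data standing in for the satellite-derived predictors (temperature, salinity, dissolved inorganic nitrogen); the study's actual feature set and model choice may differ.

import numpy as np
import shap
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 3))  # synthetic stand-ins: temperature, salinity, DIN
y = 0.6 * X[:, 0] - 0.4 * X[:, 1] + 0.2 * X[:, 2] + rng.normal(0, 0.1, 500)

model = GradientBoostingRegressor().fit(X, y)      # N2O concentration estimator
shap_values = shap.TreeExplainer(model).shap_values(X)
mean_abs_shap = np.abs(shap_values).mean(axis=0)   # feature influence ranking
print(dict(zip(["temperature", "salinity", "DIN"], mean_abs_shap)))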

How to cite: Fan, W. and Xu, Z.: Satellite-Based Estimation of Nitrous Oxide Concentration and Emission in a Large Estuary, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-11447, https://doi.org/10.5194/egusphere-egu26-11447, 2026.

EGU26-11719 | ECS | Posters on site | AS3.38

Using atmospheric observations to identify point sources of halogenated trace gases 

Katharina Meixner, Dominique Rust, Tanja J. Schuck, Thomas Wagenhäuser, Fides Gad, Cedric Couret, Armin Jordan, Martin Vojta, Andreas Stohl, and Andreas Engel and the PARIS project

Measurement-based emission estimates derived from atmospheric observations provide an independent and important approach for identifying emission sources, quantifying emissions and verifying reported inventories. This is particularly relevant for halogenated gases, which due to their role as ozone depleting substances and potent greenhouse gases are regulated under various international and national frameworks. Here, we present two studies highlighting the urgency and the challenges of the measurement-based emission estimates of sulfur hexafluoride (SF6) and fluoroform (HFC-23) with a particular focus on the influence of point sources.

SF6 and HFC-23 are two of the most potent greenhouse gases, with GWP100 values of approximately 24,000 and 14,700, respectively. Previous studies consistently showed a dominant emission source in southern Germany contributing a large share of European SF6 emissions. Meixner et al. (2025) analysed emission estimates based on 22 European measurement sites, revealing an SF6 emission point source in southern Germany that is underestimated in the national inventory reports.

Recent studies highlighted major challenges in quantifying HFC-23 emissions (Adam et al., 2024; Rust et al., 2024). We investigate the effects of intermittency in emissions and explore different possibilities based on a priori assumptions about specific emission sources. Forward calculations from these potential emission sources are used to derive expected time series at observational sites. These are compared to observations from different European stations situated in the regions influenced by the potential point sources. We present different approaches based on European atmospheric measurements combined with multiple model approaches, including ICON-ART, FLEXPART and NAME.

Adam, B., Western, L.M., Mühle, J., Choi, H., Krummel, P.B., O’Doherty, S., Young, D., Stanley, K.M., Fraser, P.J., Harth, C.M., Salameh, P.K., Weiss, R.F., Prinn, R.G., Kim, J., Park, H., Park, S., Rigby, M., 2024. Emissions of HFC-23 do not reflect commitments made under the Kigali Amendment. Commun. Earth Environ. 5, 783. https://doi.org/10.1038/s43247-024-01946-y

Meixner, K., Wagenhäuser, T., Schuck, T.J., Alber, S., Manning, A.J., Redington, A.L., Stanley, K.M., O’Doherty, S., Young, D., Pitt, J., Wenger, A., Frumau, A., Stavert, A.R., Rennick, C., Vollmer, M.K., Maione, M., Arduini, J., Lunder, C.R., Couret, C., Jordan, A., Gutiérrez, X.G., Kubistin, D., Müller-Williams, J., Lindauer, M., Vojta, M., Stohl, A., Engel, A., 2025. Characterization of German SF6 Emissions. ACS EST Air 2, 2889–2899. https://doi.org/10.1021/acsestair.5c00234

Rust, D., Vollmer, M.K., Henne, S., Frumau, A., van den Bulk, P., Hensen, A., Stanley, K.M., Zenobi, R., Emmenegger, L., Reimann, S., 2024. Effective realization of abatement measures can reduce HFC-23 emissions. Nature 633, 96–100. https://doi.org/10.1038/s41586-024-07833-y

How to cite: Meixner, K., Rust, D., Schuck, T. J., Wagenhäuser, T., Gad, F., Couret, C., Jordan, A., Vojta, M., Stohl, A., and Engel, A. and the PARIS project: Using atmospheric observations to identify point sources of halogenated trace gases, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-11719, https://doi.org/10.5194/egusphere-egu26-11719, 2026.

EGU26-11963 | Posters on site | AS3.38

 Bridging Science and National GHG Inventories: Insights from the PARIS Project – Process Attribution of Regional Emissions 

Sylvia Walter, Alistair Manning, Thomas Röckmann, and Anita Ganesan and the PARIS Team

Strengthening the link between scientific research and official greenhouse gas (GHG) reporting is an important step under the Paris Agreement’s Enhanced Transparency Framework. The PARIS Project, funded by Horizon Europe, is working with eight European countries to develop practical tools for this purpose.

A central innovation of PARIS is the development of draft annexes to National Inventory Documents (NIDs). These annexes provide a structured and transparent interface between official bottom-up inventories and top-down atmospheric estimates. They do not alter formal reporting rules; instead, they document how independent scientific assessments compare with inventory estimates, identify consistencies and discrepancies, and highlight where further investigation or methodological development is warranted. In this way, the annexes enable inventory compilers, policymakers, and scientists to interpret atmospheric results within the legal and institutional framework of national reporting.

The annexes are underpinned by major advances in PARIS observation and modelling capacity. Expanded and harmonised networks for CH₄, N₂O, F-gases, and aerosols, together with multi-model inverse systems and common data standards publicly available on the ICOS Carbon Portal, provide robust, traceable estimates of regional emissions and their sectoral drivers. These scientific outputs are synthesised in the annexes in a form that is directly usable by inventory agencies.

Through close engagement with national inventory teams in the UK, Switzerland, Germany, Ireland and other focus countries, PARIS has co-developed annex templates and begun populating them with results from multiple inversion systems. This process reduces barriers between the research and inventory communities and supports routine, transparent comparison of bottom-up and top-down estimates.

The poster will present the main outcomes of the PARIS project, demonstrating how they advance and embed atmospheric science in national GHG reporting to strengthen confidence in emission estimates, improve process attribution of regional emissions, and ultimately support more effective climate policy under the Paris Agreement.

How to cite: Walter, S., Manning, A., Röckmann, T., and Ganesan, A. and the PARIS Team:  Bridging Science and National GHG Inventories: Insights from the PARIS Project – Process Attribution of Regional Emissions, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-11963, https://doi.org/10.5194/egusphere-egu26-11963, 2026.

EGU26-12319 | ECS | Orals | AS3.38

The use of CLMS products for improving the spatialization of greenhouse gases emissions from LULUCF and agriculture sectors  

Giulia Cecili, Paolo De Fioravante, Guido Pellis, Marina Vitullo, and Angela Fiore

The Land Use, Land-Use Change, and Forestry (LULUCF) and agriculture sectors are increasingly central to global climate policy. They play a crucial role in climate mitigation strategies, as land acts as a carbon sink that needs to be enhanced and as a source of greenhouse gas (GHG) emissions that must be reduced. In the European context, the LULUCF Regulation (EU 2018/841), revised in 2023, aims for 310 Mt CO2eq net removals by 2030 and requires spatially explicit land-use representations to monitor land dynamics and assess policy impacts.

Within the Horizon project AVENGERS (Attributing and Verifying European and National Greenhouse Gas and Aerosol Emissions and Reconciliation with Statistical Bottom-up Estimates), a methodology was developed to generate an IPCC-compliant land-use map by integrating multiple Copernicus Land Monitoring Service (CLMS) products. In national GHG inventories, the operational use of spatially explicit data is often limited by restricted temporal coverage, inconsistencies with national statistics, and challenges in interpreting mixed classes and land-use/land-cover definitions. This methodology provides a transparent approach to reconcile inventory data with high-resolution spatial datasets.

The approach combines the CLC Plus Backbone geometry with CORINE Land Cover (CLC) and ancillary CLMS datasets, including the High-Resolution Layer Crop Types and Priority Areas monitoring products (e.g., Coastal Zones, Riparian Zones, and Protected Areas). Multiple layers were integrated using overlay techniques and priority rules, resulting in a harmonized map at 10-m spatial resolution. CLC attributes were aggregated to IPCC land-use categories, allowing direct comparison between mapped areas and inventory surfaces.

Preliminary validation involved cross-checks with national land-use activity data to ensure reliability of mapped areas across LULUCF categories. The resulting maps enable the spatialization of inventory-based LULUCF and agriculture emissions, producing gridded emission datasets based on improved spatially explicit land-use information. These datasets are suitable for use as input (priors) in atmospheric inversion modelling, a top-down emissions estimation method supporting policy evaluation.

The methodology is designed to be replicable across all European countries covered by CLMS data and to be updated approximately every 2–3 years, in line with the regular update cycle of CLMS products. The methodological framework is modular and flexible, based on a spatial data storage and management scheme developed by ISPRA, which allows the integration of additional datasets and adaptation to different territorial contexts. The approach was applied and tested in three national case studies for the year 2018—Italy, Sweden, and the Netherlands—with specific adaptations introduced to account for distinct territorial characteristics. This first implementation represents a promising step and provides a solid foundation for further refinements and future developments, supporting the production of high-resolution land-use maps helpful for national inventory agencies and inversion modelling experts.

How to cite: Cecili, G., De Fioravante, P., Pellis, G., Vitullo, M., and Fiore, A.: The use of CLMS products for improving the spatialization of greenhouse gases emissions from LULUCF and agriculture sectors , EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-12319, https://doi.org/10.5194/egusphere-egu26-12319, 2026.

EGU26-12745 | ECS | Posters on site | AS3.38

Impact of Local-Scale Effects in Methane (CH₄) Inversions on Model-Observation Discrepancies 

Elena Zwerschke, Frank-Thomas Koch, Christoph Gerbig, Jennifer Mueller-Williams, Matthias Lindauer, Frank Keppler, and Dagmar Kubistin

Accurate estimates of greenhouse gas emissions are critical for determining the effectiveness of mitigation strategies under the Paris Agreement. These estimates are commonly derived by atmospheric inversion frameworks, which combine atmospheric transport models with in situ observations to obtain greenhouse gas fluxes. However, regional inversions are often challenged by local-scale signals in atmospheric measurements that are insufficiently represented by the models. If not properly accounted for, these can introduce biases in inverse flux estimates, undermining the reliability of emission estimates.

To address this limitation, observational data have typically been filtered for local influences before being used in inversion simulations, based on criteria such as boundary-layer stability or wind speed. To make full use of the available dataset, we instead implemented an observation-dependent model-data uncertainty in the inversion optimisation process, allowing local signals to be explicitly considered. This approach has been applied to CH4 inversions over Europe using the mesoscale Jena CarboScope-Regional (CSR) system at 0.25° × 0.25° resolution.

To determine the time-varying model-data uncertainty based on the local-influence signal, a leave-one-out cross-validation was performed on ground-based in situ data from 47 atmospheric stations, excluding one station per inversion simulation. From the differences between modelled and observed concentrations, a model-data mismatch was estimated across station categories defined by the surrounding land type. These estimates were then combined, using a multivariate regression, with local-signal features resulting from low wind speeds, atmospheric stability, and concentration spikes. The derived model-data mismatch function was applied to adjust the data weighting in the inversion, enabling the inclusion of the full observational dataset without discarding any measurements.
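
The regression step can be sketched as follows, with synthetic stand-ins for the local-signal features and the leave-one-out mismatch (the feature definitions here are assumptions, not the CSR implementation):

import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(1)
n = 1000
wind = rng.uniform(0.5, 12.0, n)      # wind speed (m/s)
stability = rng.uniform(0.0, 1.0, n)  # stability index (0 = neutral)
spikes = rng.exponential(5.0, n)      # short-term concentration spikes (ppb)

# Synthetic leave-one-out mismatch |modelled - observed| (ppb)
mismatch = 20.0 / wind + 15.0 * stability + 0.8 * spikes + rng.normal(0, 2, n)

features = np.column_stack([1.0 / wind, stability, spikes])
model = LinearRegression().fit(features, mismatch)

# The predicted mismatch serves as the observation-dependent uncertainty,
# down-weighting each observation as 1/sigma^2 in the inversion.
sigma = model.predict(features)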

In this presentation, we demonstrate the potential of this novel approach to improve the robustness of regional CH4 inversions and to reduce the bias from local-scale signals.

How to cite: Zwerschke, E., Koch, F.-T., Gerbig, C., Mueller-Williams, J., Lindauer, M., Keppler, F., and Kubistin, D.: Impact of Local-Scale Effects in Methane (CH₄) Inversions on Model-Observation Discrepancies, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-12745, https://doi.org/10.5194/egusphere-egu26-12745, 2026.

EGU26-12776 | ECS | Posters on site | AS3.38

Quantifying European SF6 emissions (2005-2021) using a large ensemble of atmospheric inversions 

Martin Vojta, Andreas Plach, Rona L. Thompson, Pallav Purohit, Kieran Stanley, Simon O'Doherty, Dickon Young, Joe Pitt, Jgor Arduini, Xin Lan, and Andreas Stohl

Sulfur hexafluoride (SF₆) is an extremely potent (GWP100 = 24,300) and long-lived greenhouse gas whose atmospheric concentrations continue to rise due to anthropogenic emissions. Europe represents a particularly relevant test case for investigating SF₆ emissions, as successive EU F-gas regulations over the past two decades have aimed to substantially reduce emissions. A key question is whether these regulatory measures are reflected in observed emission trends and whether reported national inventories are consistent with observation-based estimates.

 In this study, we quantify European SF₆ emissions for the period 2005–2021 using a large ensemble of atmospheric inversions with a strong focus on uncertainty characterization. Uncertainties are assessed using an extensive set of sensitivity tests in which key inversion parameters are systematically varied, while final uncertainties are quantified via a Monte Carlo ensemble that randomly samples combinations of these parameters. This allows us to identify the main sources of uncertainty and to evaluate the robustness of inferred emission trends.
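
The Monte Carlo step can be pictured as random sampling over a grid of inversion settings; the parameter names and values below are invented placeholders for the kinds of choices varied in the sensitivity tests.

import random

PARAMETER_CHOICES = {
    "prior_uncertainty": [0.3, 0.5, 1.0],          # relative
    "correlation_length_km": [100, 250, 500],
    "obs_error_ppt": [0.05, 0.10],
    "baseline_method": ["rolling_quantile", "remote_sites"],
}

def sample_ensemble(n_members: int, seed: int = 0):
    """Draw random combinations of inversion settings; running the inversion
    once per member yields a distribution of posterior national emissions."""
    rng = random.Random(seed)
    return [{k: rng.choice(v) for k, v in PARAMETER_CHOICES.items()}
            for _ in range(n_members)]

members = sample_ensemble(200)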

Our analysis focuses on countries with relatively dense observational coverage - the United Kingdom, Germany, France, and Italy - while also examining aggregated emissions for the EU-27.  The inversion results reveal declining SF₆ emissions in all studied regions except Italy, broadly consistent with the timing of EU F-gas regulations (842/2006, 517/2014). In several countries, inferred emissions exceed reported national inventories, although the agreement generally improves in more recent years. At the EU-27 scale, emissions exhibit a pronounced decline between 2017 and 2018, coinciding with a marked reduction in emissions from southwestern Germany, suggesting regional actions were taken as the 2014 regulation took effect.

Our sensitivity tests highlight the crucial role of dense and sustained atmospheric monitoring networks for robust inversion-based emission estimates. In particular, expansions of the UK observing system in 2012 and 2014 led to significant reductions in emission uncertainties, demonstrating the importance of comprehensive observational networks in refining emission estimates.

How to cite: Vojta, M., Plach, A., Thompson, R. L., Purohit, P., Stanley, K., O'Doherty, S., Young, D., Pitt, J., Arduini, J., Lan, X., and Stohl, A.: Quantifying European SF6 emissions (2005-2021) using a large ensemble of atmospheric inversions, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-12776, https://doi.org/10.5194/egusphere-egu26-12776, 2026.

EGU26-14303 | ECS | Orals | AS3.38

Urban Atmospheric Monitoring and Modeling System (Urban-AMMS): A Top-Down Approach to Investigate Sources and Variability of an Inert Tracer in the Washington, DC, and Baltimore, MD, Metropolitan Area 

Miguel Cahuich-Lopez, Christopher Loughner, Fong Ngan, Anna Karion, Lei Hu, Israel Lopez-Coto, Kimberly Mueller, Julia Marrs, John Miller, Brian McDonald, Colin Harkins, Congmeng Lyu, Meng Li, Kevin Gurney, Sonny Zinn, Xinrong Ren, Mark Cohen, Howard Diamond, Ariel Stein, and James Whetstone

Accurate quantification of the sources and sinks of long-lived air pollutants is fundamental for effective emissions management, particularly in urban areas where emissions are generally more intense. Stakeholders commonly use so-called bottom-up methods to estimate emissions for urban areas. This type of emission accounting is typically carried out for annual totals, often with a latency of one or more years. Alternative methods that provide estimates with higher temporal resolution and lower latency could be helpful for stakeholders seeking targeted strategies to reduce emissions. A top-down urban emissions estimation system for the Washington, DC, and Baltimore, MD, metropolitan area, called the Urban Atmospheric Monitoring and Modeling System (Urban-AMMS), is being developed to provide accurate, up-to-date urban emissions data. Urban-AMMS has several components, including tower-based, aircraft, and mobile van measurement platforms, whose data are assimilated by the CarbonTracker-Lagrange analytical inverse model; an ensemble of HYSPLIT backward dispersion simulations driven by in-house high-resolution WRF simulations (spatial resolution of 1 km) enhanced with urban meteorological observations; biospheric models; and bottom-up inventories used for a prior estimate of emissions in the domain. The inversion system is tailored to account for the underlying variability in urban fluxes of an inert tracer (CO2) by solving for hourly fluxes and incorporating explicit spatiotemporal covariance of prior errors, as well as high-resolution source-receptor sensitivities estimated by WRF-HYSPLIT. Here, we present an overview of Urban-AMMS, including initial results and sensitivity analyses to investigate the effects of prior spatial aggregation, background handling, and the temporal covariance of prior errors. Numerical experiments show improvements in estimates of urban surface fluxes at both the city and grid cell scales. Still, the reliability of inverse fluxes depends on prior uncertainty, as observed in previous studies. These findings provide critical insights for the inverse estimation of long-lived air pollutants in complex urban environments.
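
At the core of such an analytical inversion is the standard Bayesian update; a compact sketch (the generic textbook form, not the CarbonTracker-Lagrange code) is:

import numpy as np

def analytical_inversion(H, y, x_prior, S_prior, S_obs):
    """x_post = x_prior + S_prior H^T (H S_prior H^T + S_obs)^-1 (y - H x_prior)

    H: source-receptor (footprint) matrix, observations x fluxes
    y: observed enhancements; x_prior: prior hourly fluxes
    S_prior, S_obs: prior and model-data error covariance matrices
    """
    gain = S_prior @ H.T @ np.linalg.inv(H @ S_prior @ H.T + S_obs)
    return x_prior + gain @ (y - H @ x_prior)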

How to cite: Cahuich-Lopez, M., Loughner, C., Ngan, F., Karion, A., Hu, L., Lopez-Coto, I., Mueller, K., Marrs, J., Miller, J., McDonald, B., Harkins, C., Lyu, C., Li, M., Gurney, K., Zinn, S., Ren, X., Cohen, M., Diamond, H., Stein, A., and Whetstone, J.: Urban Atmospheric Monitoring and Modeling System (Urban-AMMS): A Top-Down Approach to Investigate Sources and Variability of an Inert Tracer in the Washington, DC, and Baltimore, MD, Metropolitan Area, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-14303, https://doi.org/10.5194/egusphere-egu26-14303, 2026.

EGU26-14957 | Orals | AS3.38

Quantifying N₂O Flux over the EU27+3 Region Using CIF-CHIMERE Model for 2005–2023 

Tianqi Shi, Antoine Berchet, and Philippe Ciais

Nitrous oxide (N₂O) is the third most important long-lived greenhouse gas after CO₂ and CH₄, yet large uncertainties remain in its regional emission estimates. In this study, we apply the regional inverse modeling system CIF-CHIMERE to quantify N₂O surface fluxes over the EU27+3 region (European Union, United Kingdom, Norway, and Switzerland) for the period 2005–2023, providing a long-term assessment of N₂O fluxes at high spatiotemporal resolution. The inversion is primarily constrained by in situ atmospheric N₂O measurements from the ICOS (Integrated Carbon Observation System) ground-based station network across Europe, and uses the CIF-CHIMERE transport model coupled with a four-dimensional variational (4D-Var) data assimilation framework to estimate posterior N₂O fluxes. For 2005–2023, inversions are conducted at a spatial resolution of 0.5° × 0.5°, while for 2018–2023 the resolution is refined to 0.2° × 0.2°. In both configurations, hourly surface fluxes are estimated, enabling analysis of diurnal, seasonal, and interannual variability. The inversions significantly improve the representation of localized emission patterns and short-term flux dynamics. Overall, the results provide a top-down dataset for evaluating bottom-up inventories and for improving the understanding of regional and temporal variability in N₂O emissions across the EU27+3.
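
The 4D-Var step minimises the standard variational cost function, written here in its generic form (the CIF-CHIMERE configuration adds the specifics described above):

J(x) = \tfrac{1}{2}(x - x_b)^{\top} B^{-1} (x - x_b) + \tfrac{1}{2}\sum_{t} \big(H_t(x) - y_t\big)^{\top} R_t^{-1} \big(H_t(x) - y_t\big),

where x denotes the N₂O fluxes being optimised, x_b the prior fluxes, B the prior error covariance, H_t the transport operator mapping fluxes to the ICOS observations y_t at time t, and R_t the observation error covariance.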

How to cite: Shi, T., Berchet, A., and Ciais, P.: Quantifying N₂O Flux over the EU27+3 Region Using CIF-CHIMERE Model for 2005–2023, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-14957, https://doi.org/10.5194/egusphere-egu26-14957, 2026.

EGU26-15692 | ECS | Posters on site | AS3.38

CH4 emissions in Vietnamese Rice Agriculture: Benchmarking process-based model approaches (Tier 3) against Tier 1/2 Estimates 

Chien Nguyen, David Kraus, Tanh Nguyen, Reiner Wassmann, Klaus Butterbach-Bahl, Thi Bach Thuong Vo, Van Trinh Mai, Thi Phuong Loan Bui, and Ralf Kiese

Rice cultivation is the largest source of methane (CH4) emissions in Vietnam's agricultural sector, making accurate quantification of these emissions critical for national GHG inventories and the design of mitigation policies. Currently, for UNFCCC GHG reporting, Vietnam primarily employs IPCC Tier 2 approaches using national emission factors combined with Tier 1 scaling factors. With the implementation of large-scale mitigation projects and Vietnam's ambitions to achieve Net Zero by 2050, to meet its Global Methane Pledge commitment by 2030, and to join international carbon markets, there is an urgent need to transition towards higher-tier methodologies. However, process-based model (Tier 3) outputs are themselves subject to uncertainty and first need to be benchmarked against established Tier 1 and Tier 2 emission estimates.

In this study, CH4 emission data from 13 Vietnamese field experiments are split into two groups—one with comprehensive management information (sufficient data) and one with sparse information (limited data)—to test IPCC Tier methods under different activity data conditions. Furthermore, for Tier 3, an inter-comparison is conducted between two biogeochemical models, DNDC and LandscapeDNDC. The evaluation focuses on the performance in estimating rice yields, seasonal CH4 emissions, and daily flux dynamics, while also analyzing the impact of different model parameterization and simulation setups.

Our evaluation shows that Tier 1 significantly underestimates CH4 emissions, whereas Tier 2 provides a substantial improvement and remains robust across varying soil and management conditions. In contrast, Tier 3 outperforms Tier 2 only when comprehensive management data are available, reflecting its distinctive capacity to represent daily emission dynamics and management-driven peaks. Consequently, while Tier 2 remains a practical choice for national inventories, Tier 3 is essential for high-resolution mitigation assessments, particularly for large-scale emission reduction evaluations where detailed management data are comprehensively collected and systematically organized. The process-based model comparison reveals that, while DNDC and LandscapeDNDC show similar performance under continuous flooding, they diverge significantly under Alternate Wetting and Drying (AWD) regimes. These discrepancies are primarily attributed to the models' different representations of water-table fluctuations.

Building on these results, the Tier 3 approach of LandscapeDNDC was integrated into the web-based LUI-RICE platform (https://ldndc.online/rice/). This makes GHG quantification for Vietnamese rice cultivation directly accessible to local stakeholders and policymakers, translating the scientific findings of this study into a practical decision-support application.

How to cite: Nguyen, C., Kraus, D., Nguyen, T., Wassmann, R., Butterbach-Bahl, K., Vo, T. B. T., Mai, V. T., Bui, T. P. L., and Kiese, R.: CH4 emissions in Vietnamese Rice Agriculture: Benchmarking process-based model approaches (Tier 3) against Tier 1/2 Estimates, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-15692, https://doi.org/10.5194/egusphere-egu26-15692, 2026.

EGU26-15734 | Orals | AS3.38

ΔXCO/ΔXCO2 characteristics over coal-fire areas in Xinjiang, China using a portable EM27/SUN FTIR spectrometer 

Qiansi Tu, Jiaxin Fang, Frank Hase, André Butz, África Barreto, Omaira García, and Kai Qin

Long-term coal spontaneous combustion (CSC) represents a severe and persistent threat, resulting in substantial waste of energy resources, significant environmental degradation, and serious risks to human health and safety. To better understand the emission characteristics of CSC, we conducted ground-based measurements of XCO₂, XCH₄, XCO and aerosol optical depth (AOD) using a Fourier-transform infrared spectrometer (EM27/SUN) within the COCCON network, in the Wugonggou coal-fire region near Fukang, Xinjiang.

Our results indicate that TROPOMI satellite data systematically underestimated XCO, with a mean bias of 4.53 ± 5.53 ppb (4.54%). For distinct enhancement events observed by COCCON, ΔXCO₂ and ΔXCO exhibit a strong correlation (R² = 0.6082), with a slope of 9.782 ppb/ppm (9.782 × 10⁻³ ppm/ppm). This value is lower than the CAMS inventory ratio of 13.52 × 10⁻³. This discrepancy arises primarily from their distinct spatial representativeness. The COCCON instrument, located within the coal fire region, captures intense local combustion emission. In contrast, the CAMS product represents a daily average over a much larger model grid cell, which dilutes strong local point sources like coal fires within a broader regional background. Additionally, correlation analysis shows that ΔXCO is more closely linked to AOD (R² = 0.2283) than either ΔXCO₂ or ΔXCH₄, underscoring the distinct behavior of CO in coal-fire plumes.
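
The slope and R² reported above follow from a straightforward regression of the paired enhancements; a sketch using ordinary least squares (which may differ from the fitting method actually applied):

import numpy as np

def enhancement_ratio(dxco2_ppm, dxco_ppb):
    """Fit dXCO (ppb) against dXCO2 (ppm); returns slope (ppb/ppm) and R^2."""
    x = np.asarray(dxco2_ppm, float)
    y = np.asarray(dxco_ppb, float)
    slope, intercept = np.polyfit(x, y, 1)
    residuals = y - (slope * x + intercept)
    r2 = 1.0 - residuals.var() / y.var()
    return slope, r2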

How to cite: Tu, Q., Fang, J., Hase, F., Butz, A., Barreto, Á., García, O., and Qin, K.: ΔXCO/ΔXCO2 characteristics over coal-fire areas in Xinjiang, China using a portable EM27/SUN FTIR spectrometer, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-15734, https://doi.org/10.5194/egusphere-egu26-15734, 2026.

EGU26-16089 | Orals | AS3.38

National-scale methane emissions in South Korea (2010–2021): insights from multiple inversion systems  

Samuel Takele Kenea, Daegeun Shin, Wonick Seo, Sunran Lee, Fenjuan Wang, Shamil Maksyutov, Rajesh Janardanan, Soojeong Lee, Dmitry A. Belikov, Prabir K. Patra, Nicole Montenegro, Antoine Berchet, Marielle Saunois, Adrien Martinez, Ruosi Liang, Yuzhong Zhang, Ge Ren, Hong Lin, Sara Hyvärinen, and Aki Tsuruta and the Sangwon Joo, Sumin Kim

Accurate estimation of methane (CH₄) emissions is essential for assessing mitigation progress, yet substantial uncertainties persist at the national scale. In South Korea, CH₄ emissions are predominantly anthropogenic, with the waste and agricultural sectors contributing approximately 82% of total national emissions. This study analyzes national-scale CH₄ emission estimates for South Korea during 2010–2021 using multiple atmospheric inversion systems participating in the Methane Inversion Inter-Comparison for Asia (MICA) project.

Results from inversions using only in situ observations indicate that prior emissions over South Korea were likely overestimated. Prior estimates range from 1.5 to 1.7 Tg yr⁻¹ for most years, whereas posterior emissions are, on average, about 15% lower than the prior estimates. A notable exception is the LMDZ inversion model, which yields posterior estimates that are 40–67% lower than prior values; this substantial reduction is primarily associated with the waste sector. Sectoral attribution reveals substantial inter-model differences. LMDZ shows a decreasing waste-sector emission trend in Exp. 1 but an increasing trend when only satellite observations are assimilated (Exp. 2), whereas the STILT-based inversion consistently indicates increasing waste-sector emissions. Given that the waste sector dominates national CH₄ emissions, these discrepancies strongly influence total emission estimates. The prior waste-sector emissions, derived from EDGAR v7, exceed those reported in South Korea's national greenhouse gas inventory (GIR), contributing to the observed overestimation. Additionally, the inversion-derived posterior estimates consistently indicate an overestimation of prior agricultural emissions during the summer months.

Model performance evaluation over the region of interest indicates varying levels of agreement between simulated and observed CH₄ mole fractions, with correlation coefficients ranging from 0.24 to 0.85 and posterior biases ranging from −65.6 to 0.34 ppb, highlighting that the choice of transport model is important. Overall, this study highlights the value of multi-model inversion inter-comparisons for constraining national-scale CH₄ emissions, diagnosing sector-specific uncertainties, and identifying structural differences among inversion frameworks that can guide future improvements.

How to cite: Takele Kenea, S., Shin, D., Seo, W., Lee, S., Wang, F., Maksyutov, S., Janardanan, R., Lee, S., Belikov, D. A., Patra, P. K., Montenegro, N., Berchet, A., Saunois, M., Martinez, A., Liang, R., Zhang, Y., Ren, G., Lin, H., Hyvärinen, S., and Tsuruta, A. and the Sangwon Joo, Sumin Kim: National-scale methane emissions in South Korea (2010–2021): insights from multiple inversion systems , EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-16089, https://doi.org/10.5194/egusphere-egu26-16089, 2026.

EGU26-16769 | Posters on site | AS3.38

Improving the Accuracy of CO₂ Emission Estimates over South Korea Using a Top-down Inversion Framework 

Ho Yeon Shin, Daegeun Shin, Samuel Takele Kenea, Sunran Lee, Sumin Kim, and Yun Gon Lee

The international community has continuously monitored carbon emissions by publishing National Inventory Reports (NIRs) under the Paris Agreement adopted in 2015 to address the climate crisis. However, current emission estimation methods predominantly rely on bottom-up approaches based on statistical information, which are subject to limitations, including the potential omission of emission sources and the long time required for emission compilation. To overcome these limitations, top-down approaches that estimate emissions using meteorological models and observed atmospheric greenhouse gas concentrations have recently gained increasing attention. This approach has been adopted as a scientific methodology of the Integrated Global Greenhouse Gas Information System (IG3IS), developed under the auspices of the World Meteorological Organization (WMO), and is regarded as a complementary alternative to conventional emission inventories. In this study, carbon dioxide (CO₂) emissions over South Korea were estimated using a top-down approach based on the Stochastic Time-Inverted Lagrangian Transport Model (STILT) and observations from WMO/Global Atmosphere Watch (GAW) stations, and their accuracy was evaluated. The STILT-based inversion results indicate that anthropogenic CO₂ emissions in South Korea for 2019 amount to 589.7 Mt yr⁻¹, which is 83.6 Mt yr⁻¹ lower than the estimate reported in the existing NIR. The downward correction is primarily concentrated in Seoul and the surrounding metropolitan region. Furthermore, to account for the spatial characteristics of CO₂ emission distributions, high-resolution and realistic emission estimates were derived for regions with dense point-source emissions using the Weather Research and Forecasting (WRF) model. The application of top-down approaches for greenhouse gas emission estimation in East Asian countries, together with continuous technological advancement, is expected to provide a scientific foundation for improving the reliability of emission estimates and supporting future climate crisis response strategies.

How to cite: Shin, H. Y., Shin, D., Kenea, S. T., Lee, S., Kim, S., and Lee, Y. G.: Improving the Accuracy of CO₂ Emission Estimates over South Korea Using a Top-down Inversion Framework, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-16769, https://doi.org/10.5194/egusphere-egu26-16769, 2026.

EGU26-17209 | Posters on site | AS3.38

Development and Application of a Cryogenic Preconcentration System for Halogenated Greenhouse Gas Measurements in Korea 

Joo-Ae Kim, Sunggu Kang, Dohyun Kwon, Sunyoung Park, Soojeong Lee, and Sumin Kim

East Asia represents a major source region of greenhouse gas emissions associated with rapid industrialization and increasing energy demand. Among these emissions, halogenated synthetic greenhouse gases such as HFCs and PFCs, which have been widely used as substitutes following international regulations for ozone layer protection, are characterized by high global warming potentials (GWPs).

In South Korea, halogenated greenhouse gases have been monitored at the Gosan station on Jeju Island using the MEDUSA system of the AGAGE network. However, expanding observational coverage and establishing independent measurement capabilities remain essential to better characterize regional emission signals. In this study, a cryogenic preconcentration and analysis system for halogenated greenhouse gases (NIMS-preconcentrator) was developed, and its capability for monitoring halogenated greenhouse gases was evaluated.

The analytical setup includes a cryogenic thermal desorption (TD) unit and a pre-concentration trap capable of reaching temperatures down to −170 °C, integrated with an automated valve control module and gas chromatography–mass spectrometry (GC–MS). Measurements were conducted using an offline canister-based sampling approach. Analysis of ambient air samples collected at Anmyeondo (GAW station) resolved about ten halogenated greenhouse gas species, including HFC-134a, HFC-125, and legacy chlorofluorocarbons such as CFC-11 and CFC-12. Concentrations were evaluated using calibration standards, and ongoing performance assessment is conducted using laboratory working standards employed at the Gosan AGAGE station.

This study aims to establish a new measurement capability for halogenated greenhouse gases and to assess its consistency with international observations. Continued operation of this system will support the accumulation of long-term observational datasets and facilitate regional-scale analysis and inter-comparison of high-GWP halogenated greenhouse gases in Northeast Asia.

How to cite: Kim, J.-A., Kang, S., Kwon, D., Park, S., Lee, S., and Kim, S.: Development and Application of a Cryogenic Preconcentration System for Halogenated Greenhouse Gas Measurements in Korea, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-17209, https://doi.org/10.5194/egusphere-egu26-17209, 2026.

EGU26-17591 | Posters on site | AS3.38

High-resolution direct GHG emission estimation and simulation from residential space heating using open data  

Kirsten v. Elverfeldt, Gefei Kong, Veit Urlich, Maria Martin, Moritz Schott, and Sebastian Block

Residential space heating remains a major source of greenhouse gas emissions in the building sector. In Germany, space heating accounts for the largest share of residential energy consumption, and accurate quantification of associated emissions is essential to meet national climate mitigation targets.

Most research on residential heating emissions focuses on the regional or national levels, while estimates at finer spatial scales remain limited. Data availability further constrains the transferability and usability of current models. Consequently, approaches that deliver spatially and temporally detailed emission estimates and interactive tools to support analysis and decision-making by stakeholders are urgently needed.

We introduce the Climate Action Navigator (CAN), a dashboard for the analysis and visualization of climate mitigation and adaptation spatial data, based entirely on open science principles. One of the tools available in the CAN estimates carbon dioxide emissions from residential heating at fine spatial and temporal scales. The tool applies a bottom-up accounting methodology at 100 m spatial resolution based on publicly available census and building characteristics data in Germany, including building age and dominant energy carriers. The resulting emission estimates are consistent with official city- and national-level inventories, confirming methodological reliability. Germany-wide analyses reveal strong spatial heterogeneity in energy consumption and emissions that correlates with urban morphological characteristics.
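
The per-cell accounting can be reduced to a product of heated area, specific demand, and carrier-specific emission factors; the sketch below uses illustrative factors and invented field names, not the CAN implementation.

# Illustrative emission factors, kg CO2 per kWh of final energy
EMISSION_FACTORS = {"natural_gas": 0.20, "heating_oil": 0.27, "district_heat": 0.15}

def cell_emissions_kg(heated_area_m2: float,
                      demand_kwh_per_m2: float,
                      carrier_shares: dict) -> float:
    """CO2 for one 100 m grid cell: heat demand times the share-weighted
    emission factor of the energy carriers present in the cell."""
    heat_kwh = heated_area_m2 * demand_kwh_per_m2
    return heat_kwh * sum(EMISSION_FACTORS[c] * s for c, s in carrier_shares.items())

# Example: an older, mostly gas-heated building-stock cell (invented values)
print(cell_emissions_kg(4200.0, 140.0, {"natural_gas": 0.8, "heating_oil": 0.2}))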

Temporal dynamics are captured through an hourly simulation using the Demand Ninja model based on local weather data. The resulting temporal emission patterns can support inverse emission modelling applications as well as aid energy management by, for example, revealing peak heating demand times and locations.
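
One simple way to picture such a temperature-driven disaggregation is a heating-degree sketch (not the Demand Ninja model itself; the 15.5 °C base temperature is an assumption):

import numpy as np

def hourly_emission_profile(annual_emissions_kg, hourly_temps_c, base_c=15.5):
    """Distribute annual heating emissions over hours in proportion to
    heating degree hours below an assumed base temperature."""
    hdh = np.clip(base_c - np.asarray(hourly_temps_c, float), 0.0, None)
    return annual_emissions_kg * hdh / hdh.sum()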

Results are delivered via the CAN interface as intuitive, interactive maps and charts that allow users to compare across neighborhoods, explore temporal emission dynamics, and assess potential mitigation actions. By integrating open-source data with high-resolution modeling and visualization, the Climate Action Navigator bridges the gap between scientific emission quantification and practical decision making. The approach supports transparent attribution and tracking of residential space-heating emissions, thereby advancing evidence-based climate mitigation planning.

How to cite: v. Elverfeldt, K., Kong, G., Urlich, V., Martin, M., Schott, M., and Block, S.: High-resolution direct GHG emission estimation and simulation from residential space heating using open data , EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-17591, https://doi.org/10.5194/egusphere-egu26-17591, 2026.

EGU26-19274 | ECS | Posters on site | AS3.38

Carbon dioxide and methane emissions from a network of thirty eddy-covariance sites in the Netherlands 

Ignacio Andueza Kovacevic, Laurent Bataille, Isabel Cabezas, Freek Engel, Wietse Franssen, Corine van Huissteden, Ronald Hutjes, Ruchita Ingle, Wilma Jans, Tan JR Lippmann, Jeferson Zerrudo, Hong Zhao, Reinder Nouta, and Bart Kruijt

Understanding the temporal dynamics and controls on greenhouse gas exchange between terrestrial ecosystems and the atmosphere is critical for advancing process-level understanding and informing national greenhouse gas budgets and inventories. A large portion of soils in the Netherlands are either drained or restored peatlands, where the high carbon and organic matter content is accompanied by a large risk of carbon loss to the atmosphere through enhanced soil respiration (drained sites) and/or enhanced methane emissions (rewetted sites). For this reason, increasing attention is being paid to understanding and quantifying the greenhouse gas budgets of both drained and restored peatland sites across the Netherlands.
 
To both inform national GHG inventories and improve our understanding of site-scale processes, we present a multi-site analysis of a network of more than thirty eddy-covariance sites in the Netherlands, covering the daily, seasonal, and annual variability of carbon dioxide (CO₂) and methane (CH₄) fluxes measured at these sites. The sites include intensively managed grasslands, arable fields, semi-natural pastures, forested peatlands, wetlands, and marshes, and encompass a wide range of vegetation types, soil characteristics, and water-management practices, with continuous or semi-continuous high-frequency flux datasets extending across multiple years within the last decade.
 
We quantify daily, seasonal, and annual CO₂ and CH₄ fluxes and discuss key biophysical drivers, including soil composition and moisture, vegetation dynamics, groundwater levels, and the impacts of climate anomalies such as temperature and precipitation extremes across varying timescales. We also discuss differences between sites and the potential roles of soil characteristics, vegetation, land management, and recent climate anomalies.
 
Our analysis indicates substantial variability in both CO₂ and CH₄ fluxes across sites and seasons. These results highlight the invaluable contributions of both high-resolution flux observations and rigorous data processing methods when disentangling ecosystem controls on gas exchange. These flux observations provide much needed empirical constraints for model evaluation and can facilitate improved representation of peatland and wetland systems in greenhouse gas inventories and process-based models.

How to cite: Andueza Kovacevic, I., Bataille, L., Cabezas, I., Engel, F., Franssen, W., van Huissteden, C., Hutjes, R., Ingle, R., Jans, W., Lippmann, T. J., Zerrudo, J., Zhao, H., Nouta, R., and Kruijt, B.: Carbon dioxide and methane emissions from a network of thirty eddy-covariance sites in the Netherlands, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-19274, https://doi.org/10.5194/egusphere-egu26-19274, 2026.

EGU26-19514 | Orals | AS3.38

Towards accurate quantification of New Zealand’s methane emissions from waste and agriculture 

Peter Sperlich, Christian Stiegler, Alex Geddes, Hamish Sutton, Brendon Smith, Molly Leitch, Sally Gray, Gordon Brailsford, Rowena Moss, Beata Bukosa, Sara Mikaloff-Fletcher, Amir Pirooz, Richard Turner, Jocelyn Turnbull, Johannes Laubach, Suzanne Rowe, Lorna McNaughton, Olivia Spaans, Kevan Brian, and Ellen Wymei

Methane emissions from waste and agriculture accounted for 46.6 % of Aotearoa New Zealand's (ANZ) gross greenhouse gas emissions in 2023. Despite their significance, these emissions can currently only be estimated with emission-factor methods, which carry large uncertainties. We present newly developed tools to directly measure methane emissions from wastewater treatment facilities, animal effluent storage systems, and herds of dairy cows. We deploy in situ analysers on mobile observation platforms (vehicle and drone) and quantify methane emission fluxes using the tracer gas technique. The accuracy of this method is estimated in multiple ways: i) through a controlled release experiment, ii) through comparison to a mass-balance modelling approach, iii) through comparison to co-located chamber measurements of methane emissions from effluent ponds, and iv) through comparison to co-located measurements of animal emissions using the “GreenFeed” technique. The comparisons show excellent agreement, providing much-needed assurance of the analytical performance of our mobile techniques. Our tools help ANZ’s farmers and waste managers to better understand current emissions and to assess the efficacy of investments in emission mitigation. Additional tests explore new isotope techniques with the goal of quantifying methane fluxes from different components within a plant, for example methane derived from digesters versus methane derived from biosolids in wastewater treatment systems, or methane from the open face of a landfill versus emissions from a covered area.
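
For readers unfamiliar with the tracer gas technique, the sketch below illustrates its core calculation: the methane emission rate follows from the known tracer release rate scaled by the ratio of plume-integrated CH₄ and tracer enhancements. The choice of acetylene as tracer, the variable names, and the example numbers are illustrative assumptions, not the authors' actual setup.

```python
# Minimal sketch of the tracer-dilution calculation behind the tracer gas
# technique. All inputs are hypothetical.
import numpy as np

M_CH4, M_C2H2 = 16.04, 26.04  # g/mol; acetylene assumed as released tracer

def tracer_dilution_flux(ch4_enh_ppb, tracer_enh_ppb, tracer_release_kg_h):
    # Background-subtracted enhancements sampled at the same points along a
    # downwind plume transect; for evenly spaced samples the ratio of sums
    # equals the ratio of the plume integrals.
    ratio = np.sum(ch4_enh_ppb) / np.sum(tracer_enh_ppb)   # mol CH4 / mol tracer
    return tracer_release_kg_h * ratio * (M_CH4 / M_C2H2)  # kg CH4 per hour

# Illustrative transect with a Gaussian-like plume in both species:
x = np.linspace(-1.0, 1.0, 200)
ch4_plume = 120.0 * np.exp(-x**2 / 0.05)    # ppb enhancement, hypothetical
c2h2_plume = 40.0 * np.exp(-x**2 / 0.05)    # ppb enhancement, hypothetical
print(tracer_dilution_flux(ch4_plume, c2h2_plume, 2.0))  # ~3.7 kg/h
```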

How to cite: Sperlich, P., Stiegler, C., Geddes, A., Sutton, H., Smith, B., Leitch, M., Gray, S., Brailsford, G., Moss, R., Bukosa, B., Mikaloff-Fletcher, S., Pirooz, A., Turner, R., Turnbull, J., Laubach, J., Rowe, S., McNaughton, L., Spaans, O., Brian, K., and Wymei, E.: Towards accurate quantification of New Zealand’s methane emissions from waste and agriculture, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-19514, https://doi.org/10.5194/egusphere-egu26-19514, 2026.

EGU26-19832 | ECS | Orals | AS3.38

Can observation-based atmospheric mixing state reduce filtering sensitivity in GHG inversions? Lessons from the UK GEMMA programme 

Dafina Kikaj, Peter Andrews, Alexandre Danjou, Alistair Manning, Matt Rigby, Ed Chung, Grant Forster, Angelina Wenger, Chris Rennick, Emmal Safi, Simon O’Doherty, Kieran Stanley, Joe Pitt, and Tom Gardiner

Uncertainty in atmospheric transport models, especially boundary-layer mixing and turbulence, still limits confidence in top-down GHG emission estimates. In inversion workflows, observation selection is commonly supported by empirically tuned filters based on modelled meteorological variables (e.g., boundary-layer height, wind speed). The selection prioritises periods when transport is expected to be well represented. This motivates continued work to characterise atmospheric mixing and its associated uncertainties using observations.

In the UK GEMMA programme, we investigate whether observation-based atmospheric mixing state can provide complementary information to support uncertainty characterisation in UK CH₄ inversions. We demonstrate the framework at UK sites with radon measurements and at a newly instrumented site in Scotland where only meteorological measurements are available. Where radon is measured, we use it as an independent tracer of near-surface mixing and compare observed radon with radon simulated using the Met Office NAME dispersion model and a radon flux map. This comparison is used to define transport-performance classes (periods of relatively better vs poorer agreement) and associated atmospheric mixing state. At the Scotland site, we derive atmospheric mixing regimes from in situ meteorological measurements alone, using a vertical profile sampled every 10 m to characterise stratification and mixing.
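
One plausible way to derive such transport-performance classes is sketched below: each averaging window is scored by the relative radon model-observation mismatch and the series is split at a chosen quantile. The window length, threshold choice, and variable names are assumptions for illustration, not the GEMMA implementation.

```python
# Minimal sketch: label periods 'better' or 'poorer' by the relative
# mismatch between observed and NAME-simulated radon.
import pandas as pd

def transport_classes(obs_rn, sim_rn, window="6h", quantile=0.5):
    # obs_rn, sim_rn: time-indexed radon activity concentration series on a
    # common datetime index.
    df = pd.DataFrame({"obs": obs_rn, "sim": sim_rn}).resample(window).mean()
    mismatch = (df["sim"] - df["obs"]).abs() / df["obs"]
    cut = mismatch.quantile(quantile)           # split point, here the median
    out = mismatch.to_frame("mismatch")
    out["cls"] = (out["mismatch"] <= cut).map({True: "better", False: "poorer"})
    return out
```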

We show how the resulting atmospheric mixing state and transport-performance classes can be used in two operational ways: (i) as additional information to support observation selection alongside existing practice, and (ii) to define regime-dependent uncertainty characterisation within inversion frameworks rather than assuming a single fixed error model. We illustrate the approach using two UK CH₄ inverse methods (InTEM and RHIME) and discuss how observation-based mixing information can improve transparency and reproducibility in hybrid (inventory + atmospheric) emissions estimation for IG3IS-aligned information services.

How to cite: Kikaj, D., Andrews, P., Danjou, A., Manning, A., Rigby, M., Chung, E., Forster, G., Wenger, A., Rennick, C., Safi, E., O’Doherty, S., Stanley, K., Pitt, J., and Gardiner, T.: Can observation-based atmospheric mixing state reduce filtering sensitivity in GHG inversions? Lessons from the UK GEMMA programme, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-19832, https://doi.org/10.5194/egusphere-egu26-19832, 2026.

EGU26-20089 | Posters on site | AS3.38

From GHG Observations to Actionable Climate Information Services 

Daphne Kitsou, Parakevi Chantzi, Dimitrios Gkoutzikostas, Vasileios Rousonikolos, Georgios Galanis, Argiro Papastergiou, and Georgios Zalidis

Effective climate mitigation requires greenhouse gas (GHG) information and accounting that are scientifically robust and actionable for decision-making. The CARBONICA project has developed and implemented a robust climate-positive action plan for carbon farming across the Widening countries of Greece, Cyprus, and North Macedonia, generating climate information services that operate at regional, national, and international scales. An extended inventory of management practices has been developed and implemented at pilot sites across 15 crops in the three countries, fully aligned with the IPCC, the Natural Climate Solutions World Atlas, the GHG Protocol, and climate-related EU laws and initiatives. GHG accounting is supported by a robust MRV system combining soil sampling, field inputs following IPCC Scope guidance, and management practices, covering direct, indirect, and upstream emissions across the farm system, with all procedures fully compliant with ISO 14064-2. Farm-level data are also collected using the validated Field Diagnostic Toolbox, which includes soil CO₂ flux monitoring using spectroscopy to support accurate assessment of emissions and carbon removals.

This enables explicit attribution of emissions and carbon removals to farms, regions, and, more broadly, the agrifood sector, supporting the monitoring, reporting, and validation of mitigation measures for positive climate action. LCA modelling on a pilot site (a 1 ha peach orchard) has shown significant emission reductions and carbon removals. The model was run once on the baseline (business-as-usual) scenario for 2024, and once for 2025, after the management practices of no-till and residue incorporation were implemented in the orchard. The total greenhouse gas emissions from the pilot peach orchard decreased from 2,660 kg CO₂e in 2024 to 1,280 kg CO₂e in 2025, with emissions per ton of produced fruit dropping from 147.63 kg CO₂e to 71.04 kg CO₂e. Beyond the reduction of the emission sources, the demonstrated change in the soil carbon stock was also significant. While the 2024 cultivation season showed a net-zero change compared to the baseline scenario, the implementation of no-till and crop residue incorporation during the 2025 season created an active carbon sink, resulting in a net removal of 597.76 kg of CO₂e from the atmosphere into the soil. Thus, the project successfully demonstrated a twofold climate benefit: a major reduction in operational emissions and a significant sequestration of atmospheric carbon into the soil.
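
A quick arithmetic check of the quoted figures, sketched below, shows that the per-ton intensities imply the same fruit yield of roughly 18 t in both years, so the drop in intensity directly mirrors the drop in total emissions. The variable names are illustrative only; the numbers are those reported above.

```python
# Consistency check on the reported LCA figures (pure arithmetic).
total_2024, per_t_2024 = 2660.0, 147.63   # kg CO2e, kg CO2e per t of fruit
total_2025, per_t_2025 = 1280.0, 71.04

yield_2024 = total_2024 / per_t_2024      # ~18.0 t of fruit
yield_2025 = total_2025 / per_t_2025      # ~18.0 t, i.e. the same harvest
reduction = 1 - total_2025 / total_2024   # ~52% emission reduction
net_2025 = total_2025 - 597.76            # emissions minus soil sink, kg CO2e
print(yield_2024, yield_2025, reduction, net_2025)
```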

The results presented above are part of a third-party validated carbon farming project, facilitated through CARBONICA. This work also contributes to IG3IS-aligned applications demonstrating the operational use of multi-source GHG observations for real-world solutions in carbon farming.

How to cite: Kitsou, D., Chantzi, P., Gkoutzikostas, D., Rousonikolos, V., Galanis, G., Papastergiou, A., and Zalidis, G.: From GHG Observations to Actionable Climate Information Services, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-20089, https://doi.org/10.5194/egusphere-egu26-20089, 2026.

EGU26-20826 | ECS | Orals | AS3.38

Daily and 1/16 degree maps of CO2 fossil fuel emissions based on satellite retrievals of pollutant atmospheric data 

Alexandre Héraud, Frédéric Chevallier, Grégoire Broquet, Philippe Ciais, Adrien Martinez, and Anthony Rey-Pommier

In the context of the Paris Agreement on climate change and of a global effort to reduce greenhouse gas emissions, the monitoring of anthropogenic carbon dioxide (CO2) emissions is needed to assist policy makers but represents a major challenge. While current inventories provide rather robust annual emission totals at the country scale, they lag behind real time by many months and lack spatial and sub-annual detail. Here we map daily surface fossil fuel CO2 emissions at a 1/16 degree resolution over Europe, with the year 2021 as an example, based on spaceborne atmospheric composition observations.

As the high-resolution satellite monitoring of atmospheric CO2 remains challenging, especially at a local spatial scale and a daily time scale, we take advantage of the co-emission of CO2 and nitrogen oxides (NOX) during fossil fuel combustion: we exploit images of nitrogen dioxide (NO2) concentrations retrieved from the measurements of the Tropospheric Monitoring Instrument (TROPOMI) onboard the Sentinel-5P satellite.

From the TROPOMI NO2 concentrations, we retrieve daily maps of NOX emissions based on the divergence of the mass fluxes within the NO2 images. We combine the year-to-year changes of these maps with low-latency national CO2 emissions from Carbon Monitor (https://carbonmonitor.org/) and with a baseline of monthly, spatially distributed CO2 emissions for a previous year (here 2020) from GridFED (https://mattwjones.co.uk/co2-emissions-gridded/), from which we removed aviation and shipping emissions beforehand.
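
For orientation, the sketch below shows the core of a flux-divergence emission estimate: horizontal NO2 mass fluxes are formed from the column field and the wind components, and their divergence approximates local sources. The grid layout, the wind source, and the omission of the chemical-loss (sink) term are simplifying assumptions relative to the full method used in this study.

```python
# Minimal sketch of the flux-divergence step on a regular grid.
import numpy as np

def divergence_emissions(vcd, u, v, dx, dy):
    """Emission proxy from NO2 vertical column densities.

    vcd : 2-D NO2 column field (mol m-2); u, v : wind components (m/s)
    on the same grid; dx, dy : grid spacing (m). Returns mol m-2 s-1.
    """
    fx, fy = u * vcd, v * vcd                 # horizontal mass fluxes
    dfx_dx = np.gradient(fx, dx, axis=1)      # d(fx)/dx
    dfy_dy = np.gradient(fy, dy, axis=0)      # d(fy)/dy
    return dfx_dx + dfy_dy                    # div(F) ~ emissions - sinks
```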

The resulting maps of emission increments from 2020 to 2021 capture changes in highly emitting areas: major urban or industrial areas, and main transport corridors. The emissions for the year 2021 show good consistency with existing inventories. The dataset also produces realistic seasonal variability at a local scale and captures daily variability, although temporally smoothed due to a 5-day rolling average of Carbon Monitor data.

This method is both temporally and spatially scalable and can therefore be extended to the entire world and to additional years, which provides encouraging prospects for the continuation of this work.

How to cite: Héraud, A., Chevallier, F., Broquet, G., Ciais, P., Martinez, A., and Rey-Pommier, A.: Daily and 1/16 degree maps of CO2 fossil fuel emissions based on satellite retrievals of pollutant atmospheric data, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-20826, https://doi.org/10.5194/egusphere-egu26-20826, 2026.

EGU26-20959 | AS3.38

The dual role of urban green spaces in carbon neutrality: carbon sequestration and cooling driven energy savings at the global scale

S. Kim and Y. Choi

Quantitative evidence is increasingly required to assess the mitigation potential of cities in achieving global carbon neutrality. However, although urban green spaces contribute simultaneously through biophysical carbon sequestration and through reductions in energy demand driven by urban heat island mitigation, few studies have systematically compared and evaluated these two effects within a unified framework at the global scale. This study quantifies the total contribution of urban green spaces to carbon neutrality across global cities and decomposes this contribution into carbon sequestration and cooling-driven energy savings, assessing their relative importance and spatial patterns. The urban heat island effect is estimated using remote-sensing-derived land surface temperature differences between urban and non-urban areas, while carbon sequestration by urban green spaces is simultaneously quantified based on satellite observations. These two contributions are then integrated and compared. Furthermore, this study examines how the relative importance of the two effects varies across major climate zones and how this heterogeneity manifests in distinct spatial patterns. Finally, this study investigates, using AI-based approaches, how vegetation-related indicators, socio-economic variables, and urban structural characteristics influence the two effects across climate zones, and identifies the contextual conditions under which the mitigation benefits of urban green spaces are amplified or attenuated even under similar urban green space availability. This study provides a global assessment of the contribution of urban green spaces to carbon neutrality and offers empirical evidence to support the design of climate- and context-specific nature-based mitigation strategies in cities.
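
As a minimal illustration of the surface urban heat island term in such an analysis, the sketch below differences mean land surface temperature inside and outside an urban mask. The input arrays and the mask are hypothetical, and the study's actual estimator may differ.

```python
# Minimal sketch: surface UHI intensity from a gridded LST field.
import numpy as np

def suhi_intensity(lst, urban_mask):
    """Surface UHI intensity (K): urban minus non-urban mean LST.

    lst : 2-D land surface temperature field (K); urban_mask : boolean
    array of the same shape, True inside the urban area.
    """
    return np.nanmean(lst[urban_mask]) - np.nanmean(lst[~urban_mask])
```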

How to cite: Kim, S. and Choi, Y.: The dual role of urban green spaces in carbon neutrality: carbon sequestration and cooling driven energy savings at the global scale, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-20959, https://doi.org/10.5194/egusphere-egu26-20959, 2026.

EGU26-22515 | Orals | AS3.38 | Highlight

Assessing the accuracy of the Climate Trace global vehicular and power plant CO2 emissions 

Kevin Gurney, Bilal Aslam, Pawlok Dass, Lech Gawuc, Toby Hocking, Jarrett Barber, and Anna Kato

Accurate estimation of greenhouse gas (GHG) emissions at the infrastructure scale remains essential to climate science and policy applications. Power plant and vehicle emissions often form the majority of fossil fuel CO2 (FFCO2) emissions in much of the world at multiple scales. Climate Trace, co-founded by former U.S. Vice President Al Gore, is a new AI-based effort to estimate pointwise and roadway-scale GHG emissions, among other sectors. However, this dataset has received limited independent peer-reviewed assessment. Here, we update a previous analysis of Climate Trace power plant FFCO2 emissions in the U.S. and present a new analysis of Climate Trace on-road CO2 emissions in U.S. urban areas. This is done through comparison to atmospherically calibrated, multi-constraint estimates of power plant and on-road CO2 emissions from the Vulcan Project (version 4.0).

Across 260 urban areas in 2021, we find a mean relative difference (MRD) of 69.9% in urban on-road FFCO2 emissions. Furthermore, successive versions of the Climate Trace on-road emissions releases shift from over- to under-estimation of almost equal magnitude. These large differences are driven by biases in Climate Trace’s machine learning model, fuel economy values, and fleet distribution values. An update to the power plant FFCO2 emissions analysis (from a 2024 paper) shows both improved and degraded convergence of emissions. We continue to recommend that the GHG emission estimates made by Climate Trace in these sectors be used with caution in sub-national policy guidance and climate science applications.
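
For reference, the sketch below shows one common definition of the mean relative difference statistic quoted above; whether the published MRD uses absolute or signed differences is an assumption here, and the input arrays are hypothetical.

```python
# Minimal sketch: mean relative difference across urban areas, with a
# reference inventory (e.g., Vulcan) as the denominator.
import numpy as np

def mean_relative_difference(test, reference):
    """MRD (%): one emission value per urban area in each array."""
    test, reference = np.asarray(test), np.asarray(reference)
    return 100.0 * np.mean(np.abs(test - reference) / reference)
```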

How to cite: Gurney, K., Aslam, B., Dass, P., Gawuc, L., Hocking, T., Barber, J., and Kato, A.: Assessing the accuracy of the Climate Trace global vehicular and power plant CO2 emissions, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-22515, https://doi.org/10.5194/egusphere-egu26-22515, 2026.

ESSI6 – Short Courses and Education Sessions

CC BY 4.0